[openssl-dev] The new OpenSSL license should be made GPLv2 compatible

2017-03-23 Thread Brian Smith
Hi,

I'm one of the people who received the email asking for permission to
relicense code to the new license, Apache 2.0. A major problem with
the Apache 2.0 license is that it is widely considered to be
incompatible with the GPLv2. Although many people consider it to be
compatible with the GPLv3, many people also object to the GPLv3 for
reasons that are important to them. Therefore, I think it is important
for the OpenSSL license to be compatible with the GPLv2 too.

In the past, I created a library licensed under Apache 2.0,
mozilla::pkix. However, Red Hat and Mozilla requested that I
additionally license it under the GPLv2 so they could use it in
GPLv2-licensed contexts, and I did so.

Similarly, LLVM is working on moving to the Apache 2.0 license and
they ran into similar problems. They also made the effort to
explicitly grant the right to use the relicensed code under the GPLv2.
See [1] for details.

I think it is important that OpenSSL do something similar to
explicitly allow using OpenSSL code under the GPLv2 before any
relicensing takes place.

Thanks for your consideration.

[1] http://lists.llvm.org/pipermail/llvm-dev/2016-September/104778.html

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Partially- vs. full- reduced inputs to ecp_nistz256_neg

2016-08-18 Thread Brian Smith
Andy Polyakov  wrote:
> It appears to me that with multiplication, squaring, subtraction,
> negation, halving *preserving* property of being fully reduced (i.e. if
> inputs are fully reduced, then output is too), we only have to watch out
> for mul_by_[23], i.e. ensure that their outputs are fully reduced. This
> would ensure that output will always be fully reduced.

Let me state things a different way, and see if this is what you
meant: Every function will have as a prerequisite that its inputs are
fully reduced and will have a postcondition that its output is fully
reduced. We know that ecp_nistz256_add doesn't fully reduce its output
even if the input is fully reduced, and we know that
ecp_nistz256_mul_by_[23] are implemented in terms of ecp_nistz256_add
(or equivalent logic, in the case of some of the ASM stuff).
Accordingly, the plan of action is:

* Fix ecp_nistz256_mul_by_2 and ecp_nistz256_mul_by_3 to fully reduce
their outputs.

* Fix ecp_nistz256_add to fully reduce its output (see the sketch
after this list).

* Ensure in ecp_nistz256_points_mul that all the input coordinates are
fully reduced.
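
For concreteness, here is a minimal sketch of what "fully reduce the
output" means for the addition case. This is my own 64-bit C model with
invented names, not a patch against the OpenSSL code, and it assumes
both inputs are already in [0, P):

```c
#include <stdint.h>

/* The P-256 prime, as little-endian 64-bit limbs. */
static const uint64_t P256[4] = {
    0xffffffffffffffffULL, 0x00000000ffffffffULL,
    0x0000000000000000ULL, 0xffffffff00000001ULL
};

/* r = a + b mod P, fully reduced, for a, b in [0, P): add, then do a
 * constant-time conditional subtraction of P when the sum either
 * carried out of 2^256 or is >= P. */
static void felem_add_reduced(uint64_t r[4],
                              const uint64_t a[4], const uint64_t b[4])
{
    uint64_t sum[4], t[4], carry, borrow = 0, mask;
    unsigned __int128 acc = 0;
    int i;

    for (i = 0; i < 4; i++) {
        acc += (unsigned __int128)a[i] + b[i];
        sum[i] = (uint64_t)acc;
        acc >>= 64;
    }
    carry = (uint64_t)acc;             /* 1 iff a + b >= 2^256 */

    for (i = 0; i < 4; i++) {
        unsigned __int128 d = (unsigned __int128)sum[i] - P256[i] - borrow;
        t[i] = (uint64_t)d;
        borrow = (uint64_t)(d >> 127); /* 1 iff the subtraction wrapped */
    }

    /* Use sum - P iff carry is set or no borrow occurred (a + b >= P). */
    mask = 0 - (carry | (borrow ^ 1));
    for (i = 0; i < 4; i++)
        r[i] = (t[i] & mask) | (sum[i] & ~mask);
}
```

The same shape works for mul_by_2 (a + a), and mul_by_3 is two such
reduced additions.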

After all of this, we won't have to worry about the handling of
partially-reduced values anywhere.

Is that correct? In particular, you said "we only have to watch out
for mul_by_[23]" but you didn't mention ecp_nistz256_add, which *is*
used directly in ecp_nistz256_point_double, according to the reference
C implementation.

I think that is a reasonable course of action.

I'd like to suggest the following additional steps:

* Verify with Vlad that he agrees with our assessment of the current
state of the code and the plan. In particular, I still worry that what
we agreed on here doesn't match what Vlad told me before. Maybe Vlad
was thinking about some other code he wrote that uses the Almost
Montgomery math instead of regular Montgomery math. But maybe he knows
something we are overlooking. In particular, we should ask him and
Shay whether they intentionally made ecp_nistz256_add (et al.) return
partially-reduced results because they'd proven that full reduction
isn't necessary given the subsequent statements.

* Document in the top of ecp_nistz256.c the input and output bounds
are [0, P256) for all field operations.

* Test some interesting edge cases of all the functions to verify that
they generate the correct output. (This series of bug reports started
by me doing this, starting with ecp_nistz256_add.)

* Because we can't easily test it, audit the inlined code inside the
assembly language implementations of ecp_nistz256_{double, add,
add_affine} to ensure that it matches the logic within the standalone
functions that we can write tests for. (As you noted, in many cases,
the standalone field arithmetic functions and the inline variants
actually share code; in some cases, however, they don't.)

> In this and RT#4621 combined context one can conclude that *as long as*
> inputs to ecp_nistz256_point_add are fully reduced, is_equal calls work
> correctly, because there are no non-full-reduction-preserving calls
> prior to them. Would you agree?

If all functions have the postcondition that they always fully reduce
their outputs then I agree that the is_equal calls are correct, and
also that the ecp_nistz256_neg function seems to be correct since it
gives the expected results when given fully reduced inputs.

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] ecp_nistz256 is_one is too liberal with what it considers to be one

2016-08-18 Thread Brian Smith
Andy Polyakov  wrote:
>>    if (P256_LIMBS == 8) {
>>      res |= a[4] ^ ONE[4];
>>      res |= a[5] ^ ONE[5];
>>      res |= a[6] ^ ONE[6];
>> +    res |= a[7] ^ ONE[7];
>>    }
>
> It's not actually a coincidence that it ends with a[6]. If you have a
> close look at ecp_nistz256_is_affine_G, you'll see that it also checks
> for generator->Z.top being P256_LIMBS - P256_LIMBS / 8, or 7[!] on
> 32-bit platform. I.e. we can't assume that a[7] is actually an
> initialized value. Quite the contrary, actually, because there is a
> configuration flag that will put some junk there on purpose. But yes, it
> contradicts second usage case of is_one... Which should be complemented
> with additional
>
> if (P256_LIMBS == 8 && r->Z_is_one)
>     r->Z_is_one = (bn_get_top(r->Z) == 7);

1. The is_one function should work like is_zero and look at all the
limbs. Clearly, that's what people expected it to do, given the
current bug. If ecp_nistz256_is_affine_G needs different logic then
that different logic should be inlined into it. Note that the use in
ecp_nistz256_is_affine_G doesn't need to be constant-time so there are
lots of options for changing it.

2. The Z_is_one member isn't very useful and is error-prone in
general. It can be removed relatively easily. See [1].

So, I recommend removing Z_is_one, and then inlining the is_one
function into ecp_nistz256_is_affine_G so that nobody misuses it in
the future for something else.
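
As a sketch of what the inlined replacement could look like (the
helper name is hypothetical; the real check would compare the candidate
generator's coordinates against the stored affine G in Montgomery
form), since it need not be constant-time:

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t BN_ULONG;   /* assumption: 64-bit limbs */
#define P256_LIMBS 4

/* Variable-time coordinate comparison, good enough for the
 * ecp_nistz256_is_affine_G use; no constant-time is_one() helper is
 * left lying around to be misused elsewhere. */
static int felem_equal_vartime(const BN_ULONG a[P256_LIMBS],
                               const BN_ULONG b[P256_LIMBS])
{
    return memcmp(a, b, P256_LIMBS * sizeof(BN_ULONG)) == 0;
}
```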

[1] 
https://boringssl.googlesource.com/boringssl/+/081e3f34a2b324edce50b7a5df9b2e283781af7b%5E%21/

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Partially- vs. full- reduced inputs to ecp_nistz256_neg

2016-08-16 Thread Brian Smith
Andy Polyakov  wrote:
>> My understanding after talking with Vlad is that the "sbb \$0, $acc2" makes
>> this equivalent to (r >= 2**256) ? (r - q) : r. If the "sbb \$0,
>> $acc2" line were removed then it would be equivalent to (r >= q) ? (r
>> - q) : r. My understanding is that the difference in semantics is
>> exactly the difference between partially reduced results and fully
>> reduced results.
>
> Let's recall that result of multiplication prior final reduction is
> actually n+1-limb value, with +1 limb being single bit, that's $acc2,
> 5th limb in the context. So that the $0 in last 'sbb \$0,$acc2'
> represents 5th ("imaginary") limb of modulus[!]. And since we're looking
> at borrow from this 5-limb subtraction, outcome is actually
>
> if (ret > P) ret -= P'

OK, let's agree on that.

I think you are assuming that ret is in the range [0, 2P), so that if
you subtract P, the result would be in the range [0, P). That is the
case in normal Montgomery multiplication, where the inputs are in the
range [0, P). But, my understanding is that if the inputs are in the
range [P, 2**256), e.g. they are the result of ecp_nistz256_add, then
that assumption doesn't necessarily hold.

I understand Almost Montgomery Multiplication (AMM) as described in
[1], where one accepts inputs in the range [0, 2**n) and returns a
result in the range [0, 2**n), not [0, P).

I also read the original paper on the ecp_nistz256 implementation [2],
which has "0 ≤ a, b < p" as a precondition for the Montgomery
multiplication.

To put my concern another way: Earlier in the thread, you said the
function can take inputs that aren't fully reduced--i.e. are in the
range [0, 2**n)--and returns outputs that are fully reduced--i.e. in
the range [0, P)--but it isn't clear how that is achieved. My
understanding is that the domain and range of the function are the
same.

[1] https://eprint.iacr.org/2011/239.pdf
[2] https://eprint.iacr.org/2013/816.pdf

Cheers,
Brian
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Partially- vs. full- reduced inputs to ecp_nistz256_neg

2016-08-16 Thread Brian Smith
Andy Polyakov  wrote:
> And it's not only that multiplication (and squaring) result is fully
> reduced; the multiplication (and squaring) subroutine can actually
> manage partially reduced input. On related note one can point out that
> result of addition (and mul_by_[2|3]) is partially reduced. But it's
> multiplication's ability to handle it that ties things up. One should
> also remember that it always ends with multiplication when result is
> converted from Montgomery representation. As well as that it starts with
> multiplication when input is converted to Montgomery representation...

Andy, thanks! Let's rewind a bit:

Originally I was writing tests and the first function I tested the
reduction for was ecp_nistz256_add.

I was surprised to find that the result is not necessarily fully reduced.
I emailed the original authors from Intel (Vlad and Shay) asking if
this was expected and whether any other functions return unreduced
results. Vlad's reply was that *all* of the functions are
expected/assumed to return unreduced results. I then followed up asking
specifically about whether the result of the multiplication could be
only partially reduced. I received another reply that the result of
the multiplication could be only partially reduced and that some code
I'd written for BoringSSL that assumes that the multiplication result
was always fully reduced was wrong.

After that, I re-read the code for the conditional subtraction at the
end of ecp_nistz256_mul_mont (__ecp_nistz256_mul_montq, actually) and
I couldn't convince myself that the result was always fully reduced.

My concern is that what you say and what Vlad said is contradictory.
You both understand the code way better than me, so I feel like I'm
not so useful in resolving the contradiction. But, I will try anyway:

sbb $poly3, $acc1 # .Lpoly[3]
sbb \$0, $acc2

cmovc $t0, $acc4
cmovc $t1, $acc5

My understanding after talking with Vlad is that the "sbb \$0, $acc2" makes
this equivalent to (r >= 2**256) ? (r - q) : r. If the "sbb \$0,
$acc2" line were removed then it would be equivalent to (r >= q) ? (r
- q) : r. My understanding is that the difference in semantics is
exactly the difference between partially reduced results and fully
reduced results.

Another way to see this is that there are 5 sbb instructions issued
for the conditional subtraction, which means 5 limbs are involved.
But, a full reduction mod q should only have 4 sbb instructions,
right?
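
To make the difference concrete, here is a toy model of the two
conditional-subtraction predicates, scaled down to a single 8-bit
"limb" with a made-up modulus; this is only an illustration, not the
nistz256 code:

```c
#include <stdio.h>

int main(void)
{
    const unsigned q = 239;   /* toy modulus, q < 2^8 */
    unsigned r = 250;         /* a partially reduced value in [q, 2^8) */

    /* "5-limb" semantics: subtract q only on borrow out of the extra
     * limb, i.e. only when r >= 2^8 -- never true here, so r stays
     * partially reduced. */
    unsigned partial = (r >= 256) ? r - q : r;

    /* "4-limb" semantics: subtract q whenever r >= q, giving a fully
     * reduced result in [0, q). */
    unsigned full = (r >= q) ? r - q : r;

    printf("r=%u partial=%u full=%u\n", r, partial, full); /* 250 250 11 */
    return 0;
}
```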

Again, I could very well be horrifically misunderstanding things.
Perhaps it would be a good idea to ask Vlad to weigh in again?

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Partially- vs. full- reduced inputs to ecp_nistz256_neg

2016-08-15 Thread Brian Smith
Andy Polyakov  wrote:
> No, the subtraction subroutine uses *borrow* to determine if modulus is
> to be added. I.e. (a >= b) ? (a - b) : (P - (b - a)). If both a and b
> are less than P, then result is less than P.

Consider the case where a > P and a >= b and b is very small (e.g. 1).
For example, a == P + 2 and b == 1, so a >= b, and a - b == P + 2 - 1
== P + 1.

Of course, this reduces to the question of whether the multiplication
that precedes the subtraction can ever produce a result in [P, 2**256 -
1).
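
For reference, here is that add-back-on-borrow semantics in a 64-bit C
sketch of my own (not the actual asm). With a == P + 2 and b == 1 there
is no borrow, so P is never added back and the unreduced P + 1 comes
out unchanged:

```c
#include <stdint.h>

/* The P-256 prime, as little-endian 64-bit limbs. */
static const uint64_t P256[4] = {
    0xffffffffffffffffULL, 0x00000000ffffffffULL,
    0x0000000000000000ULL, 0xffffffff00000001ULL
};

/* r = (a >= b) ? (a - b) : (P - (b - a)); fully reduced only if a < P. */
static void felem_sub(uint64_t r[4],
                      const uint64_t a[4], const uint64_t b[4])
{
    uint64_t borrow = 0, mask;
    unsigned __int128 c = 0;
    int i;

    for (i = 0; i < 4; i++) {
        unsigned __int128 d = (unsigned __int128)a[i] - b[i] - borrow;
        r[i] = (uint64_t)d;
        borrow = (uint64_t)(d >> 127); /* 1 iff the subtraction wrapped */
    }
    mask = 0 - borrow;                 /* all-ones iff a < b */
    for (i = 0; i < 4; i++) {          /* add P back only on borrow */
        c += (unsigned __int128)r[i] + (P256[i] & mask);
        r[i] = (uint64_t)c;
        c >>= 64;
    }
}
```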

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Partially- vs. full- reduced inputs to ecp_nistz256_neg

2016-08-15 Thread Brian Smith
Andy Polyakov  wrote:
>> Note in particular that, IIUC, ecp_nistz256_neg will never get an
>> unreduced input when applied to the base point multiples, because
>> those are already fully reduced. But, when it is used in
>> ecp_nistz256_windowed_mul, it isn't clear whether or how the input Y
>> coordinate is fully reduced mod P before being passed to ecp_nistz256_neg.
>
> Is it correctly understood that concern is that input to
> ecp_nistz256_windowed_mul, which in turn can be *user* input, would be
> not fully reduced?

Let's assume, for the purpose of this discussion, that the coordinates
of the user input point `p` are already reduced, or that we will reject
it or reduce it.

Given that input point, we calculate multiples 1*p, 2*p, ... 16*p
using ecp_nistz256_point_double and ecp_nistz256_point_add. Let's look
at how ecp_nistz256_point_double and ecp_nistz256_point_add calculate
the Y coordinate:

double:
...
ecp_nistz256_mul_mont(S, S, M);
ecp_nistz256_sub(res_y, S, res_y);

add:
...
ecp_nistz256_mul_mont(res_y, R, res_y);
ecp_nistz256_sub(res_y, res_y, S2);
memcpy(r->Y, res_y, sizeof(res_y));

Now, I was told by one of the authors of the original code that all
the ecp_nistz256_* field operations return a result in the range [0,
2**256), not [0, P). This can be seen in the reduction code, e.g. in
ecp_nistz256_sub. It uses the result of "sbb \$0, $t4" to determine if
the result is larger than 2**256 - 1. If the result is larger than
2**256 - 1 then it subtracts P. Otherwise, it returns the result. But,
in the case where it doesn't subtract P, the result might be in the
range [P, 2**256); i.e. not fully reduced.

NOTE: I may not be understanding it correctly! It's a very real
possibility. But, my understanding seems to match what I was told,
which is that the results of all the field operations are not fully
reduced.

Cheers,
Brian
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] Partially- vs. full- reduced inputs to ecp_nistz256_neg

2016-08-09 Thread Brian Smith
I was curious about the function ecp_nistz256_neg. This function seems
to work exactly how I expect for reduced inputs; i.e. inputs in the
range [0, P). And, it also seems to work how I expect for P:
ecp_nistz256_neg(P) == ecp_nistz256_neg(0) == 0. So, everything seems
fine for inputs in the range [0, P].

But, I don't understand how it works for the value P + 1. I expect
that one of the following is true:

ecp_nistz256_neg(P + 1) == ecp_nistz256_neg(1)
ecp_nistz256_neg(P + 1) == ecp_nistz256_neg(1) + P.

But, instead, ecp_nistz256_neg(P + 1) returns a result of 2**256 - 1. That is:

input = 00010001
expected = 0001fffe
actual: 

Similarly, I decided to test ecp_nistz256_neg(2**256 - 1). I expected
that one of the following would be true:

ecp_nistz256_neg(2**256 - 1) = ecp_nistz256_neg(2**256 - 1 - P)
ecp_nistz256_neg(2**256 - 1) == ecp_nistz256_neg(2**256 - 1 - P) + P.

But, instead, ecp_nistz256_neg(2**256 - 1) == 1. That is:

input = 0001fffe
expected = 
actual: 01

Based on these two results, then, it seems that when the input is in
the range [P + 1, 2**256 - 1], the result is actually the negation
taken modulo 2**256, instead of the negation modulo p256 (perhaps not
fully reduced).

I don't know if this is actually problematic, but it was surprising to me.

It seems to me that it would be a good idea for ecp_nistz256_neg to
first reduce its input modulo P (i.e. do a conditional subtraction of
P) before it does its calculation. At least, this would make it clear
that it is correct.

Note in particular that, IIUC, ecp_nistz256_neg will never get an
unreduced input when applied to the base point multiples, because
those are already fully reduced. But, when it is used in
ecp_nistz256_windowed_mul, it isn't clear whether or how the input Y
coordinate is fully reduced mod P before being passed to ecp_nistz256_neg.

More generally, I think it might be a good idea to unit test all of
the primitive operations in ecp_nistz256, with particular emphasis
placed on whether unreduced inputs are supposed to be accepted for
certain functions and, if so, whether unreduced inputs are handled
correctly.
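
For example, a minimal harness for the case above might look like the
following, assuming it is linked against the real assembly function and
that BN_ULONG/P256_LIMBS match a 64-bit build of ecp_nistz256.c:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint64_t BN_ULONG;
#define P256_LIMBS 4

void ecp_nistz256_neg(BN_ULONG r[P256_LIMBS], const BN_ULONG a[P256_LIMBS]);

/* P + 1 and 1, as little-endian 64-bit limbs. */
static const BN_ULONG P_PLUS_1[P256_LIMBS] = {
    0x0000000000000000ULL, 0x0000000100000000ULL,
    0x0000000000000000ULL, 0xffffffff00000001ULL
};
static const BN_ULONG SMALL_ONE[P256_LIMBS] = { 1, 0, 0, 0 };

int main(void)
{
    BN_ULONG a[P256_LIMBS], b[P256_LIMBS];

    ecp_nistz256_neg(a, P_PLUS_1);  /* hoped-for: -1 mod P (maybe + P) */
    ecp_nistz256_neg(b, SMALL_ONE);

    /* Per the results above, this currently prints 0, not 1. */
    printf("neg(P+1) == neg(1): %d\n", memcmp(a, b, sizeof(a)) == 0);
    return 0;
}
```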

And also, since many of the ecp_nistz256 field arithmetic functions
are inlined into the ecp_nistz256_point functions, I think it would be
worthwhile to review that the inlined versions of those functions
actually are operating in the same way as the analogous standalone
(C-callable) ecp_nistz256_* functions.

Sorry if I'm overlooking something fundamental, which I admit is
likely. Any help is appreciated.

Cheers,
Brian
--
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] [openssl.org #4626] Re: Are the point-at-infinity checks in ecp_nistz256 correct?

2016-07-22 Thread Brian Smith via RT
Brian Smith <br...@briansmith.org> wrote:
> The issue is particularly clear when we multiply the generator by
> zero. Note that in general, an application shouldn't multiply the
> generator by zero since there's no useful cryptographic purpose for
> doing so. But, this is a convenient example.

Sorry, I was wrong. From the definition of ECDSA:

H = Hash(M).
Convert the bit string H to an integer e.
w = s**-1 mod n
u1 = (e * w) mod n
u2 = (r * w) mod n
R = u1*G + u2*Q

If the highest 256 bits of Hash(M) are zero, then e == 0 and then u1
== 0 * w == 0. So, it probably is important to handle g_scalar == 0 in
the way I described in my earlier message, using the conditional copy.

Cheers,
Brian


-- 
Ticket here: http://rt.openssl.org/Ticket/Display.html?id=4626
Please log in as guest with password guest if prompted

-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Are the point-at-infinity checks in ecp_nistz256 correct?

2016-07-22 Thread Brian Smith
Brian Smith <br...@briansmith.org> wrote:
> The issue is particularly clear when we multiply the generator by
> zero. Note that in general, an application shouldn't multiply the
> generator by zero since there's no useful cryptographic purpose for
> doing so. But, this is a convenient example.

Sorry, I was wrong. From the definition of ECDSA:

H = Hash(M).
Convert the bit string H to an integer e.
w = s**-1 mod n
u1 = (e * w) mod n
u2 = (r * w) mod n
R = u1*G + u2*Q

If the highest 256 bits of Hash(M) are zero, then e == 0 and then u1
== 0 * w == 0. So, it probably is important to handle g_scalar == 0 in
the way I described in my earlier message, using the conditional copy.

Cheers,
Brian
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] [openssl.org #4625] Re: Are the point-at-infinity checks in ecp_nistz256 correct?

2016-07-22 Thread Brian Smith via RT
The issue is particularly clear when we multiply the generator by
zero. Note that in general, an application shouldn't multiply the
generator by zero since there's no useful cryptographic purpose for
doing so. But, this is a convenient example.

In the code we have,

ecp_nistz256_gather_w7(&p.a, preComputedTable[0], wvalue >> 1);
ecp_nistz256_neg(p.p.Z, p.p.Y);
copy_conditional(p.p.Y, p.p.Z, wvalue & 1);
memcpy(p.p.Z, ONE, sizeof(ONE));

The gathered table entry is all zeros, so we'll start off with the point (0, 0,
1). Then we add the point at infinity over and over again, leaving
that point unchanged each time. Thus, we'll end with (0, 0, 1). Then:

r->Z_is_one = is_one(p.p.Z) & 1;

Thus, the resulting point will be (0, 0, 1).

After the memcpy quoted above, we need to do a copy_conditional(p.p.Z,
ZERO, is_infinity) or equivalent, where ZERO is all-zeros and where
is_infinity is the result of checking if (x, y) == (0, 0).
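
As a sketch (reusing the file's copy_conditional and is_zero; the
all-limbs test and the ZERO constant are hypothetical names):

```c
/* All-limbs-zero test in the style of is_one/is_zero. */
static BN_ULONG is_all_zero(const BN_ULONG a[P256_LIMBS])
{
    BN_ULONG acc = 0;
    int i;

    for (i = 0; i < P256_LIMBS; i++)
        acc |= a[i];
    return is_zero(acc);
}

/* ... after: memcpy(p.p.Z, ONE, sizeof(ONE)); */
BN_ULONG infty = is_all_zero(p.p.X) & is_all_zero(p.p.Y);
copy_conditional(p.p.Z, ZERO, infty); /* force Z = 0 when (X, Y) == (0, 0) */
```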

This is just one example of an edge case that is handled in a
surprising way. I think there are more, as described in the quoted
message below.

Cheers,
Brian


Brian Smith <br...@briansmith.org> wrote:
>
> Brian Smith <br...@briansmith.org> wrote:
>>
>> When doing math on short Weierstrass curves like P-256, we have to
>> special case points at infinity. In Jacobian coordinates (X, Y, Z),
>> points at infinity have Z == 0. However, instead of checking for Z ==
>> 0, p256-x86-64 instead checks for (X, Y) == (0, 0). In other words, it
>> does, in some sense, the opposite of what I expect it to do.
>
>
> I exchanged email with both of the original authors of the code, Shay and 
> Vlad. He said that the ecp_nistz256_point_* functions indeed intend to represent 
> the point at infinity as (0, 0) and it is expected (but I did not verify) 
> that those functions should work when the point at infinity is encoded as (0, 
> 0, _).
>
>> The authors
>> instead decided to encode the point at infinity as (0, 0), since the
>> affine point (0, 0) isn't on the P-256 curve. It isn't clear why the
>> authors chose to do that though, since the point at infinity doesn't
>> (can't, logically) appear in the table of precomputed multiples of G
>> anyway.
>
>
> Actually, as the code says, the point at infinity implicitly occurs in the 
> table. Obviously the accumulator point will be at infinity until 
> at least a one bit is found in the scalar.
>
>>
>> But, it seems like the functions that do the computations, like
>> ecp_nistz256_point_add and ecp_nistz256_point_add_affine, output the
>> point at infinity as (_, _, 0), not necessarily (0, 0, _). Also,
>> ecp_nistz256's EC_METHOD uses ec_GFp_simple_is_at_infinity and
>> ec_GFp_simple_point_set_to_infinity, which represent the point at
>> infinity with z == 0, not (x, y) == 0. Further ecp_nistz256_get_affine
>> uses EC_POINT_is_at_infinity, which checks z == 0, not (x, y) == 0.
>> This inconsistency is confusing, if not wrong. Given this, it seems
>> like the point-at-infinity checks in the ecp_nistz256_point_add and
>> ecp_nistz256_point_add_affine code should also be checking that z == 0
>> instead of (x, y) == (0, 0).
>
>
> Instead, when we convert a point from EC_POINT to P256_POINT or 
> P256_POINT_AFFINE, we should translate the (_, _, 0) form into the (0, 0, 0) 
> form. And/or reject points at infinity as inputs to the function. Similarly, 
> when we convert the resultant P256_POINT to an EC_POINT, we should translate 
> the (0, 0) encoding of the point at infinity to the (0, 0, 0) form or at 
> least the (_, _, 0) form.
>
> In particular, consider the case where you have an EC_POINT that isn't at 
> infinity, e.g. (x, y, 1). Then you call EC_POINT_set_to_infinity on it. Then 
> it is (x, y, 0). Then you pass it to EC_POINT_mul/EC_POINTs_mul even though 
> you shouldn't.
>
> Maybe the precondition of EC_POINT_mul/EC_POINTs_mul is that none of the 
> input points are at infinity, in which case it doesn't matter. (But if so, 
> maybe these functions should do a point-at-infinity check.) But if it is 
> allowed to pass in the point at infinity, then the ecp_nistz256 code should 
> translate the input point from (_, _, 0) form to (0, 0, _) form before doing 
> the computation. It can use is_zero and copy_conditional and friends to do 
> this.
>
> Similarly, after the computation, it should translate the (0, 0, _) form to 
> (_, _, 0) form. In particular, it should do such a translation before the 
> code sets Z_is_one, AFAICT. Note that the nistz256 code might be using the 
> form (0, 0, 0) instead of (0, 0, _) in which case this translation might not 
> be necessary.
>
> Regardless, it would be useful to write tests fo

Re: [openssl-dev] Are the point-at-infinity checks in ecp_nistz256 correct?

2016-07-22 Thread Brian Smith
The issue is particularly clear when we multiply the generator by
zero. Note that in general, an application shouldn't multiply the
generator by zero since there's no useful cryptographic purpose for
doing so. But, this is a convenient example.

In the code we have,

ecp_nistz256_gather_w7(&p.a, preComputedTable[0], wvalue >> 1);
ecp_nistz256_neg(p.p.Z, p.p.Y);
copy_conditional(p.p.Y, p.p.Z, wvalue & 1);
memcpy(p.p.Z, ONE, sizeof(ONE));

The gathered table entry is all zeros, so we'll start off with the point (0, 0,
1). Then we add the point at infinity over and over again, leaving
that point unchanged each time. Thus, we'll end with (0, 0, 1). Then:

r->Z_is_one = is_one(p.p.Z) & 1;

Thus, the resulting point will be (0, 0, 1).

After the memcpy quoted above, we need to do a copy_conditional(p.p.Z,
ZERO, is_infinity) or equivalent, where ZERO is all-zeros and where
is_infinity is the result of checking if (x, y) == (0, 0).

This is just one example of an edge case that is handled in a
surprising way. I think there are more, as described in the quoted
message below.

Cheers,
Brian


Brian Smith <br...@briansmith.org> wrote:
>
> Brian Smith <br...@briansmith.org> wrote:
>>
>> When doing math on short Weierstrass curves like P-256, we have to
>> special case points at infinity. In Jacobian coordinates (X, Y, Z),
>> points at infinity have Z == 0. However, instead of checking for Z ==
>> 0, p256-x86-64 instead checks for (X, Y) == (0, 0). In other words, it
>> does, in some sense, the opposite of what I expect it to do.
>
>
> I exchanged email with both of the original authors of the code, Shay and 
> Vlad. He said that the ecp_nistz256_point_* functions indeed intend to represent 
> the point at infinity as (0, 0) and it is expected (but I did not verify) 
> that those functions should work when the point at infinity is encoded as (0, 
> 0, _).
>
>> The authors
>> instead decided to encode the point at infinity as (0, 0), since the
>> affine point (0, 0) isn't on the P-256 curve. It isn't clear why the
>> authors chose to do that though, since the point at infinity doesn't
>> (can't, logically) appear in the table of precomputed multiples of G
>> anyway.
>
>
> Actually, as the code says, the point at infinity implicitly occurs in the 
> table. Obviously the accumulator point will be at infinity until 
> at least a one bit is found in the scalar.
>
>>
>> But, it seems like the functions that do the computations, like
>> ecp_nistz256_point_add and ecp_nistz256_point_add_affine, output the
>> point at infinity as (_, _, 0), not necessarily (0, 0, _). Also,
>> ecp_nistz256's EC_METHOD uses ec_GFp_simple_is_at_infinity and
>> ec_GFp_simple_point_set_to_infinity, which represent the point at
>> infinity with z == 0, not (x, y) == 0. Further ecp_nistz256_get_affine
>> uses EC_POINT_is_at_infinity, which checks z == 0, not (x, y) == 0.
>> This inconsistency is confusing, if not wrong. Given this, it seems
>> like the point-at-infinity checks in the ecp_nistz256_point_add and
>> ecp_nistz256_point_add_affine code should also be checking that z == 0
>> instead of (x, y) == (0, 0).
>
>
> Instead, when we convert a point from EC_POINT to P256_POINT or 
> P256_POINT_AFFINE, we should translate the (_, _, 0) form into the (0, 0, 0) 
> form. And/or reject points at infinity as inputs to the function. Similarly, 
> when we convert the resultant P256_POINT to an EC_POINT, we should translate 
> the (0, 0) encoding of the point at infinity to the (0, 0, 0) form or at 
> least the (_, _, 0) form.
>
> In particular, consider the case where you have an EC_POINT that isn't at 
> infinity, e.g. (x, y, 1). Then you call EC_POINT_set_to_infinity on it. Then 
> it is (x, y, 0). Then you pass it to EC_POINT_mul/EC_POINTs_mul even though 
> you shouldn't.
>
> Maybe the precondition of EC_POINT_mul/EC_POINTs_mul is that none of the 
> input points are at infinity, in which case it doesn't matter. (But if so, 
> maybe these functions should do a point-at-infinity check.) But if it is 
> allowed to pass in the point at infinity, then the ecp_nistz256 code should 
> translate the input point from (_, _, 0) form to (0, 0, _) form before doing 
> the computation. It can use is_zero and copy_conditional and friends to do 
> this.
>
> Similarly, after the computation, it should translate the (0, 0, _) form to 
> (_, _, 0) form. In particular, it should do such a translation before the 
> code sets Z_is_one, AFAICT. Note that the nistz256 code might be using the 
> form (0, 0, 0) instead of (0, 0, _) in which case this translation might not 
> be necessary.
>
> Regardless, it would be useful to write tests fo

Re: [openssl-dev] [openssl.org #4621] BUG: nistz256 point addition check for a = +/-b doesn't work for unreduced values

2016-07-21 Thread Brian Smith
Brian Smith via RT <r...@openssl.org> wrote:

> Finally, as I mentioned on the mailing list, it seems the function is_zero
> is missing a comparison of the last limb in the 32-bit case.
>

And of course, when I said "is_zero" I meant "is_one":
https://github.com/openssl/openssl/blob/aa6bb1352b1026b20a23b49da4efdcf171926eb0/crypto/ec/ecp_nistz256.c#L226



-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] [openssl.org #4621] BUG: nistz256 point addition check for a = +/-b doesn't work for unreduced values

2016-07-21 Thread Brian Smith via RT
Brian Smith via RT <r...@openssl.org> wrote:

> Finally, as I mentioned on the mailing list, it seems the function is_zero
> is missing a comparison of the last limb in the 32-bit case.
>

And of course, when I said "is_zero" I meant "is_one":
https://github.com/openssl/openssl/blob/aa6bb1352b1026b20a23b49da4efdcf171926eb0/crypto/ec/ecp_nistz256.c#L226



-- 
https://briansmith.org/

-- 
Ticket here: http://rt.openssl.org/Ticket/Display.html?id=4621
Please log in as guest with password guest if prompted

-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] [openssl.org #4621] BUG: nistz256 point addition check for a = +/-b doesn't work for unreduced values

2016-07-21 Thread Brian Smith via RT
I verified with the authors of the nistz256 code that all the arithmetic is
done mod P256 (P), but the results might not be fully reduced; they are only
guaranteed to be less than 2**256. Thus, all the arithmetic must handle the
cases where its inputs are
in the range [0, 2**256 - 1], not just [0, P). And, it must deal with the
cases where some values have multiple representations. For example, the
value 0 can be represented as either 0 or P + 0, 1 can be represented
either as 1 or as P + 1, etc.

The nistz256 code contains code that looks like this:

if (is_equal(U1, U2) && !in1infty && !in2infty) {
    if (is_equal(S1, S2)) {
        ...
    }
}

This code only works for values that have one representation; it doesn't
work for all values. For example, consider U1 == 1 and U2 == P + 1. Then,
is_equal(U1, U2) returns false, but in fact U1 == U2 (mod P). Thus, we
should run the code for the exceptional case (doubling or setting to the
point at infinity), but actually we just keep going with a normal addition.

You can see that the code in ecp_nistp256 and ecp_nistp224 such as
`is_zero` handles this correctly by considering both representations of 0:
0 and P + 0.

To fix this, a constant-time conditional subtraction of P should be done on
U1 and U2, and a conditional subtraction should also be done on S1 and S2.
I discussed this with some other people familiar with the code and they
agree that this seems to be a good way to do it.
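
A sketch of what that conditional subtraction could look like (my
names; this assumes BN_ULONG is a 64-bit limb, P256_LIMBS is 4, and
P256[] holds the prime's limbs):

```c
/* Map both representatives, x and x + P, to the same encoding, in
 * constant time, before is_equal() runs. */
static void felem_normalize(BN_ULONG a[P256_LIMBS])
{
    BN_ULONG t[P256_LIMBS], borrow = 0, keep;
    int i;

    for (i = 0; i < P256_LIMBS; i++) {
        unsigned __int128 d = (unsigned __int128)a[i] - P256[i] - borrow;
        t[i] = (BN_ULONG)d;
        borrow = (BN_ULONG)(d >> 127);
    }
    keep = 0 - borrow;                /* all-ones iff a < P: keep a */
    for (i = 0; i < P256_LIMBS; i++)
        a[i] = (a[i] & keep) | (t[i] & ~keep);
}

/* felem_normalize(U1); felem_normalize(U2); -- and likewise for
 * S1/S2 -- before the is_equal() calls. */
```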

Note again that it is possible for the value 0 to be represented as either
0 or as P + 0. This brings into question whether is_zero is correct,
because it doesn't consider P to be zero. Here there was some disagreement
about whether it is necessary to check for P. I personally think that it is
safer to check for both 0 and P like the nistp256 code does, because I it
is hard to make a proof that the values cannot be P.

Note also, again, that the value 1 can be represented as 1 or as P + 1. Thus,
the statement Z_is_one = is_one(...); appears to also be wrong, or at least
not obviously correct.

Finally, as I mentioned on the mailing list, it seems the function is_zero
is missing a comparison of the last limb in the 32-bit case.

Unfortunately, I don't have any test vectors for triggering any bugs that
would serve as a basis of regression tests; these were found from reading
the code.

Cheers,
Brian
-- 
https://briansmith.org/

-- 
Ticket here: http://rt.openssl.org/Ticket/Display.html?id=4621
Please log in as guest with password guest if prompted

-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Are the point-at-infinity checks in ecp_nistz256 correct?

2016-07-08 Thread Brian Smith
Brian Smith <br...@briansmith.org> wrote:

> When doing math on short Weierstrass curves like P-256, we have to
> special case points at infinity. In Jacobian coordinates (X, Y, Z),
> points at infinity have Z == 0. However, instead of checking for Z ==
> 0, p256-x86-64 instead checks for (X, Y) == (0, 0). In other words, it
> does, in some sense, the opposite of what I expect it to do.
>

I exchanged email with both of the original authors of the code, Shay and
Vlad. He said that the ecp_nistz256_point_* functions indeed intend to represent
the point at infinity as (0, 0) and it is expected (but I did not verify)
that those functions should work when the point at infinity is encoded as
(0, 0, _).

> The authors
> instead decided to encode the point at infinity as (0, 0), since the
> affine point (0, 0) isn't on the P-256 curve. It isn't clear why the
> authors chose to do that though, since the point at infinity doesn't
> (can't, logically) appear in the table of precomputed multiples of G
> anyway.


Actually, as the code says, the point at infinity implicitly occurs in the
table. Obviously the accumulator point will be at infinity until
at least a one bit is found in the scalar.


> But, it seems like the functions that do the computations, like
> ecp_nistz256_point_add and ecp_nistz256_point_add_affine, output the
> point at infinity as (_, _, 0), not necessarily (0, 0, _). Also,
> ecp_nistz256's EC_METHOD uses ec_GFp_simple_is_at_infinity and
> ec_GFp_simple_point_set_to_infinity, which represent the point at
> infinity with z == 0, not (x, y) == 0. Further ecp_nistz256_get_affine
> uses EC_POINT_is_at_infinity, which checks z == 0, not (x, y) == 0.
> This inconsistency is confusing, if not wrong. Given this, it seems
> like the point-at-infinity checks in the ecp_nistz256_point_add and
> ecp_nistz256_point_add_affine code should also be checking that z == 0
> instead of (x, y) == (0, 0).
>

Instead, when we convert a point from EC_POINT to P256_POINT or
P256_POINT_AFFINE, we should translate the (_, _, 0) form into the (0, 0,
0) form. And/or reject points at infinity as inputs to the function.
Similarly, when we convert the resultant P256_POINT to an EC_POINT, we
should translate the (0, 0) encoding of the point at infinity to the (0, 0,
0) form or at least the (_, _, 0) form.

In particular, consider the case where you have an EC_POINT that isn't at
infinity, e.g. (x, y, 1). Then you call EC_POINT_set_to_infinity on it.
Then it is (x, y, 0). Then you pass it to EC_POINT_mul/EC_POINTs_mul even
though you shouldn't.

Maybe the precondition of EC_POINT_mul/EC_POINTs_mul is that none of the
input points are at infinity, in which case it doesn't matter. (But if so,
maybe these functions should do a point-at-infinity check.) But if it is
allowed to pass in the point at infinity, then the ecp_nistz256 code should
translate the input point from (_, _, 0) form to (0, 0, _) form before
doing the computation. It can use is_zero and copy_conditional and friends
to do this.

Similarly, after the computation, it should translate the (0, 0, _) form to
(_, _, 0) form. In particular, it should do such a translation before the
code sets Z_is_one, AFAICT. Note that the nistz256 code might be using the
form (0, 0, 0) instead of (0, 0, _) in which case this translation might
not be necessary.

Regardless, it would be useful to write tests for these cases:
1. Verify that the result is correct when any of the input points are (0,
0, 0)
2. Verify that the result is correct when any of the input points are (_,
_, 0).
3. Verify that the result is correct, and in particular that Z_is_one is
set correctly on the result, when the final result is at infinity,
especially for the cases where neither the input points are at infinity,
e.g. when adding (n-1)G to 1G.

Note that all of the above cases have interesting sub-cases: the G scalar
is NULL, the G scalar is non-NULL and zero-valued, the G scalar is a
multiple of the order of G, the G scalar is larger than the order of G.
And the same for the P scalars.

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] ecp_nistz256 is_one is too liberal with what it considers to be one

2016-07-04 Thread Brian Smith
Please see the attached program and consider the following change:

```
   if (P256_LIMBS == 8) {
     res |= a[4] ^ ONE[4];
     res |= a[5] ^ ONE[5];
     res |= a[6] ^ ONE[6];
+    res |= a[7] ^ ONE[7];
   }
```

Cheers,
Brian
-- 
https://briansmith.org/
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BN_BITS2	32
typedef uint32_t BN_ULONG;

#define TOBN(hi, lo) lo, hi

#define P256_LIMBS 8

/* One converted into the Montgomery domain */
static const BN_ULONG ONE[P256_LIMBS] = {
  TOBN(0x00000000, 0x00000001), TOBN(0xffffffff, 0x00000000),
  TOBN(0xffffffff, 0xffffffff), TOBN(0x00000000, 0xfffffffe),
};

static BN_ULONG is_zero(BN_ULONG in)
{
  in |= (0 - in);
  in = ~in;
  in >>= BN_BITS2 - 1;
  return in;
}

static BN_ULONG is_one(const BN_ULONG a[P256_LIMBS])
{
  BN_ULONG res;

  res = a[0] ^ ONE[0];
  res |= a[1] ^ ONE[1];
  res |= a[2] ^ ONE[2];
  res |= a[3] ^ ONE[3];
  if (P256_LIMBS == 8) {
res |= a[4] ^ ONE[4];
res |= a[5] ^ ONE[5];
res |= a[6] ^ ONE[6];
  }

  return is_zero(res);
}

int main() {
  BN_ULONG not_one[P256_LIMBS];
  memcpy(not_one, ONE, sizeof(not_one));
  not_one[7] ^= 1;

  BN_ULONG is_it_one = is_one(not_one);
  printf("%" PRIu32 "\n", is_it_one);
  if (is_it_one) {
return 1;
  }
  return 0;
}
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] A faster addition chain for use in P-256 inversion mod n

2016-06-29 Thread Brian Smith
Hi,

I saw Vlad Krasnov's patch to optimize inversion mod n for the P-256
curve. Please see [1], which presents an addition chain that uses 9
fewer multiplications (but two more squarings, IIRC). I spent some
non-trivial effort to optimize this chain, but I wouldn't be surprised
to see somebody remove 3-4 more multiplications.

I don't think you'll want to use the code directly, but I think you
can make use of the math.

The same code has an analogous addition chain for the P-384 curve, but
it isn't optimized to the same extent.

[1] 
https://github.com/briansmith/ring/commit/ad6528f98bd228208f93c179cbbae87604c282fe#diff-b0cdc11124e9960a1faeb7baa390b50fR76
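
For anyone unfamiliar with the technique: the point of an addition
chain is to reach the exponent in fewer total operations than
bit-by-bit square-and-multiply. A toy illustration (not the actual
P-256 chain from [1]): x**15 in 5 modular operations instead of the 6
the binary method spends on 0b1111:

```c
#include <stdint.h>

static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m)
{
    return (uint64_t)(((unsigned __int128)a * b) % m);
}

/* x^15 via the chain 1 -> 2 -> 3 -> 6 -> 12 -> 15:
 * 3 squarings + 2 multiplications. */
static uint64_t pow15(uint64_t x, uint64_t m)
{
    uint64_t x2  = mulmod(x, x, m);   /* x^2  */
    uint64_t x3  = mulmod(x2, x, m);  /* x^3  */
    uint64_t x6  = mulmod(x3, x3, m); /* x^6  */
    uint64_t x12 = mulmod(x6, x6, m); /* x^12 */
    return mulmod(x12, x3, m);        /* x^15 */
}
```

The chain in [1] plays the same game with the 256-bit exponent n - 2
used for Fermat inversion.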

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Are the point-at-infinity checks in ecp_nistz256 correct?

2016-06-28 Thread Brian Smith
:sigh: I forgot the attachment with my test vectors. Here it is.

On Tue, Jun 28, 2016 at 10:43 AM, Brian Smith <br...@briansmith.org> wrote:
> When doing math on short Weierstrass curves like P-256, we have to
> special case points at infinity. In Jacobian coordinates (X, Y, Z),
> points at infinity have Z == 0. However, instead of checking for Z ==
> 0, p256-x86-64 instead checks for (X, Y) == (0, 0). In other words, it
> does, in some sense, the opposite of what I expect it to do.
>
> I have built a testing framework for exploring things like this in
> *ring*. I will attach the input file for my tests which show that
> ecp_nistz256_point_add seems to fail to recognize the point at
> infinity correctly. However, it is also possible that I just don't
> understand how ecp_nistz256 intends to work. My questions are:
>
> 1. With respect to additions of the form (a + infinity == a) and
> (infinity + b == b), is the code in ecp_nistz256_point_add and
> ecp_nistz256_point_add_affine correct?
>
> 2. if it is correct, could we add more explanation as to why it is correct?
>
> 3. Given the specifics of the implementation of the ecp_nistz256
> implementation, is it even possible for us to encounter the point at
> infinity as one of the parameters to ecp_nistz256_point_add, other
> than in the very final addition that adds g_scalar*G + p_scalar*P? See
> Section 4.1 of [1].
>
> Background: For base point (G) multiplication, the code has a large
> table of multiples of G, in affine (not Jacobian) coordinates. The
> point at infinity cannot be encoded in affine coordinates. The authors
> instead decided to encode the point at infinity as (0, 0), since the
> affine point (0, 0) isn't on the P-256 curve. It isn't clear why the
> authors chose to do that though, since the point at infinity doesn't
> (can't, logically) appear in the table of precomputed multiples of G
> anyway. Regardless, if you represent the point at infinity as (0, 0)
> then it makes sense to check (x, y) == (0, 0).
>
> But, it seems like the functions that do the computations, like
> ecp_nistz256_point_add and ecp_nistz256_point_add_affine, output the
> point at infinity as (_, _, 0), not necessarily (0, 0, _). Also,
> ecp_nistz256's EC_METHOD uses ec_GFp_simple_is_at_infinity and
> ec_GFp_simple_point_set_to_infinity, which represent the point at
> infinity with z == 0, not (x, y) == 0. Further ecp_nistz256_get_affine
> uses EC_POINT_is_at_infinity, which checks z == 0, not (x, y) == 0.
> This inconsistency is confusing, if not wrong. Given this, it seems
> like the point-at-infinity checks in the ecp_nistz256_point_add and
> ecp_nistz256_point_add_affine code should also be checking that z == 0
> instead of (x, y) == (0, 0).
>
> Note that this is confusing because `EC_POINT_new` followed by
> `EC_POINT_set_to_infinity` initializes (X, Y, Z) = (0, 0, 0). Thus, the
> check of (x, y) == (0, 0) "works" as well as the check z == 0. But, it
> doesn't work in real-life cases where the point at infinity is
> encountered during calculations, because we'll have (X, Y) != (0, 0)
> but Z == 0.
>
> The assembly language code that does this check is hard to understand
> unless one is familiar with SIMD. However, the C reference
> implementation that the assembly language code used as a model is easy
> to understand. This code can be found in the ecp_nistz256.c file.
>
> Note the parameters of ecp_nistz256_point_add are P256_POINT, not
> P256_POINT_AFFINE, so "representation of the point at infinity as (0,
> 0)" doesn't make sense to me. But, that's exactly what it checks.
>
> In ecp_nistz256_point_add_affine, it makes more sense to me, because
> parameter |b| is in fact a |P256_POINT_AFFINE|. However, |a| is not a
> |P256_POINT_AFFINE|, so the (x, y) == (0, 0) check doesn't make sense
> to me. The x86-64 and x86 assembly language code seems to emulate this
> exactly. I didn't test the ARM code, but I'd guess it is similar.
>
> [1] https://eprint.iacr.org/2014/130.pdf (Section 4.1)
>
> Here's the specific logic I'm talking about (which is also present in
> the asm code):
>
> ```
> static void ecp_nistz256_point_add(P256_POINT *r,
>                                    const P256_POINT *a, const P256_POINT *b) {
> [...]
>
> const BN_ULONG *in1_x = a->X;
> const BN_ULONG *in1_y = a->Y;
> const BN_ULONG *in1_z = a->Z;
>
> const BN_ULONG *in2_x = b->X;
> const BN_ULONG *in2_y = b->Y;
> const BN_ULONG *in2_z = b->Z;
>
> /* We encode infinity as (0,0), which is not on the curve,
>  * so it is OK. */
> in1infty = (in1_x[0] | in1_x[1] | in1_x[2] | in1_x[3] |
> in1_y[0] | in1_y[1] | in1_y[2] | in1_y[3])

[openssl-dev] Are the point-at-infinity checks in ecp_nistz256 correct?

2016-06-28 Thread Brian Smith
When doing math on short Weierstrass curves like P-256, we have to
special case points at infinity. In Jacobian coordinates (X, Y, Z),
points at infinity have Z == 0. However, instead of checking for Z ==
0, p256-x86-64 instead checks for (X, Y) == (0, 0). In other words, it
does, in some sense, the opposite of what I expect it to do.

I have built a testing framework for exploring things like this in
*ring*. I will attach the input file for my tests which show that
ecp_nistz256_point_add seems to fail to recognize the point at
infinity correctly. However, it is also possible that I just don't
understand how ecp_nistz256 intends to work. My questions are:

1. With respect to additions of the form (a + infinity == a) and
(infinity + b == b), is the code in ecp_nistz256_point_add and
ecp_nistz256_point_add_affine correct?

2. if it is correct, could we add more explanation as to why it is correct?

3. Given the specifics of the implementation of the ecp_nistz256
implementation, is it even possible for us to encounter the point at
infinity as one of the parameters to ecp_nistz256_point_add, other
than in the very final addition that adds g_scalar*G + p_scalar*P? See
Section 4.1 of [1].

Background: For base point (G) multiplication, the code has a large
table of multiples of G, in affine (not Jacobian) coordinates. The
point at infinity cannot be encoded in affine coordinates. The authors
instead decided to encode the point at infinity as (0, 0), since the
affine point (0, 0) isn't on the P-256 curve. It isn't clear why the
authors chose to do that though, since the point at infinity doesn't
(can't, logically) appear in the table of precomputed multiples of G
anyway. Regardless, if you represent the point at infinity as (0, 0)
then it makes sense to check (x, y) == (0, 0).

But, it seems like the functions that do the computations, like
ecp_nistz256_point_add and ecp_nistz256_point_add_affine, output the
point at infinity as (_, _, 0), not necessarily (0, 0, _). Also,
ecp_nistz256's EC_METHOD uses ec_GFp_simple_is_at_infinity and
ec_GFp_simple_point_set_to_infinity, which represent the point at
infinity with z == 0, not (x, y) == 0. Further ecp_nistz256_get_affine
uses EC_POINT_is_at_infinity, which checks z == 0, not (x, y) == 0.
This inconsistency is confusing, if not wrong. Given this, it seems
like the point-at-infinity checks in the ecp_nistz256_point_add and
ecp_nistz256_point_add_affine code should also be checking that z == 0
instead of (x, y) == (0, 0).

Note that this is confusing because `EC_POINT_new` followed by
`EC_POINT_set_to_infinity` initializes (X, Y, Z) = (0, 0, 0). Thus, the
check of (x, y) == (0, 0) "works" as well as the check z == 0. But, it
doesn't work in real-life cases where the point at infinity is
encountered during calculations, because we'll have (X, Y) != (0, 0)
but Z == 0.
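
A Z-based check in the same style as the existing (x, y) test would be
straightforward; a sketch, using the reference implementation's own
variable names:

```c
in1infty = (in1_z[0] | in1_z[1] | in1_z[2] | in1_z[3]);
if (P256_LIMBS == 8)
    in1infty |= (in1_z[4] | in1_z[5] | in1_z[6] | in1_z[7]);
/* in1infty == 0 now means point 1 is at infinity (Z == 0). */
```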

The assembly language code that does this check is hard to understand
unless one is familiar with SIMD. However, the C reference
implementation that the assembly language code used as a model is easy
to understand. This code can be found in the ecp_nistz256.c file.

Note the parameters of ecp_nistz256_point_add are P256_POINT, not
P256_POINT_AFFINE, so "representation of the point at infinity as (0,
0)" doesn't make sense to me. But, that's exactly what it checks.

In ecp_nistz256_point_add_affine, it makes more sense to me, because
parameter |b| is in fact a |P256_POINT_AFFINE|. However, |a| is not a
|P256_POINT_AFFINE|, so the (x, y) == (0, 0) check doesn't make sense
to me. The x86-64 and x86 assembly language code seems to emulate this
exactly. I didn't test the ARM code, but I'd guess it is similar.

[1] https://eprint.iacr.org/2014/130.pdf (Section 4.1)

Here's the specific logic I'm talking about (which is also present in
the asm code):

```
static void ecp_nistz256_point_add(P256_POINT *r,
                                   const P256_POINT *a, const P256_POINT *b) {
[...]

    const BN_ULONG *in1_x = a->X;
    const BN_ULONG *in1_y = a->Y;
    const BN_ULONG *in1_z = a->Z;

    const BN_ULONG *in2_x = b->X;
    const BN_ULONG *in2_y = b->Y;
    const BN_ULONG *in2_z = b->Z;

    /* We encode infinity as (0,0), which is not on the curve,
     * so it is OK. */
    in1infty = (in1_x[0] | in1_x[1] | in1_x[2] | in1_x[3] |
                in1_y[0] | in1_y[1] | in1_y[2] | in1_y[3]);
    if (P256_LIMBS == 8)
        in1infty |= (in1_x[4] | in1_x[5] | in1_x[6] | in1_x[7] |
                     in1_y[4] | in1_y[5] | in1_y[6] | in1_y[7]);

    in2infty = (in2_x[0] | in2_x[1] | in2_x[2] | in2_x[3] |
                in2_y[0] | in2_y[1] | in2_y[2] | in2_y[3]);
    if (P256_LIMBS == 8)
        in2infty |= (in2_x[4] | in2_x[5] | in2_x[6] | in2_x[7] |
                     in2_y[4] | in2_y[5] | in2_y[6] | in2_y[7]);

    [...]
}

static void ecp_nistz256_point_add_affine(P256_POINT *r,
                                          const P256_POINT *a,
                                          const P256_POINT_AFFINE 

Re: [openssl-dev] Bugs fixed in one place but not another

2016-06-23 Thread Brian Smith
Salz, Rich  wrote:
>> Sometimes I report bugs and/or fix bugs which get fixed in [1] and/or [2].
>> Please make sure you consider the impact of those changes on your own
>> projects.
>
> Not sure what you're asking for.

In general, I noticed that OpenSSL and LibreSSL don't seem to pay
attention to the bugs that are fixed in BoringSSL and *ring*. See, for
example:
* 
https://boringssl.googlesource.com/boringssl/+/95b97693403d5c8f09b2870ad9a6d7d198246da4%5E!/
* 
https://boringssl.googlesource.com/boringssl/+/75b833cc819a9d189adb0fdd56327bee600ff9e9
* 
https://boringssl.googlesource.com/boringssl/+/44bedc348d9491e63c7ed1438db100a4b8a830be

I think it would be a good idea for OpenSSL and LibreSSL to fix the bugs too.

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] Bugs fixed in one place but not another

2016-06-23 Thread Brian Smith
Hi,

Sometimes I report bugs and/or fix bugs which get fixed in [1] and/or
[2]. Please make sure you consider the impact of those changes on your
own projects.

[1] https://boringssl.googlesource.com/boringssl/+log/
[2] https://github.com/briansmith/ring

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Assembler warns about constants in poly1305-x86_64.pl

2016-06-22 Thread Brian Smith
Andy Polyakov  wrote:
> I can't reproduce this.

Sorry! You already fixed it here:
https://github.com/openssl/openssl/commit/284116575d375729e672256cb2b754e8362c5bce
on May 4.

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] Assembler warns about constants in poly1305-x86_64.pl

2016-06-20 Thread Brian Smith
Yasm 1.3.0 (like nasm, but it embeds debug symbols into the asm code
on Windows) reports:

poly1305-x86_64.asm(456): warning : value does not fit in 32 bit field
poly1305-x86_64.asm(459): warning : value does not fit in 32 bit field
poly1305-x86_64.asm(1346): warning : value does not fit in 32 bit field
poly1305-x86_64.asm(1349): warning : value does not fit in 32 bit field


For these lines:
and \$`-1<<31`,$d1
and \$`-1<<31`,$d2
and \$`-1<<31`,$d1
and \$`-1<<31`,$d2

I'm not sure what approach is preferable to silence these warnings though.

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Removing gcm128_context->H for non-1-bit builds

2016-06-11 Thread Brian Smith
Andy Polyakov  wrote:
> In other words we *are* talking about super-custom code with very
> special needs. As already mentioned, it would be next to impossible to
> justify customization of OpenSSL to accommodate overly specific
> requirements. And given above description it shouldn't be actually
> needed, not even previously posted patch facilitating omission of H
> should be required. I mean given knowledge about cases when H is not
> used, you can omit it from your compressed state and leave it zeroed on
> stack, right? *Or* [given that memory is seemingly at premium] you can
> choose to preserve H in your private structure, omit Htable[!] and
> initialize the latter in on-stack structure on per-call basis, per call
> to *your* super-custom subroutine that is.

Yes. Or, one could even throw away everything in the GCM context and
restart everything from the raw key, which would make it more like the
Poly1305 code.

> But in case you choose to omit H, here is "manifest".

Thanks! That is very helpful.

Cheers,
Brian
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Stitched AES-NI AES-GCM code & AVX2

2016-06-11 Thread Brian Smith
Andy Polyakov  wrote:
>> But, I think the stitched AES-NI AES-GCM code requires AVX2, not just
>> AVX.
>
> No, it doesn't. It requires exactly AVX+MOVBE.

I see. I was confused because the code says:

 if ($avx>1) {{{

I had been thinking the whole time that "$avx > 1" means that AVX2 is required.

Thanks for the correction.

Cheers,
Brian
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] Stitched AES-NI AES-GCM code & AVX2

2016-06-09 Thread Brian Smith
Hi,

I see that the stitched AES-NI AES-GCM code will be used if:

gctx->ctr==aesni_ctr32_encrypt_blocks && \
gctx->gcm.ghash==gcm_ghash_avx)

In gcm128, I see that it decides to use gcm_ghash_avx if:

/* AVX+MOVBE */
if (((OPENSSL_ia32cap_P[1] >> 22) & 0x41) == 0x41) {

But, I think the stitched AES-NI AES-GCM code requires AVX2, not just
AVX. So, I think that the condition to execute the stitched code should
be changed to also test the AVX2 flag.

Maybe in practice there are no processors that have AVX and MOVBE but
which don't have AVX2. But, better safe than sorry.
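
For what it's worth, decoding that mask (assuming the usual layout
where OPENSSL_ia32cap_P[1] holds ECX of CPUID leaf 1): 0x41 is binary
1000001, so after the shift it selects ECX bits 22 and 28, i.e. MOVBE
and AVX:

```c
#define ECX_MOVBE (1u << 22)
#define ECX_AVX   (1u << 28)

/* ((ecx >> 22) & 0x41) == 0x41
 *   <=> (ecx & (ECX_MOVBE | ECX_AVX)) == (ECX_MOVBE | ECX_AVX) */
```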

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] Notes on stitched AESNI-GCM implementation details

2016-06-08 Thread Brian Smith
In the course of reviewing the code for the stitched AESNI-GCM code, a
group of people found it difficult to understand some aspects of how
the code works. After spending some time studying the code, I wrote
these notes:

https://boringssl.googlesource.com/boringssl/+/2477adcf6236c3040a291ad1bfd53f525e1af96d%5E%21/

Feel free to incorporate these notes into OpenSSL if you find them useful.

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Removing gcm128_context->H for non-1-bit builds

2016-06-08 Thread Brian Smith
On Wed, Jun 8, 2016 at 12:40 AM, Andy Polyakov  wrote:
>> I noticed that the `H` member of `gcm128_context` seems to be
>> unnecessary for builds that aren't using the 1-bit GCM math.
>
> Not true. It is actually used in s390x assembly module. And I mean both
> H and Htable.

I see. I admit I didn't look at the s390x code, mostly because I can't
fit one in my small apartment.

>> Could somebody who understands the assembly code (probably Andy)
>> modify it to use symbolic names for the offsets that are used to
>> access Xi, H, Htable? If so, then I can write the patch to
>> conditionally exclude `H` on platforms that don't need it after
>> `CRYPTO_gcm128_init` finishes executing.

> But going the
> line of taking into consideration all corner cases is a stretch and
> should be weighed against 16 out of ~380[!] bytes waste. I'd say it's
> not worth it.

I see it both as an *optimization* and also a way to ensure
*correctness*. In particular, if the code doesn't expect H to be there
in configurations that don't use H, then some tricks that people might
use (in particular, a trick I am using) become safer.

In particular, notice that in the gcm128_context structure, there are
three kinds of state (again, only talking about non-s390x, non-1-bit
platforms):

1. State that is only used in the _init function: H.

2. State that needs to be preserved in between
authenticated-and-encrypted messages. This is `Htable`, `EK0`,
`gmult`, `ghash`, `block`, etc.

3. State that needs to be preserved only between the time you start an
authenticated-and-encrypted message and the time you end it. This
includes `len`, `EKi`, `mres`, `ares`, etc. currently. In theory this
could also include `gmult`, `ghash` and `block`, if the code were
refactored to recompute them for each message and/or if things like
the OPENSSL_STATIC_ARMCAP-type optimization allowed one to omit them
from the structure completely in some configurations where there is no
way they could vary at runtime. Also, Htable is 256 bytes on its own
(on the platforms I care about), but actually on some
platforms/configurations not all of Htable is used.

In my code, after I call the _init function, I extract out all the
numbers in category #2 and store them in my per-connection context
structure. Then, when I need to encrypt/decrypt a message, I
construct a full gcm128_context *on the stack*, zero it out, and then
fill in the values from category #2. Then I encrypt/decrypt the
message, and then throw away the gcm128_context.
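
To illustrate, here is a rough sketch of that trick; the field names
(Htable, gmult, ghash, block, key) follow my reading of the internal
modes_lcl.h, and the exact contents of category #2 should be treated as
assumptions:

/* Hypothetical snapshot of the category-#2 state (EK0 omitted for
 * brevity); this is what I keep per connection. */
typedef struct {
    u128 Htable[16];
    void (*gmult) (u64 Xi[2], const u128 Htable[16]);
    void (*ghash) (u64 Xi[2], const u128 Htable[16], const u8 *inp,
                   size_t len);
    block128_f block;
    void *key;
} GCM_SAVED_STATE;

/* Per message: rebuild a throwaway context on the stack. */
static void gcm_restore(GCM128_CONTEXT *ctx, const GCM_SAVED_STATE *st)
{
    memset(ctx, 0, sizeof(*ctx));   /* category #3 starts from zero */
    memcpy(ctx->Htable, st->Htable, sizeof(ctx->Htable));
    ctx->gmult = st->gmult;
    ctx->ghash = st->ghash;
    ctx->block = st->block;
    ctx->key = st->key;
    /* CRYPTO_gcm128_setiv() then fills in Yi, EKi, len, mres, ares */
}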

> One can argue that "the most popular embedded three-letter
> platform" deserves this 4% space optimization [by being so popular], but
> then question would be if OpenSSL can actually execute in such
> constrained environment where 4% per GCM context (i.e. something that
> itself is percentage of something else) makes a difference. Aren't we
> likely to be talking about ripping out single component and using in the
> said environment? But then question is why this specific case would have
> to complicate maintenance for OpenSSL as whole?

In other words, there are *lots* of bytes in gcm128_context that could
be thrown away in between messages, if one really needed to save
memory. And then the size of `H` does matter quite a bit more as a
percentage of the size of this inter-message state. And, also, whether
or not `H` belongs in category #1 or category #2 is important for
correctness. Thus, my suggestion in this thread is an attempt to
clarify the code to make it more obvious that it is in category #2
(besides 1-bit-mult and s390x platforms).

Note, in contrast, Poly1305 only requires 32 bytes of state to be
preserved between messages. My goal is to bring the GCM inter-message
state storage requirements closer to this 32 bytes.

> However, I can tell that assembly module
> for "the most popular embedded three-letter platform" does *not* depend
> on relative position of Xi, H and Htable. One can *probably* discuss
> that it would be appropriate to *facilitate* omission of H in context
> *other than* OpenSSL by avoiding H during most of the setup procedure.
> See attached patch for example. But do note that I'm not saying that it
> works or suggesting to include it right away, I only want to show what
> *might* be matter of discussion.

Nice. Thanks for the patch. That is actually very similar to what I've
done in my experiments. But, I was also trying to get it to work on
x86 and x86-64, where the relative position does matter. The
x86/x86-64 code is where I got confused about the math in the
assembler modules.

>> Also, I wonder how important it is to keep the 1-bit math code?
>
> Look at it as insurance. The moment they come and say table-driven
> approach is no-go, we have 1-bit code to switch to.

Understood.

Thanks a ton!

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Making assembly language optimizations working on Cortex-M3

2016-06-07 Thread Brian Smith
Peter Waltenberg  wrote:

> IF you are in the situation where you are compiling for a space
> constrained embedded processor then hopefully your engineers also have
> enough smarts to fix the code. I'd also point out that a lot of dev. setups
> for embedded aren't actually compiled on the target machine either so
> auto-detection at build time isn't that sensible anyway.
>
BoringSSL works as follows:

1. The person building the code passes -DOPENSSL_STATIC_ARMCAP and some
other flags like -DOPENSSL_STATIC_ARMCAP_NEON, to indicate which features
are available on the target.

2. When OPENSSL_STATIC_ARMCAP is defined, the runtime detection of features
is disabled.

3. When OPENSSL_STATIC_ARMCAP is defined, OPENSSL_armcap_P is fixed to a
value based on which of -DOPENSSL_STATIC_ARMCAP_NEON, etc. are defined.

4. When OPENSSL_STATIC_ARMCAP isn't defined, then everything works like it
currently does; i.e. features are detected at runtime.

The idea is that, instead of having to go in and manually edit the code for
each different target system, one can just define these flags and the code
will auto-configure itself at build-time.

> The problem here is you can't have both and having the capability switch
> at runtime depending on hardware quirks is the better option for the
> majority of users.
>
That's usually true, but the topic of this thread is about how to get
OpenSSL working well on Cortex-M microcontrollers. In those situations, we
cannot really afford the dynamic detection or the many different
implementations of the same algorithm to exist in the final image.

Cheers,
Brian
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Making assembly language optimizations working on Cortex-M3

2016-06-07 Thread Brian Smith
Andy Polyakov  wrote:
>> > Cortex-M platforms are so limited that every bit of performance and
>> > space savings matters. So, I think it is definitely worthwhile to
>> > support the non-NEON ARMv7-M configuration. One easy way to do this
>> > would be to avoid building NEON code when __TARGET_PROFILE_M is 
>> defined.
>>
>> I don't see no __TARGET_PROFILE_M defined by gcc
>>
>>
>> I see. I didn't realize that GCC didn't emulate this ARM compiler
>> feature. Never mind.
>
> But gcc defines __ARM_ARCH_7M__, which can be used to e.g.

Thanks. That's useful to know.

>> I can try to make a patch to bring BoringSSL's OPENSSL_STATIC_ARMCAP
>> mechanism to OpenSSL, if you think that is an OK approach.
>
> I don't understand. Original question was about conditional *omission*
> of NEON code (which incidentally means even omission of run-time
> switches), while BoringSSL's OPENSSL_STATIC_ARMCAP is about *keeping*
> NEON as well as run-time switch *code*, just setting OPENSSL_armcap_P to
> a chosen value at compile time... I mean it looks like we somehow
> started to talk about different things... When I wrote "care to make
> suggestion" I was thinking about going through all #if __ARM_ARCH__>=7
> and complementing some of them with !defined(something_M)...

> Compiler might remove dead code it would generate itself, but it still
> won't omit anything from assembly module. Linker takes them in as
> monolithic blocks.

If the target is Cortex-M4, there is no NEON. So then, with the
OPENSSL_STATIC_ARMCAP mechanism, we won't define OPENSSL_STATIC_ARMCAP_NEON
and so that bit of the armcap variable won't be set.

I think what you're trying to say is that, if we just stop there, then
all the NEON code will still get linked in. That's true. But, what I
mean is that we should then also change all the tests of the NEON bit
of OPENSSL_armcap_P (and, more generally, all tests of
OPENSSL_armcap_P) to use code that the C compiler can do constant
propagation and dead code elimination on. We can do this, for example,
by defining `OPENSSL_armcap_P` to be a macro that can be seen to have
a constant compile-time value, when using the OPENSSL_STATIC_ARMCAP
mechanism. And/or, we can surround the relevant code with `#if
!defined(OPENSSL_STATIC_ARMCAP) ||
defined(OPENSSL_STATIC_ARMCAP_NEON)`, etc. This latter technique would
(IIUC) work even in the assembly language files.

In this way, if we know at build time that NEON will be available, we
can avoid compiling/linking the non-NEON code. Conversely, if we know
that NEON will NOT be available, we can avoid compiling/linking the
NEON code.
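
For example (a sketch, not actual OpenSSL code; ARMV7_NEON is the
arm_arch.h bit, assuming I recall it correctly):

#if defined(OPENSSL_STATIC_ARMCAP)
# if defined(OPENSSL_STATIC_ARMCAP_NEON)
#  define OPENSSL_armcap_P (ARMV7_NEON)     /* plus any other static bits */
# else
#  define OPENSSL_armcap_P (0)
# endif
#else
extern unsigned int OPENSSL_armcap_P;       /* runtime detection, as today */
#endif

/* A test like this then folds to a compile-time constant, and the dead
 * branch can be eliminated together with the code it references: */
if (OPENSSL_armcap_P & ARMV7_NEON) {
    /* NEON path */
} else {
    /* generic path */
}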

I hope this clarifies my suggestion.

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Why is `volatile` used in MOD_EXP_CTIME_COPY_FROM_PREBUF?

2016-06-07 Thread Brian Smith
Andy Polyakov <ap...@openssl.org> wrote:
>Brian Smith wrote:
>> See
>> https://github.com/openssl/openssl/commit/d6482a82bc2228327aa4ba98aeeecd9979542a31#diff-3aca3afd18ad75a8f6a09a9860bc6ef5R631
>>
>> + volatile BN_ULONG *table = (volatile BN_ULONG *)buf;
>>
>> Why is `volatile` used here? Is it to work around the effective type
>> (strict aliasing) violations or for some other reason?
>
> Isn't it obvious? Volatile is there to discourage compiler from
> reordering loads from the the table. I mean concern is that if reordered
> in specific manner loads might give away the information we are trying
> to conceal.

Sorry, maybe these things are obvious to many people but they're not
so obvious to me. I saw that after I posted this email, you added a
comment that says something similar to what you wrote above. But, just
to be absolutely clear: the concern is that the compiler might notice,
"hey, this code is scanning this input array in a weird way. I can
redo the math (in a way that will result in non-constant-time-access
to the buffers containing secrets) so that this is much faster." So,
maybe, it is not so much the order of the accesses that matter, but
rather that the compiler might choose to do different math that
arrives at the same results, but with different timing?
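
To illustrate the pattern we're talking about, here is a simplified
sketch (the real code interleaves the table differently and derives the
mask with constant-time helpers):

/* Read *every* entry of the window table, selecting the wanted one with
 * a mask, so the memory access pattern is independent of the secret
 * index.  `volatile` discourages the compiler from turning this back
 * into a single indexed load. */
static void ct_select(BN_ULONG *out, const BN_ULONG *buf,
                      size_t nentries, size_t width, size_t idx)
{
    const volatile BN_ULONG *table = (const volatile BN_ULONG *)buf;
    size_t i, j;

    for (j = 0; j < width; j++)
        out[j] = 0;
    for (i = 0; i < nentries; i++) {
        BN_ULONG mask = 0 - (BN_ULONG)(i == idx);  /* all-ones iff i == idx */
        for (j = 0; j < width; j++)
            out[j] |= table[i * width + j] & mask;
    }
}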

>> I think it would
>> be good to document this, or better, find a way to avoid needing to use
>> `volatile` in the first place.
>
> Well, the only guaranteed way is to implement it in assembly. Note that
> on most popular/relevant platform it *is* implemented in assembly.

Yes, understood. And, in general, people should be using blinding for
RSA and avoiding the other algorithms that use this code.

Thanks for taking the time to answer my questions! I appreciate it.

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Syncing OpenSSL and BoringSSL mont ASM code

2016-06-07 Thread Brian Smith
Andy Polyakov  wrote:
>> 1. Please
>> see 
>> https://boringssl.googlesource.com/boringssl/+/75b833cc819a9d189adb0fdd56327bee600ff9e9.
>>
>> I think it would be good for OpenSSL to work with Google to integrate
>> this patch.
>
> Will be looked into...

Awesome! Besides the functions that Google changed, I think it is also
necessary to change the other implementations of `bn_mul_mont`
(including the C implementation) and also `BN_from_montgomery_word`.

>> 2. Is the `__chkstk` code that was added [1] to `bn_mul_mont` really
>> necessary?
>
> The SEGV that is mentioned in the commit message and commentary was
> actually observed and reported. Well, it's not called SEGV on Windows,
> but equivalent has same devastating effect, i.e. application crash. It
> takes super-long RSA key, longer than you'd normally use, so it's not
> something that a lot of users risk suffering from. But the problem is
> real nevertheless.

Thanks for clarifying this. I agree that it seems like a good thing to do.

This is a very large `alloca` happening in the asm code, which is
probably surprising for people reading the code. Maybe it would be a
good idea to move the alloca out into the calling function so that it
is clear that a *lot* of stack space is being used, potentially.
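
Something like this hypothetical wrapper is what I have in mind
(bn_mul_mont_asm taking an explicit scratch pointer doesn't exist today,
and the scratch size here is illustrative):

int bn_mul_mont_checked(BN_ULONG *rp, const BN_ULONG *ap, const BN_ULONG *bp,
                        const BN_ULONG *np, const BN_ULONG *n0, int num)
{
    /* Illustrative sizing; the real requirement is asm-specific. */
    size_t scratch_words = (size_t)num * 2 + 8;
    BN_ULONG stackbuf[64];
    BN_ULONG *tp = stackbuf;
    int ret;

    if (scratch_words > sizeof(stackbuf) / sizeof(stackbuf[0])
        && (tp = OPENSSL_malloc(scratch_words * sizeof(BN_ULONG))) == NULL)
        return 0;
    /* bn_mul_mont_asm is hypothetical: the existing asm entry point,
     * changed to take its scratch space from the caller. */
    ret = bn_mul_mont_asm(rp, ap, bp, np, n0, num, tp);
    if (tp != stackbuf)
        OPENSSL_free(tp);
    return ret;
}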

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] Removing gcm128_context->H for non-1-bit builds

2016-06-07 Thread Brian Smith
I noticed that the `H` member of `gcm128_context` seems to be
unnecessary for builds that aren't using the 1-bit GCM math. Since
this member is large (128-bits) and some applications may have lots of
GCM contexts relative to the amount of memory they have, I think it
would be great to only put the `H` member into the structure when the
1-bit math is used.

I tried to write a patch to do this myself, but I got stuck, because
the assembly language code does pointer math assuming that the `H`
member (which it doesn't use, AFAICT) is there. And, I can't quite
understand the assembly language code well enough to adjust all the
offsets to make the code work with `H` removed.

Could somebody who understands the assembly code (probably Andy)
modify it to use symbolic names for the offsets that are used to
access Xi, H, Htable? If so, then I can write the patch to
conditionally exclude `H` on platforms that don't need it after
`CRYPTO_gcm128_init` finishes executing.
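
For example, a small generator program (compiled against the internal
modes header; field names assumed from my reading of it) could emit the
symbolic offsets for the perlasm modules to include:

#include <stddef.h>
#include <stdio.h>
#include "modes_lcl.h"          /* internal header defining GCM128_CONTEXT */

int main(void)
{
    /* Emit symbolic offsets instead of hard-coding the pointer math. */
    printf("#define GCM128_Yi     %zu\n", offsetof(GCM128_CONTEXT, Yi));
    printf("#define GCM128_Xi     %zu\n", offsetof(GCM128_CONTEXT, Xi));
    printf("#define GCM128_H      %zu\n", offsetof(GCM128_CONTEXT, H));
    printf("#define GCM128_Htable %zu\n", offsetof(GCM128_CONTEXT, Htable));
    return 0;
}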

Also, I wonder how important it is to keep the 1-bit math code? It
seems pretty much every platform has optimized code that uses 4-bit
or 8-bit math.

Thanks,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files

2016-06-07 Thread Brian Smith via RT
Brian Smith <br...@briansmith.org> wrote:
> It seems that 32-bit ARM has the same limitation as x86 that the input and
> output pointers must match or the input and output buffers must not overlap
> at all. I'm not sure which ARM code path (NEON or non-NEON, or both) has
> this issue.

Just to follow up on this: I think this might actually be a QEMU ARM
(32-bit) emulator bug, or a configuration issue on my part. In one
version of the QEMU emulator, I have no trouble. But, in another,
newer, version of the QEMU emulator, I get results like this for
BoringSSL's chacha_test (modified to print all the results before
failing):

Mismatch at length 64 with in-place offset 1.
Mismatch at length 64 with in-place offset 2.
Mismatch at length 64 with in-place offset 5.
Mismatch at length 64 with in-place offset 6.
Mismatch at length 64 with in-place offset 9.

Notice, in particular, that it only happens when the input length is
64, and only for specific offsets. Like I said, I consistently get
these failures on the Android emulator but not in a newer version of
QEMU. It doesn't make any difference whether NEON is enabled or
disabled; I believe this is because the ARM code only uses NEON if
there are at least 3 blocks.

Anyway, I see in the ARM chacha code that there is a special case when
the length is 64, so it might be worth double-checking that code.

Just FYI.


-- 
Ticket here: http://rt.openssl.org/Ticket/Display.html?id=4362
Please log in as guest with password guest if prompted

-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] [openssl.org #3502] nameConstraints bypass bug

2016-05-31 Thread Brian Smith
On Mon, May 30, 2016 at 5:58 PM, Viktor Dukhovni  wrote:

> Name constraints in the X.509v3 PKI have not worked well, and are
> rarely used.  The attack requires a issuing CA to be willing to
> issue certificates beyond its constraints, that would be quite
> noticeable and rather unwise.  So I think this is not a major
> problem.  We should probably make a reasonable effort to address
> this, but the urgency is I think low.
>

Not too long ago, there were changes to the CABForum rules to make it
easier for any website to get a CA certificate constrained to its domain
name. There were some problems with the loosening of the rules, and
Apple has been slow to implement name constraints, so not many websites
are taking advantage of them. But I am hopeful, and I expect, that it
will soon be as easy to get a name-constrained CA certificate as it is
to get a wildcard certificate now. In fact, it is really important for
the security of many (smaller and medium-sized) websites that this
become possible, because this would make HPKP work much better and
reduce risks relative to wildcard certificates.

In particular, we should be designing things based on the assumption that
in the next few years, the owner of briansmith.org can get a CA certificate
with name constraint of dNSName=briansmith.org. Then the owner of
briansmith.org will be able to put Subject={CN=google.com} in his
certificates if he feels like it. And, we shouldn't even expect such
certificates to be revoked because they will be harmless to anybody that
does validation correctly (i.e. by either ignoring the subject CN or by
applying name constraints to the subject CN).

Is such a nuanced thing something that application developers can really be
expected to deal with on their own? I doubt it.

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] [openssl.org #3502] nameConstraints bypass bug

2016-05-30 Thread Brian Smith
Salz, Rich via RT  wrote:

> > Note that other implementations treated this as a bug and fixed it a
> long time
> > ago.
>
> What other implementations, and what did they do?  Always treating a CN as
> a DNS name?  We can't.
>

As one example, mozilla::pkix treats the CN as a dNSName/iPAddress iff
there is no subjectAltName extension and iff the CN is a valid
dNSName/iPAddress syntactically.
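
In pseudocode (the helper names here are made up, not mozilla::pkix's
actual API):

if (has_subjectAltName(cert)) {
    /* the subject CN is ignored entirely, both for name validation and
     * for name-constraint checking */
} else if (is_syntactically_valid_dNSName_or_iPAddress(subject_cn(cert))) {
    /* the CN is treated as a dNSName/iPAddress for *both* purposes */
}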


> Applications can do that now by setting the right flag, as Viktor pointed
> out.  I think it's too late to make the default change for 1.1
>

The important thing is: What happens when applications use the default
settings? If the default settings are "don't consider the subject CN for
name constraint processing, but do consider it for name validation" then
that's obviously wrong and dangerous.

Besides that, there needs to be a way to make name constraint processing
consistent with name validation. That means that if name validation is
configured (by default or with a switch) to look at subject CNs then the
name constraints need to be configurable to do the same. Name validation
and name constraint validation go hand-in-hand.


> > How about this for a heuristic:  If nameConstraints are in effect, then
> the
> > validator MUST NOT accept the CN as a DNS name.  This seems like the
> least
> > the validator could do, in light of the aforementioned deprecation.
>
> Probably.
>

If the validator has that much information then it should be simple for it
to keep the state in sync. But, often (usually?) certificate chain
validation and certificate name validation are done in separate steps by
applications, and it's difficult or impossible to keep things in sync in
that case.


> >  -- The problem is not solved until bad guys are
> >   /required/ to use SAN;
>
> Applications can make that happen now.


Again, what matters is less what applications *can* do and more what
applications *actually* do. I suspect that most applications are having the
wrong, dangerous, thing happen. In fact, that is almost definitely the
case, considering that many applications are doing their own name
validation (and thus almost definitely looking at the subject CN as that is
historically how it is done), as OpenSSL didn't provide a name validation
API until recently.

So, I agree with the others that OpenSSL is broken in this respect.

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] Syncing OpenSSL and BoringSSL mont ASM code

2016-05-25 Thread Brian Smith
1. Please see
https://boringssl.googlesource.com/boringssl/+/75b833cc819a9d189adb0fdd56327bee600ff9e9
.

I think it would be good for OpenSSL to work with Google to integrate this
patch.

2. Is the `__chkstk` code that was added [1] to `bn_mul_mont` really
necessary? I noticed that when BoringSSL integrated the patch to fix the
constant-timedness issues in bn_mul_mont, it omitted the __chkstk stuff.
Even after reading the code and the comments, it still isn't clear to me
how/why it matters.

[1]
https://github.com/openssl/openssl/commit/adc4f1fc25b2cac90076f1e1695b05b7aeeae501

Thanks,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] Why is `volatile` used in MOD_EXP_CTIME_COPY_FROM_PREBUF?

2016-05-25 Thread Brian Smith
See
https://github.com/openssl/openssl/commit/d6482a82bc2228327aa4ba98aeeecd9979542a31#diff-3aca3afd18ad75a8f6a09a9860bc6ef5R631

+ volatile BN_ULONG *table = (volatile BN_ULONG *)buf;

Why is `volatile` used here? Is it to work around the effective type
(strict aliasing) violations or for some other reason? I think it would be
good to document this, or better, find a way to avoid needing to use
`volatile` in the first place.

Thanks,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Making assembly language optimizations working on Cortex-M3

2016-05-25 Thread Brian Smith
Andy Polyakov  wrote:

> > Cortex-M platforms are so limited that every bit of performance and
> > space savings matters. So, I think it is definitely worthwhile to
> > support the non-NEON ARMv7-M configuration. One easy way to do this
> > would be to avoid building NEON code when __TARGET_PROFILE_M is defined.
>
> I don't see no __TARGET_PROFILE_M defined by gcc


I see. I didn't realize that GCC didn't emulate this ARM compiler feature.
Never mind.


> Anyway, care to make a suggestion in form of patch? That
> would be suitable even for gcc? [Just in case, no, I don't have ARM's
> compiler, only its manual.]
>

I can try to make a patch to bring BoringSSL's OPENSSL_STATIC_ARMCAP
mechanism to OpenSSL, if you think that is an OK approach.


> > Alternatively, similar to what BoringSSL did, you could have an option
> > that says "instead of doing runtime feature detection, instead detect
> > features at compile time based on __ARM_NEON__ and the like." I think
> > such a configuration would also help the C compiler do whole-program
> > optimization better.
>
> I doubt that, because compiler doesn't look at assembly modules.


For example, in the AES-GCM code, there is a runtime check to decide
between various implementations. With the OPENSSL_STATIC_ARMCAP-like
approach, in theory the compiler's constant propagation and dead code
elimination can work together to automatically optimize away the code paths
that aren't applicable to the current configuration, without needing to
maintain lots of #ifdefs.

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Making assembly language optimizations working on Cortex-M3

2016-05-25 Thread Brian Smith
[Sorry for the **long** delay in responding.]

Andy Polyakov  wrote:

>
> http://git.openssl.org/gitweb/?p=openssl.git;a=commitdiff;h=11208dcfb9105e8afa37233185decefd45e89e17
> made whole assembly pack Thumb2-friendly, so that now you should be able
> to compile all modules. Please, double-check.


This is awesome!

I have a question about the `it` and `itt` instructions you inserted. You
wrapped them in `#ifdef __thumb2__`, which is not wrong, but AFAICT is
usually unnecessary. Is this to support some old assemblers that don't
compile `it` (etc.) into nothing for non-Thumb builds?


> There is no option to
> disable NEON (yet?), because a) I want to expose it to more build cases
> to catch eventual bugs; b) would like to suggest idea of supporting
> Cortex-M with -march=armv6t2 -mthumb. Latter means that you'll loose
> some performance, because it won't utilize word load instruction's
> capability to handle misaligned access in ARMv7. But on the other hand
> it won't have ideas about compiling NEON, and you'll be excused to think
> about which particular Cortex-M is targeted, one will be able to cover
> all with single config/buid. Can it be viable compromise? One would
> still be able to tune for favorite Mx...


For Cortex-M4 and friends, one would really want to use the full
ARMv7-M instruction
set (i.e. not compile for armv6t2). In general Cortex-M platforms are so
limited that every bit of performance and space savings matters. So, I
think it is definitely worthwhile to support the non-NEON ARMv7-M
configuration. One easy way to do this would be to avoid building NEON code
when __TARGET_PROFILE_M is defined. Alternatively, similar to what
BoringSSL did, you could have an option that says "instead of doing runtime
feature detection, instead detect features at compile time based on
__ARM_NEON__ and the like." I think such a configuration would also help
the C compiler do whole-program optimization better.

Again, thanks for doing this!

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] [openssl.org #4521] openssl GCM ordering

2016-04-25 Thread Brian Smith
Praveen Kariyanahalli via RT  wrote:

> Is there a reason why openssl has restriction of auth before encrypt
> order? I don't believe there is an algo restriction, was wondering why
> openssl has this.
>

It *is* inherent in the algorithm. The authentication tag for the AAD is
computed first, then the authentication tag for the encrypted data is
computed.
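
For reference, the tag in GCM is (roughly, per NIST SP 800-38D, with A
and C zero-padded to full blocks):

    T = E_K(J0) XOR GHASH_H(A || C || [len(A)]_64 || [len(C)]_64)

GHASH absorbs the padded AAD blocks before any ciphertext blocks, so the
AAD's contribution to the tag is fixed before any ciphertext is
processed; the API's ordering requirement follows from that.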


> The reason I bring this up, is that when I broadcast/multicast traffic need
> not encrypt the payload multiple times, but need to auth the header
> differently and openssl is refusing to cooperate :)


With AEADs, in general, you can't separate the authentication from the
encryption like that.

Cheers,
Brian
-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Consistent way of making a const bignum

2016-04-20 Thread Brian Smith
Feel free to take the patch at
https://boringssl.googlesource.com/boringssl/+/7af36e1e38f54051559e2a40e6461a0c3b23b3fc%5E%21/#F0
if it helps you.

In particular, crypto/ec (ecp_nistz256, at least) also needs this, and in
fact this is already defined there.

Cheers,
Brian

On Wed, Apr 20, 2016 at 8:36 AM, 黄勤瑾  wrote:

> I'd submitted a pull request to openssl, changing the use of BN_bin2bn()
> to static const bignums in bn_const.c . I defined macro bn_pack2 :
>
> #if BN_BITS2 == 64
> #define bn_pack2(a1,a2) ((0x##a1##ULL<<32)|0x##a2##ULL)
> #elif BN_BITS2 == 32
> #define bn_pack2(a1,a2) 0x##a2##UL, 0x##a1##UL
> #else
> #error "unsupported BN_BITS2"
> #endif
>
> Use bn_pack2 to define BN_ULONG arrays:
> static const BN_ULONG num[] = {
> bn_pack2(, ),  bn_pack2(F44C42E9, A63A3620)
> };
>
> bn_const.c , bn_dh.c , bn_nist.c , bn_srp.c make const bignums
> respectively.
> richsalz wants to make them consistent. So
> if the marco bn_pack2 works well on all platforms, or the best way to do
> that?
>
> --
> openssl-dev mailing list
> To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev
>
>


-- 
https://briansmith.org/
-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] [openssl.org #4362] chacha-x86.pl has stricter aliasing requirements than other files

2016-04-16 Thread Brian Smith via RT
It seems that 32-bit ARM has the same limitation as x86 that the input and
output pointers must match or the input and output buffers must not overlap
at all. I'm not sure which ARM code path (NEON or non-NEON, or both) has
this issue.

Cheers,
Brian
-- 
https://briansmith.org/

-- 
Ticket here: http://rt.openssl.org/Ticket/Display.html?id=4362
Please log in as guest with password guest if prompted

-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] [openssl.org #4116] [PATCH] Reimplement non-asm OPENSSL_cleanse()

2015-10-31 Thread Brian Smith via RT
Alessandro Ghedini via RT  wrote:

> I was also wondering whether it would make sense to just drop the asm
> implementations. Does the speed-up justify the added complexity?
>

IMO, it should work like this:
* memset_s when memset_s is available.
* Otherwise, SecureZeroMemory, when SecureZeroMemory is available.
* Otherwise, if a flag OPENSSL_REQUIRE_SECURE_ZERO is set, fail.
* Otherwise, use an assembly language implementation, if available.
* Otherwise, emit a warning and use the C implementation.

Note in particular that the C compiler is allowed to completely defeat the
purpose of the function unless SecureZeroMemory or memset_s is used, even
if you use "volatile" or other tricks. The primary purpose of the assembly
language implementations is to reduce the possibility that the C compiler
will do the weird things that C compilers love to do.
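
A rough preprocessor sketch of that order (the OPENSSL_secure_zero,
OPENSSL_REQUIRE_SECURE_ZERO, and OPENSSL_CLEANSE_ASM names are
hypothetical):

#define __STDC_WANT_LIB_EXT1__ 1        /* ask for C11 Annex K, if present */
#include <string.h>

#if defined(__STDC_LIB_EXT1__)
# define OPENSSL_secure_zero(p, n) memset_s((p), (n), 0, (n))
#elif defined(_WIN32)
# include <windows.h>
# define OPENSSL_secure_zero(p, n) SecureZeroMemory((p), (n))
#elif defined(OPENSSL_REQUIRE_SECURE_ZERO)
# error "no guaranteed secure-zeroization primitive on this target"
#elif defined(OPENSSL_CLEANSE_ASM)
  /* use the externally-assembled OPENSSL_cleanse */
#else
  /* best-effort C implementation; emit a build-time warning here */
#endif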

Cheers,
Brian
-- 
https://briansmith.org/

___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] [openssl.org #4116] [PATCH] Reimplement non-asm OPENSSL_cleanse()

2015-10-31 Thread Brian Smith via RT
On Sat, Oct 31, 2015 at 11:50 AM, Alessandro Ghedini via RT 
wrote:

> In any case memset_s is not available anywhere anyway, so that doesn't
> really matter.

It is available in some places, e.g.
https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/memset_s.3.html.


> > * Otherwise, SecureZeroMemory, when SecureZeroMemory is available.
> > * Otherwise, if a flag OPENSSL_REQUIRE_SECURE_ZERO is set, fail.
>
> In 99% of the cases (e.g. Linux with glibc or any *BSD) that would fail,
> so I
> don't see the point in that.
>

The point is to let the person building OpenSSL say "I want the build to
fail if there isn't a secure way to zero memory, because I'm expecting
there to always be one in my configuration." Alternatively, there could be
an OPENSSL_USE_MEMSET_S flag that says "always use memset_s and never
anything else."


> > Note in particular that the C compiler is allowed to completely defeat
> the
> > purpose of the function unless SecureZeroMemory or memset_s is used, even
> > if you use "volatile" or other tricks.
>
> I don't think that is true (regarding the volatile pointer). But assuming
> that
> a broken compiler decided to optimize that call away, what's stopping it to
> optimize the call to the asm implementation as well? Also, such broken
> compiler
> probably wouldn't know about C11 either, so even if memset_s() was
> available it
> could optimize that as well.
>

Such optimizations are legal. Otherwise, C11 wouldn't have defined
|memset_s|. And, the entire purpose of |memset_s| is to disable such
optimizations.


> I don't know how compilers are supposed to treat memset_s(), but if I
> define a
> memset_s() function which just calls memset() internally, GCC (v5.2.1
> 20151028)
> optimizes that away just as it does with plain memset(), so libc
> implementations would probably need to adopt "tricks" to avoid the
> optimization
> as well.
>

Right. It has to be built into the compiler.


> FWIW OpenSSH implements the portable explicit_bzero() using the volatile
> pointer as well (unless memset_s() is detected at build time).


That is similar to what I'm suggesting.


> On OpenBSD it
> just uses OpenBSD's explicit_bzero() which is implemented using memset()
> and a
> weak function pointer.


It is a good idea to detect OpenBSD at compile time and use
|explicit_bzero|, just like for |SecureZeroMemory| on Windows.

Don't pay much attention to what tricks OpenBSD's |explicit_bzero| uses.
Those tricks are not guaranteed to work for anything other than that one
function on OpenBSD.


> But that (as used in LibreSSL) seems to have problems in
> relation to LTO, unless optimizations are specifically disabled:
> https://github.com/libressl-portable/openbsd/issues/5


Right, this is more evidence that the only correct implementation is to use
a compiler-provided |memset_s|, |explicit_bzero|, or |SecureZeroMemory|. It
is not possible to implement your own in a way that is guaranteed to work.



> On the other hand BoringSSL uses memset() with an explicit memory barrier
>

The explicit memory barrier is bad for performance.

And, it also isn't guaranteed to work. I don't speak for the BoringSSL
team, but in correspondence with them, they don't seem to care much if
OPENSSL_cleanse works or if it is used.


> > The primary purpose of the assembly language implementations is to
> reduce the
> > possibility that the C compiler will do the weird things that C
> compilers love
> > to do.
>
> According to the changelog and git log, the primary purpose of the asm
> implementations was to improve performance (see commit b2dba9b). Using the
> volatile pointer implementation would IMO make these optimizations useless,
> hence the proposal to drop them and make things simpler.


Sorry. Instead of "primary purpose" (which implies intent), I should have
said "primary advantage".  An assembly language implementation is more
likely to work than a C implementation because the C compiler generally
won't analyze externally-assembled code, so it has to assume that the
externally-assembled code has side effects, so it must call the
externally-assembled code. In theory, it is possible for the assembler, C
compiler, and the linker to work together during LTO and figure out that
the assembly language implementation doesn't have any side effects other
than zeroing memory, but that seems unlikely--much less likely than the C
compiler subverting any trick.

Note that sometimes I notice places where OpenSSL doesn't call
OPENSSL_cleanse when it seems like it would be warranted to be consistent
with other code. ecdh_compute_key [1] is one example. Generally, I don't
expect OpenSSL to (securely) zero memory, so it doesn't matter much to me.
But, if it matters to others, then this is something that would require a
substantial amount of auditing and fixing.

[1]


Re: [openssl-dev] Continuous Integration for OpenSSL

2015-08-21 Thread Brian Smith
Alessandro Ghedini alessan...@ghedini.me wrote:

> Travis has some limitations though, like the fact that it only supports
> Linux and OS X, with fairly old compiler versions (e.g. gcc 4.6).


Take a look at .travis.yml and mk/update-travis-yml.py in
https://github.com/briansmith/ring, which allows a project to be
built/tested using many versions of GCC and Clang. Recently Clang-3.7
stopped working on Travis but I think it will start working again soon.

On Windows, you can use Appveyor; see appveyor.yml in the repo linked above.

Cheers,
Brian
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Making assembly language optimizations working on Cortex-M3

2015-08-09 Thread Brian Smith
Andy Polyakov ap...@openssl.org wrote:

>> Cortex-M3 and Cortex-M4 processors are -mthumb, -march=armv7-m, which is
>> exactly the problematic configuration, if I understand that comment
>> correctly.
>
> The comment in question applies *exclusively* to cases when you attempt
> to build universal binary, one that works on multiple platforms, e.g.
> on ARMv6 and ARMv7.


[snip]

OK. Thanks for clarifying things.

>> Has anybody started working on this?


>> If not, my thinking was to refactor the assembly language code so that
>> all the ARM-only (-marm) code is cleanly separated from the unified
>> (-mthumb and -marm) code,

> As implied, there are few assembly modules that *can* be compiled for
> Thumb-2 today, namely aes-armv4, bsaes-armv7, sha256-armv4,
> sha512-armv4.


> Is there evidence that we can't adhere to this strategy of
> adjusting modules for ARM/Thumb-2 duality?


That sounds good to me.


> (BTW, can you confirm that you can get mentioned modules work?)


Yes, I am able to compile those modules:

git clone https://github.com/briansmith/openssl arm-openssl
cd arm-openssl
./configure-arm-none-eabi.sh
make depend
make

In particular, see this commit to understand the configuration:
https://github.com/briansmith/openssl/commit/ffa6e0c7a575184bc171b72a7ff3715bde460163

I am using the toolchain from [1]:

arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors)
4.9.3 20150529 (release) [ARM/embedded-4_9-branch revision 224288]


I ran out of time this weekend before I could test out the resultant
library though.


>> and recognize two new settings,
>> OPENSSL_NO_ARM_NEON and OPENSSL_ARM_THUMB_ONLY, to accommodate this.

> While NO_NEON might make sense, I really see no reason to introduce
> THUMB_ONLY. Because pre-defines set by the compiler driver are
> sufficient.


You mean, using __thumb__ (predefined for both thumb1 and thumb2) and
__thumb2__ (predefined for thumb2 only)?


> Actually, one can argue that even decision to omit NEON code
> can be made based on pre-defines, e.g. __ARM_ARCH_7M__. Well, this
> doesn't exclude possibility to define NO_NEON based on pre-define and
> using NO_NEON in code.


Right. If you are building for Cortex-A (armv7-a) then you may or may not
want the NEON detection, depending on how much you know about the users'
hardware in advance. Consequently, I think having an OPENSSL_ARM_NO_NEON
option is useful.


> A word of warning. When looking at ARM assembly code, you might find
> yourself asking "why isn't this done this way?" It's likely that answer
> to that question is because old assembler fails to compile it. I mean
> there is certain level of legacy support there and it's not a coincidence.


Understood.

I also looked at some of the past commits where thumb2 support was added,
e.g. [2].

So, basically, we just need to (a) understand which modules are thumb2-only
and which aren't, and (b) adjust more modules to work for thumb2. I am
mostly interested in getting the AES-GCM and sha1 assembly language
optimizations working for this configuration. You mentioned in your message
that you already have ghash working for thumb2. Is that right?

[1] https://launchpad.net/~terry.guo/+archive/ubuntu/gcc-arm-embedded
[2]
https://github.com/openssl/openssl/commit/e0202d946d68bd91a3e99f223c66d1fce7db136a

Thanks for your help!

Cheers,
Brian
--
https://briansmith.org/
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] Making assembly language optimizations working on Cortex-M3

2015-08-05 Thread Brian Smith
Hi,

In ./Configure, there is this comment:

# big-endian platform. This is because ARMv7 processor always
# picks instructions in little-endian order. Another similar
# limitation is that -mthumb can't cross -march=armv6t2
# boundary, because that's where it became Thumb-2. Well, this
# limitation is a bit artificial, because it's not really
# impossible, but it's deemed too tricky to support.

Cortex-M3 and Cortex-M4 processors are -mthumb, -march=armv7-m, which is
exactly the problematic configuration, if I understand that comment
correctly. I am interested in getting libcrypto working for these
processors with the assembly language optimizations enabled. Specifically,
the configuration I am interested in is:

CC=arm-none-eabi-gcc -mcpu=cortex-m3 -march=armv7-m
-mthumb  -D__ARM_MAX_ARCH__=7.

Currently, the assembly language source files don't compile because they
expect to be able to use ARM (-marm) instructions when __ARM_MAX_ARCH__ >=
7. Further, they try to do feature detection for NEON in that case, but I'd
prefer to not have the feature detection compiled in, since I know NEON is
never available.

Has anybody started working on this?

If not, my thinking was to refactor the assembly language code so that all
the ARM-only (-marm) code is cleanly separated from the unified (-mthumb
and -marm) code, move the detection of NEON from the assembly language code
to the C wrappers, and recognize two new settings, OPENSSL_NO_ARM_NEON and
OPENSSL_ARM_THUMB_ONLY, to accommodate this. Does this seem like a
reasonable strategy? It would be quite a large change so I think it would
be good to have some coordination ahead of time.

Cheers,
Brian
-- 
https://briansmith.org/
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] We're working on license changes

2015-08-04 Thread Brian Smith
On Tue, Aug 4, 2015 at 10:53 AM, Salz, Rich rs...@akamai.com wrote:

>> How about getting a second opinion?
>
> You want to hire us legal counsel who understands the issues?  Great.

Who is "us"?

It is natural for a lawyer to tell you to require lots of things to protect
whatever entity is paying them. That's defense-in-depth type advice from
them. However, lawyers do cost-benefit analysis based on the goals you give
them. If you tell them that avoiding CLAs is important then they'll help
you avoid CLAs, generally.

For an example of this, see Mozilla, in particular see [1], particularly
sections 2, 3, and 4. See also [2] where Mozilla recently gave up the
requirement to have the agreement signed. Please let me know if you want me
to put you in touch with the licensing people at Mozilla who can probably
help you do the same. I went through a similar process with them for the
mozilla::pkix license during the time I worked there, and after I worked
there.

Note that the proposed CLA is granting special privileges to a particular
**for-profit** US corporation. It isn't technically copyright assignment,
but is practically the same thing. If you read the agreement carefully, it
is asking every contributor to give a license to that for-profit
corporation to re-license contributions however it wants. It is unnecessary
to do things this way.

You could, instead, create a license agreement that asks prior contributors
to re-license their previous contributions under the new license (Apache
2.0 or hopefully something better). Only prior contributors would have to
sign it--not new contributors. Then you could just do what Mozilla does, so
that everybody has the same rights.

And, if for some reason it were necessary for new contributors to sign a
CLA, then that CLA doesn't need to grant any particular entity any rights
beyond what the new OpenSSL license already grants--i.e. it only needs to
assert that the contributor has the right to license the work under the
license that OpenSSL is using. (Again, see the Mozilla links.)

Then, everybody would be being treated fairly, because everybody would have
the exact same rights, instead of one corporation having more rights than
the rest of the contributors.

To be clear, I don't have any problem with the OpenSSL Foundation being a
for-profit corporation. But, it does make for a very different situation
than how the Apache Software Foundation[3] or even Mozilla operates, and I
think that distinction is very important when it comes to licensing.

Cheers,
Brian

[1]
https://www.mozilla.org/en-US/about/governance/policies/commit/requirements/
[2]
http://blog.gerv.net/2015/02/signed-committers-agreements-no-longer-required/
[3] http://apache.org/foundation/faq.html#is-ASF-a-corporation
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] We're working on license changes

2015-08-04 Thread Brian Smith
On Tue, Aug 4, 2015 at 2:47 PM, Salz, Rich rs...@akamai.com wrote:

>> It is natural for a lawyer to tell you to require lots of things to
>> protect whatever entity is paying them.
>
> Well, yeah, sure.  But I would hope that the bona fides of the SFLC and
> Eben Moglen aren't being called into question.


Nope. What I'm saying is that lawyers work a lot like us: They help you
build a threat model, and then they help you create a defense for that
threat model. Basically, I'm asking for more considerations to be added to
the threat model:

* The new licensing should facilitate sharing code between the BoringSSL,
LibreSSL, and OpenSSL projects, and it should be clear how this is done.
* The new licensing should facilitate using OpenSSL code with GPLv2 code,
the Linux kernel and GMP in particular. See [0] in the OpenSSL FAQ.
* The new license should treat every contributor equally. Contributors
should not have to grant privileges to any other contributor beyond the
privileges given in the license that everybody has.


>> Please let me know if you want me to put you in touch with the licensing
>> people at Mozilla who can probably help you do the same.
>
> Sure, please contact me (rsalz at openssl.org)


Sure, will do (privately).


>> To be clear, I don't have any problem with the OpenSSL Foundation being
>> a for-profit corporation. But, it does make for a very different situation
>> than how the Apache Software Foundation[3] or even Mozilla operates, and I
>> think that distinction is very important when it comes to licensing.
>
> Since Matt has explained that we're not a for-profit corporation, I assume
> that this is no longer a concern for you.  We are *not* a tax-exempt
> charitable organization, but we are not for profit.


The OpenSSL website says[1] "the OpenSSL Software Foundation (OSF) is
incorporated in the United States as a regular for-profit corporation", and
the proposed CLA[2] is an agreement between the contributor and that
for-profit corporation.

Anyway, I don't think we need to rathole on that, because my point is that
there should be a way to do the licensing that doesn't require any CLA for
future contributions, but only for past contributions.

[0] https://www.openssl.org/support/faq.html#LEGAL2
[1] https://openssl.org/support/donations.html
[2] https://www.openssl.org/licenses/openssl_icla.pdf

Cheers,
Brian
-- 
https://briansmith.org/
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] We're working on license changes

2015-07-31 Thread Brian Smith
On Fri, Jul 31, 2015 at 12:29 PM, Hanno Böck ha...@hboeck.de wrote:

> Salz, Rich rs...@akamai.com wrote:
>
>> Please see https://www.openssl.org/blog/blog/2015/08/01/cla/ for some
>> more details.
>>
>> Summary:  Moving to Apache 2, CLA's coming, it will take time.
>
> This is a huge step if it works (I leave it up to the lawyers to decide
> if it will), but I want to question whether Apache License is really a
> wise move.


[snip]

> In the spirit of making OpenSSL as useful as possible for everyone I
> would consider a permissive license that's more compatible (e.g. MIT) a
> wiser choice.


I agree 100%. What is wrong with the ISC-style license that LibreSSL and
BoringSSL have been using to share code? Why not use that same license for
new code? The ability to share code between these projects is hugely
valuable, especially when it comes to getting security problems fixed in a
timely and secure manner.

Also, I question the need for people to sign a CLA to contribute to
OpenSSL. OpenSSL has been very successful for decades without a CLA
requirement. Lots of other projects are extremely successful without a CLA.
A CLA seems unnecessary.

Cheers,
Brian

[1] https://www.imperialviolet.org/2014/06/20/boringssl.html (end of
document)
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] Concerns regarding bn_wexpand/bn_expand2/bn_expand_internal

2015-04-28 Thread Brian Smith
Although bn_wexpand is a private function within the crypto/bn module, it
is exposed to other modules through bn_int.h. It is used from outside
crypto/bn quite frequently.

bn_wexpand is implemented in terms of a function called bn_expand2:

/*
 * This is an internal function that should not be used in applications. It
 * ensures that 'b' has enough room for a 'words' word number and initialises
 * any unused part of b->d with leading zeros. It is mostly used by the
 * various BIGNUM routines. If there is an error, NULL is returned. If not,
 * 'b' is returned.
 */

BIGNUM *bn_expand2(BIGNUM *b, int words)
{
    bn_check_top(b);

    if (words > b->dmax) {
        BN_ULONG *a = bn_expand_internal(b, words);
        if (!a)
            return NULL;
        if (b->d)
            OPENSSL_free(b->d);
        b->d = a;
        b->dmax = words;
    }

    bn_check_top(b);
    return b;
}

My understanding of "any unused part of b->d" would be something like
|memset(b->d + b->top, 0, (words - b->top) * sizeof(BN_ULONG))|.

But, does bn_expand2 actually zeroize anything?

In bn_expand_internal, we see:

[...]

#ifdef PURIFY
/*
 * Valgrind complains in BN_consttime_swap because we process the whole
 * array even if it's not initialised yet. This doesn't matter in that
 * function - what's important is constant time operation (we're not
 * actually going to use the data)
 */
memset(a, 0, sizeof(BN_ULONG) * words);
#endif

It seems like when PURIFY is not set, then bn_expand2 is not meeting its
contract. I wrote a test like so:

BIGNUM b;
BN_init(&b);
bn_wexpand(&b, 1000);

I then iterated through all the words of b to verify that all 1000 words
were zero, but in fact they were not. That is concerning.
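
Roughly, the check looked like this (error handling elided; bn_wexpand's
exact return type varies between versions):

BIGNUM b;
int i, nonzero = 0;

BN_init(&b);
bn_wexpand(&b, 1000);               /* check the return value in real code */
for (i = 0; i < 1000; i++)
    if (b.d[i] != 0)
        nonzero++;
printf("%d of 1000 words were not zeroed\n", nonzero);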

Here's the RT for this changeset:
https://rt.openssl.org/Ticket/Display.html?id=3415&user=guest&pass=guest

Quote: "This seems to be the most pragmatic way forward, although it does
have the disadvantage that this will mask any other future problems in the
bn library due to incorrect uninitialised reads of this data."

I think that masking such problems is too big a disadvantage to be allowed,
in general, because it effectively disables uninitialized memory checking
for BIGNUM, which is one of the most important things to have uninitialized
memory checking for.

But, if such a workaround is really needed for BN_consttime_swap, then I
think that workaround should be put into BN_consttime_swap instead of into
this lower-level function. Even more preferably, the fix would appear in
the caller of BN_consttime_swap.

Cheers,
Brian
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] ecp_nistz256 correctness/constant-timedness

2015-04-26 Thread Brian Smith
On Fri, Apr 24, 2015 at 5:54 AM, Emilia Käsper emi...@openssl.org wrote:

> commit c028254b12 fixes 1., 2. and 3. (also applied to 1.0.2).
> commit 53dd4ddf71 fixes 5 and some of 4.
>
> Still ploughing my way through the rest of error checking.



Great.

I want to call your attention to one particularly non-obvious failure to
handle errors correctly:

static void ecp_nistz256_windowed_mul([...], P256_POINT *r, [...])
{
[...]

    if ((num * 16 + 6) > OPENSSL_MALLOC_MAX_NELEMS(P256_POINT)
        || (table_storage =
            OPENSSL_malloc((num * 16 + 5) * sizeof(P256_POINT) + 64)) == NULL
        || (p_str =
            OPENSSL_malloc(num * 33 * sizeof(unsigned char))) == NULL
        || (scalars = OPENSSL_malloc(num * sizeof(BIGNUM *))) == NULL) {
        ECerr(EC_F_ECP_NISTZ256_WINDOWED_MUL, ERR_R_MALLOC_FAILURE);
        goto err;
    }

[...]

err:
[...]
}

ecp_nistz256_windowed_mul checks for errors, but it doesn't report the fact
that an error occurred to the caller, because it has return type |void|.
And the caller doesn't check whether ecp_nistz256_windowed_mul failed; it
can't, because of the void return type.
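
The shape of the fix, as a self-contained sketch (windowed_mul is a
made-up stand-in, not the real ecp_nistz256 code):

#include <stdio.h>
#include <stdlib.h>

/* Made-up stand-in for ecp_nistz256_windowed_mul: a helper that can
 * fail internally must return a status the caller can check. */
static int windowed_mul(double **table_out, size_t num)
{
    double *table = malloc(num * 16 * sizeof(*table));
    if (table == NULL)
        return 0;               /* with a void return, this was invisible */
    /* ... fill the window table and do the multiplication ... */
    *table_out = table;
    return 1;
}

int main(void)
{
    double *table;

    if (!windowed_mul(&table, 4)) {    /* caller can now take its error path */
        fprintf(stderr, "windowed_mul failed\n");
        return 1;
    }
    free(table);
    return 0;
}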

Cheers,
Brian
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] ecp_nistz256 correctness/constant-timedness

2015-04-23 Thread Brian Smith
Hi,

I have some questions regarding this implementation.

https://github.com/openssl/openssl/blob/e0e920b1a063f14f36418f8795c96f2c649400e1/crypto/ec/ecp_nistz256.c

1. In ecp_nistz256_points_mul, we have this code:

    if (BN_num_bits(scalar) > 256) {
        ...
        if (!BN_nnmod(tmp_scalar, scalar, &group->order, ctx)) {...}
        scalar = tmp_scalar;
    }

I think it would be useful to add a comment about why this is OK in terms
of the constant-time-correctness of the code, because it isn't obvious.

2. Later in the same function, we have this code:

bn_correct_top(r->X);
bn_correct_top(r->Y);
bn_correct_top(r->Z);

Again, it isn't clear why it is OK to call bn_correct_top given that
bn_correct_top isn't a constant-time function. I think either this code
should be changed so that it is constant time, or a comment should be added
explaining why it doesn't need to be.
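
For what it's worth, a branch-free 'top' computation is easy to sketch
(illustrative only; it also assumes the compiler lowers d[i] == 0 without
a branch, which would need checking per platform):

#include <openssl/bn.h>

/* Illustrative sketch: scan every word unconditionally so the running
 * time depends only on dmax, never on the value. Returns the number of
 * significant words, as bn_correct_top computes. */
static int constant_time_top(const BN_ULONG *d, int dmax)
{
    int i, top = 0;

    for (i = 0; i < dmax; i++) {
        /* mask is all-ones iff d[i] != 0, else zero */
        BN_ULONG mask = ~(BN_ULONG)0 + (BN_ULONG)(d[i] == 0);
        top = (int)((mask & (BN_ULONG)(i + 1)) | (~mask & (BN_ULONG)top));
    }
    return top;
}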

3. When doing the initial adaptation of the code to get it working inside
of BoringSSL, I had to make this change:

bn_correct_top(r->X);
bn_correct_top(r->Y);
bn_correct_top(r->Z);
+  r->Z_is_one = is_one(p.p.Z);

(Alternatively, maybe one can change BoringSSL's implementation
of EC_POINT_is_at_infinity to ignore r->Z_is_one like OpenSSL's
implementation does.)

Looking at the OpenSSL code, I can see that Z_is_one is only used for
optimization purposes in the simple ECC implementation. Even ignoring
BoringSSL being different, I found it confusing that sometimes Z_is_one
*must* be set correctly and other times it *must* be ignored. Perhaps it is
worth renaming this member to simple_Z_is_one and/or documenting more
clearly when it is maintained and when it can and cannot be used.

Alternatively, I noticed that BN_is_one is not a very expensive function,
and probably can be made even less expensive, so the optimization of using
the Z_is_one flag instead of just calling BN_is_one may not be worthwhile.
Perhaps it would be better to remove it completely?

4. There seems to be quite a bit of missing error checking in this code.
For example, the return values of calls to bn_wexpand are not checked in
multiple functions. Further, a lot of error checking is missing in the
probably-never-used ecp_nistz256_mult_precompute function, e.g. for the
calls to EC_POINT_new, EC_POINT_copy, and
ec_GFp_simple_{add,dbl,make_affine}. I think this whole file needs to be
combed for missing error checks.
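
For example, each bn_wexpand call needs roughly this shape (a sketch using
the file's P256_LIMBS; the error path would be whatever the enclosing
function already uses):

    if (bn_wexpand(r->X, P256_LIMBS) == NULL
        || bn_wexpand(r->Y, P256_LIMBS) == NULL
        || bn_wexpand(r->Z, P256_LIMBS) == NULL)
        goto err;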

5. In ecp_nistz256_mult_precompute, if the ctx parameter is null, a new
context will be created with BN_CTX_new but BN_CTX_free is not called
anywhere in the file.
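
The usual pattern elsewhere in the EC code would fix this; roughly (a
sketch of the shape, not the exact diff):

    BN_CTX *new_ctx = NULL;

    if (ctx == NULL)
        ctx = new_ctx = BN_CTX_new();
    if (ctx == NULL)
        goto err;
    /* ... the precomputation ... */
 err:
    BN_CTX_free(new_ctx);       /* NULL-safe; frees only the ctx we created */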

All that aside, this code is very fast, and it is awesome that it got
adapted to non-X64 platforms already!

Cheers,
Brian
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl-dev] ec_wNAF_precompute_mult sometimes calls BN_CTX_end without BN_CTX_start

2015-04-23 Thread Brian Smith
See
https://boringssl.googlesource.com/boringssl/+/d6405beb2c75564edbbb7f4203252aa5edee2359

It seems OpenSSL has the same bug, although the code is slightly different,
and my fix for BoringSSL should be easy to adapt for OpenSSL.

It may be worth trying to find the same pattern elsewhere in the OpenSSL
code.
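
The invariant to look for, as a self-contained sketch (example() is
illustrative, not the actual ec_wNAF code): BN_CTX_end must be unreachable
on any path where BN_CTX_start has not yet run.

#include <openssl/bn.h>

/* Illustrative only: an error path must not reach BN_CTX_end unless
 * BN_CTX_start has already run on that same path. */
static int example(BN_CTX *ctx)
{
    int ok = 0, started = 0;
    BIGNUM *t;

    if (ctx == NULL)
        goto err;               /* no BN_CTX_start yet: skip BN_CTX_end */
    BN_CTX_start(ctx);
    started = 1;
    if ((t = BN_CTX_get(ctx)) == NULL)
        goto err;
    /* ... use t ... */
    ok = 1;
 err:
    if (started)
        BN_CTX_end(ctx);
    return ok;
}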

Cheers,
Brian
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] s3_clnt.c changes regarding external pre-shared secret seem to break EAP-FAST

2015-04-01 Thread Brian Smith
Emilia Käsper emi...@openssl.org wrote:
> On Fri, Mar 27, 2015 at 10:40 PM, Brian Smith br...@briansmith.org wrote:
>> If OpenSSL's client code were changed to always use an empty session
>> ID when attempting resumption using a session ticket, then the
>> EAP-FAST case wouldn't be different from the general session ticket
>> resumption case. I think that this is a cleaner approach.
>
> 1) The way EAP-FAST diverges from 5246 and 5077 is indeed quite
> unfortunate. The lookahead is messy, and hard to get right - I don't want
> another early CCS due to lack of determinism in the state machine. Setting
> the session ID is much cleaner. So, I'd rather put in a workaround that is
> specific to EAP-FAST and doesn't affect regular handshakes.

The added complexity of having a special case for EAP-FAST seems worse
to me. After all, it's not OK to have EAP-FAST be non-secure, and so
it is important to have the no-session-ID case be correct regardless.

> 2) Removing the session ID upon resumption would be a big change in
> behaviour that I don't think would qualify for a stable branch anyway unless
> there was a security or regression issue behind it.

Fair enough. I have no idea what compatibility problems might arise. If I
have some time, I might try to change one of the web browsers to do this,
to see what happens. If I do, I'll report back.

Cheers,
Brian
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] s3_clnt.c changes regarding external pre-shared secret seem to break EAP-FAST

2015-03-27 Thread Brian Smith
Erik Tkal etks...@gmail.com wrote:
> In order for EAP-FAST to work it seems that if the client does have a
> tls_session_secret that s->hit must NOT be set since there is no indication
> in the serverHello as to whether the session_ticket sent by the client is
> accepted by the server (the sessionTicket extension is not sent by the
> server in EAP-FAST)

[snip]

Although RFC4851 (an informational RFC documenting EAP-FAST) does
not require the server to send the session ticket extension during
resumption, it is based on RFC4507/RFC5077 (which are on the standards
track), which *does* require the server to send the extension. So,
this is a bug in the non-conformant servers, not in the openssl
client.

The non-standard mechanism recommended by RFC4851 for distinguishing
resumption vs. full handshakes in EAP-FAST is quite unfortunate. We
should update RFC4851 to require standard RFC5077 semantics to be
used. Is there any effort underway to update RFC4851 for this or other
reasons? It is worth filing an erratum against the document, at least.

It would be better to fix this bug on the server side (by having servers
send the session ticket extension during resumption, as required by RFC
5077) than in the openssl client.

Cheers,
Brian
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] s3_clnt.c changes regarding external pre-shared secret seem to break EAP-FAST

2015-03-27 Thread Brian Smith
Brian Smith br...@briansmith.org wrote:
> Although RFC4851 (an informational RFC documenting EAP-FAST) does
> not require the server to send the session ticket extension during
> resumption, it is based on RFC4507/RFC5077 (which are on the standards
> track), which *does* require the server to send the extension. So,
> this is a bug in the non-conformant servers, not in the openssl
> client.

Sorry. It seems I am wrong about this. RFC 5077 says: "It is also
permissible to have an exchange similar to Figure 3 using the
abbreviated handshake defined in Figure 2 of RFC 4346, where the
client uses the SessionTicket extension to resume the session, but the
server does not wish to issue a new ticket, and therefore does not
send a SessionTicket extension."

AFAICT this means that, even outside of EAP-FAST, it is allowed for
the server to resume a session using a session ticket without sending
the session ticket extension in its ServerHello message.

Also, note that RFC 5077 section 3.4 allows the client to use a
session ticket and an empty session ID to resume a session, instead of
generating a fake session ID for the session ticket: "Alternatively,
the client MAY include an empty Session ID in the ClientHello.  In
this case, the client ignores the Session ID sent in the ServerHello
and determines if the server is resuming a session by the subsequent
handshake messages."

If OpenSSL's client code were changed to always use an empty session
ID when attempting resumption using a session ticket, then the
EAP-FAST case wouldn't be different from the general session ticket
resumption case. I think that this is a cleaner approach.

Note that RFC4851 would likely still need to be updated, because TLS
1.3 will most likely remove the ChangeCipherSpec messages, and
RFC4851's recommended resumption detection is based on detecting
ChangeCipherSpec messages.

Cheers,
Brian
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] [openssl.org #3721] Patch for additional checking of self-signed certificates

2015-02-27 Thread Brian Smith
Short, Todd via RT r...@openssl.org wrote:
> Check that in matching issuer/subject certs, that a self-signed subject
> also has a self-signed issuer.
> Given that the subject certificate is self-signed, it means that the issuer
> and the subject are the same certificate. This change verifies that.
>
> Github link:
> https://github.com/akamai/openssl/commit/faff94b732472616828fe724e09053f134ebb88b

Could you explain this more?

In your patch, there is a comment that says "Input certificate
(subject) is self signed". But, the test is that the issuer name
equals the subject name. That means the certificate is self-*issued*,
not self-*signed*.
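
To make the distinction concrete, here is a minimal sketch against the
public X509 API (is_self_issued and is_self_signed are hypothetical
helpers, not code from the patch):

#include <openssl/x509.h>

/* Self-issued: the subject and issuer names match. */
static int is_self_issued(X509 *cert)
{
    return X509_NAME_cmp(X509_get_subject_name(cert),
                         X509_get_issuer_name(cert)) == 0;
}

/* Self-signed: self-issued AND the signature verifies with the
 * certificate's own public key. */
static int is_self_signed(X509 *cert)
{
    int ret = 0;
    EVP_PKEY *key = X509_get_pubkey(cert);      /* up-refs; must be freed */

    if (key != NULL) {
        ret = is_self_issued(cert) && X509_verify(cert, key) == 1;
        EVP_PKEY_free(key);
    }
    return ret;
}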

Consider this chain:

{ Subject=Foo, Issuer=Foo, Key=Key1, Signed by Key2 }
{ Subject=Foo, Issuer=Foo, Key=Key2, Signed by Key3 }
{ Subject=Foo, Issuer=Foo, Key=Key3, Signed by Key3, Trust Anchor }

All three certificates are self-issued. The issuer of the first
certificate is not self-signed but it is self-issued. But, it being
self-issued doesn't matter because it isn't a trust anchor.

Consider this chain:

{ Subject=Foo, Issuer=Foo, Key=Key1, Signed by Key1 }
{ Subject=Foo, Issuer=Bar, Key=Key1, Signed by Key2 }
{ Subject=Bar, Issuer=Bar, Key=Key2, Signed by Key2, Trust Anchor }

The first certificate is self-signed and self-issued. Its issuer is
neither self-signed nor self-issued, so your patch would reject this
chain. But, this is a valid chain.

Cheers,
Brian
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] OpenSSL patches and enhancements from Akamai

2015-02-13 Thread Brian Smith
Very cool.

Short, Todd tsh...@akamai.com wrote:
> * Check that in matching issuer/subject certs, that a self-signed subject
> also has a self-signed issuer

Could you explain this one? It isn't necessarily the case that a
self-signed subject has a self-signed issuer in PKIX, if I am
understanding you correctly.

Cheers,
Brian
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Seeking feedback on some #ifdef changes

2015-02-10 Thread Brian Smith
On Tue, Feb 10, 2015 at 4:25 PM, Salz, Rich rs...@akamai.com wrote:

>> Please continue to make it possible to build the crypto part of OpenSSL,
>> without the X.509 and SSL/TLS code. There are lots of uses of OpenSSL that
>> don't need that code.
>
> You can build crypto without ssl.  And the only place OPENSSL_NO_X509
> appeared was, strangely, in ssl.  So crypto builds, which didn't really pay
> attention to OPENSLS_NO_x509 before, still don't.  :)

The X.509 code is in crypto, though.

Also, when I've had a use for libssl, I've usually wanted to use an
X.509 library other than OpenSSL's, for a variety of reasons. It would
be useful to remove all of libssl's dependencies on the X.509 code
during the time that the other compatibility-breaking changes are
being made.

Cheers,
Brian
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] Seeking feedback on some #ifdef changes

2015-02-10 Thread Brian Smith
Salz, Rich rs...@akamai.com wrote:
> OPENSSL_NO_CHAIN_VERIFY
> OPENSSL_NO_RFC3779
> OPENSSL_NO_TLS
> OPENSSL_NO_TLS1
> OPENSSL_NO_TLS1_2_CLIENT
> OPENSSL_NO_TLSEXT
> OPENSSL_NO_X509
> OPENSSL_NO_X509_VERIFY

Please continue to make it possible to build the crypto part of
OpenSSL, without the X.509 and SSL/TLS code. There are lots of uses of
OpenSSL that don't need that code.

Actually, I'd prefer if it were possible to build without any of the
ASN.1 code too, but I understand that is messy because the EVP
interface assumes the ASN.1 parser is available. But continuing to
support builds without the X.509 and SSL/TLS code should be easy.

Cheers,
Brian
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


Re: [openssl-dev] [openssl.org #3616] [Patch] Implement option to disable sending TLS extensions

2015-01-26 Thread Brian Smith
Hubert Kario hka...@redhat.com wrote:
> Actually it does not introduce it as OpenSSL does send the notification as
> TLS_EMPTY_RENEGOTIATION_INFO_SCSV, not the extension.
>
> On Sunday 30 November 2014 20:36:20 Richard Moore wrote:
>> That would introduce security issues such as the TLS renegotiation flaw.
>> Surely a better solution is to make servers that pretend to support TLS but
>> actually only support SSL3 die a horrible death?

I agree with Richard that this seems like a bad idea. In particular, the
session hash / extended master secret [1] specification requires an
extension to work securely. Not having the SNI extension is likely to
cause security issues (using a different and perhaps
thought-of-as-unused certificate). Many servers use the values in the
signature_algorithms extension to determine whether to use a SHA-2 or
SHA-1 certificate, so not sending signature_algorithms is likely to
cause problems for any client that disables support for SHA-1
certificates.

Resolving these TLS (extension) intolerance issues requires collective
action, and it would be great if OpenSSL could do its part by not
adding features like this that exist purely to avoid participating in
the collective action, especially when the added feature disables
other important security features.

Cheers,
Brian

[1] https://tools.ietf.org/html/draft-bhargavan-tls-session-hash-00
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev


[openssl.org #1680] Possible Bug: x509_name_ex_d2i segmentation fault under IA-64

2008-05-27 Thread Brian Smith via RT
Detail:
When writing a function to generate a CSR via OpenSSL on IA-64, the
X509_REQ_set_subject_name function eventually calls x509_name_ex_d2i, which
causes a segmentation fault under IA-64. This code works properly on
Solaris, IRIX, win32 (cygwin), and x86_64.

Additional Details:
OpenSSL compiled with debugging allows the function to operate properly
without crashing.
Inserting printf("here") statements between lines of x509_name_ex_d2i also
allows the function to operate without a seg fault when built normally.

Possibly Known Related Cases:
http://www.mail-archive.com/openssl-dev@openssl.org/msg20993.html

Test Setup:

Linux 2.6.5-7.252.PTF-sn1 SMP
Openssl 0.9.7m Configured linux-ia64 -fPIC
gcc version 3.3.3 (SuSE Linux)

GDB Dump:

Program received signal SIGSEGV, Segmentation fault.
0x400a58c0 in x509_name_ex_d2i ()
(gdb) where
#0  0x400a58c0 in x509_name_ex_d2i ()
#1  0x400ad1f0 in ASN1_item_ex_d2i ()
#2  0x400ae1a0 in ASN1_item_d2i ()
#3  0x4011a860 in ASN1_item_dup ()
#4  0x400a5b90 in X509_NAME_dup ()
#5  0x400a5cb0 in X509_NAME_set ()
#6  0x400bd760 in X509_REQ_set_subject_name ()

Thanks,
Brian Smith


compile failure on Solaris for X86

1999-12-21 Thread Brian Smith

Your INSTALL file says to mail the make errors here. Hope you can make more
sense of this than I can!

I tried building this with and without the no-asm option, with the same
results both times.

I am trying to compile openssl-0.9.4


making all in crypto...
( echo "#ifndef MK1MF_BUILD"; \
echo "  /* auto-generated by crypto/Makefile.ssl for crypto/cversion.c */"; \
echo "  #define CFLAGS \"gcc -DTHREADS -D_REENTRANT -O3 -fomit-frame-pointer -m486 
-Wall -DL_ENDIAN -DSHA1_ASM -DMD5_ASM -DRMD160_ASM\""; \
echo "  #define PLATFORM \"solaris-x86-gcc\""; \
echo "  #define DATE \"`date`\""; \
echo "#endif" ) buildinf.h
gcc -I. -I../include -DTHREADS -D_REENTRANT -O3 -fomit-frame-pointer -m486 -Wall 
-DL_ENDIAN -DSHA1_ASM -DMD5_ASM -DRMD160_ASM  -c  cryptlib.c
gcc -I. -I../include -DTHREADS -D_REENTRANT -O3 -fomit-frame-pointer -m486 -Wall 
-DL_ENDIAN -DSHA1_ASM -DMD5_ASM -DRMD160_ASM  -c  mem.c
gcc -I. -I../include -DTHREADS -D_REENTRANT -O3 -fomit-frame-pointer -m486 -Wall 
-DL_ENDIAN -DSHA1_ASM -DMD5_ASM -DRMD160_ASM  -c  cversion.c
gcc -I. -I../include -DTHREADS -D_REENTRANT -O3 -fomit-frame-pointer -m486 -Wall 
-DL_ENDIAN -DSHA1_ASM -DMD5_ASM -DRMD160_ASM  -c  ex_data.c
gcc -I. -I../include -DTHREADS -D_REENTRANT -O3 -fomit-frame-pointer -m486 -Wall 
-DL_ENDIAN -DSHA1_ASM -DMD5_ASM -DRMD160_ASM  -c  tmdiff.c
gcc -I. -I../include -DTHREADS -D_REENTRANT -O3 -fomit-frame-pointer -m486 -Wall 
-DL_ENDIAN -DSHA1_ASM -DMD5_ASM -DRMD160_ASM  -c  cpt_err.c
ar r ../libcrypto.a cryptlib.o mem.o cversion.o ex_data.o tmdiff.o cpt_err.o
/usr/ccs/bin/ranlib ../libcrypto.a
making all in crypto/md2...
gcc -I.. -I../../include -DTHREADS -D_REENTRANT -O3 -fomit-frame-pointer -m486 -Wall 
-DL_ENDIAN -DSHA1_ASM -DMD5_ASM -DRMD160_ASM  -c  md2_dgst.c
gcc -I.. -I../../include -DTHREADS -D_REENTRANT -O3 -fomit-frame-pointer -m486 -Wall 
-DL_ENDIAN -DSHA1_ASM -DMD5_ASM -DRMD160_ASM  -c  md2_one.c
ar r ../../libcrypto.a md2_dgst.o md2_one.o
/usr/ccs/bin/ranlib ../../libcrypto.a
making all in crypto/md5...
gcc -I.. -I../../include -DTHREADS -D_REENTRANT -O3 -fomit-frame-pointer -m486 -Wall 
-DL_ENDIAN -DSHA1_ASM -DMD5_ASM -DRMD160_ASM  -c  md5_dgst.c
gcc -I.. -I../../include -DTHREADS -D_REENTRANT -O3 -fomit-frame-pointer -m486 -Wall 
-DL_ENDIAN -DSHA1_ASM -DMD5_ASM -DRMD160_ASM  -c  md5_one.c
(cd asm; /usr/local/bin/perl md5-586.pl cpp mx86unix.cpp)
gcc -E -DSOL asm/mx86unix.cpp | sed 's/^#.*//' > asm/mx86-sol.s
as -o asm/mx86-sol.o asm/mx86-sol.s
rm -f asm/mx86-sol.s
ar r ../../libcrypto.a md5_dgst.o md5_one.o asm/mx86-sol.o
/usr/ccs/bin/ranlib ../../libcrypto.a
making all in crypto/sha...
gcc -I.. -I../../include -DTHREADS -D_REENTRANT -O3 -fomit-frame-pointer -m486 -Wall 
-DL_ENDIAN -DSHA1_ASM -DMD5_ASM -DRMD160_ASM  -c  sha_dgst.c
*** Error code 1
*** Error code 1
*** Error code 1
