[exim-dev] build-farm / macOS

2018-08-17 Thread Phil Pennock via Exim-dev
For awareness,

I've applied on behalf of Exim to
 to get a free VM to be used as
a build animal.

If we're approved, we'll get rote paperwork every six months to confirm
that we're actually still using it.

I'd like to get macOS/Darwin builds back on the officially supported
list.

Regards,
-Phil

-- 
## List details at https://lists.exim.org/mailman/listinfo/exim-dev Exim 
details at http://www.exim.org/ ##


Re: [exim-dev] UTF-8 and Exim string operations

2018-08-17 Thread Phil Pennock via Exim-dev
On 2018-08-17 at 10:36 -, Jasen Betts via Exim-dev wrote:
> > and add ulength_1 for being UTF-8 aware?
>
> Would also need utf8-aware substr and strlen.

Yes, I was using length as an exemplar, not as an exhaustive list.  :)

I favored ulength too, but didn't want to just add a slew of new
expansion operators, items and conditions without at least mentioning it
somewhere first.

> is it going to count code-points or glyphs?

Code-points.  Exim has no business knowing about how a layout engine
might or might not choose to render code-points to glyphs.  I could see
a possibility for normalization handling as another function, for
correct SASLprep for authentication.
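
To make the code-point versus glyph distinction concrete, here is a minimal
sketch (not Exim code; the helper name count_code_points is hypothetical). It
assumes well-formed UTF-8 input and counts every octet that is not a
continuation octet:

```python
def count_code_points(data: bytes) -> int:
    """Count code points in well-formed UTF-8 input."""
    # Continuation octets look like 0b10xxxxxx; any other octet starts a code point.
    return sum(1 for b in data if (b & 0xC0) != 0x80)


# "e" followed by U+0301 COMBINING ACUTE ACCENT: typically drawn as one glyph.
sample = "e\u0301".encode("utf-8")
print(len(sample))                # 3 octets
print(count_code_points(sample))  # 2 code points
```

A layout engine would usually draw that sample as a single glyph, which is
the part this sketch deliberately knows nothing about.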

I'd really rather not, though.  Exim is setuid root and the main system
for handling such things, ICU, does lots of tricky sensitive stuff with
a history of security problems.

> > Look at the top-bit being set and assume UTF-8, or
> > will that break too much with all the places which are still ISO-8859-1?
> 
> Just looking at that bit won't tell you enough to count code-points or
> glyphs.

I know, this was a suggestion for determining if the string should be
treated as UTF-8 for changing the current expansion o/i/c features; it
sucks but it was the only viable alternative I could think of and I
wanted to at least present an _idea_ of something else, for inciting
feedback.
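
As a rough illustration of the top-bit idea and why it is weak, the sketch
below (hypothetical helper name has_top_bit, not Exim code) shows that the
same word encoded as ISO-8859-1 and as UTF-8 both trip the check, so the bit
alone cannot tell the two encodings apart:

```python
def has_top_bit(data: bytes) -> bool:
    """Return True if any octet has its most significant bit set."""
    return any(b & 0x80 for b in data)


latin1 = "caf\u00e9".encode("iso-8859-1")  # b'caf\xe9'
utf8 = "caf\u00e9".encode("utf-8")         # b'caf\xc3\xa9'

print(has_top_bit(b"cafe"))  # False: plain ASCII
print(has_top_bit(latin1))   # True
print(has_top_bit(utf8))     # True
```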

I know a fair bit about UTF-8 internals and how to work with the various
aspects in multiple programming languages. :)

> parts of ${utf8clean can probably be re-used.

Yes, I thought of that, when pondering a new `utf8valid` expansion
condition.
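
A sketch of what such a check might involve, under assumptions: group_octets
and utf8valid below are hypothetical Python helpers, not the ${utf8clean code
and not a proposed implementation. Octets are grouped by lead octet, each
group is handed to Python's strict decoder, and anything that fails is
reported as a single invalid octet so the caller can choose a policy:

```python
def group_octets(data: bytes):
    """Yield (chunk, ok) pairs; ok is False for each octet that is not valid UTF-8."""
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            size = 1
        elif 0xC2 <= b <= 0xDF:
            size = 2
        elif 0xE0 <= b <= 0xEF:
            size = 3
        elif 0xF0 <= b <= 0xF4:
            size = 4
        else:
            size = 0  # stray continuation octet, or an octet never valid in UTF-8
        chunk = data[i:i + size]
        try:
            # The strict decoder rejects truncated, overlong and surrogate forms.
            ok = size > 0 and len(chunk.decode("utf-8")) == 1
        except UnicodeDecodeError:
            ok = False
        if ok:
            yield chunk, True
            i += size
        else:
            yield data[i:i + 1], False
            i += 1


def utf8valid(data: bytes) -> bool:
    """Hypothetical validity check: every group must be a valid sequence."""
    return all(ok for _, ok in group_octets(data))


print(utf8valid("caf\u00e9".encode("utf-8")))   # True
print(utf8valid(b"caf\xe9"))                    # False: lone ISO-8859-1 octet
```

Whether an invalid octet should be rejected, replaced or counted as one unit
is a policy choice that this sketch leaves to the caller.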

> "${lc" "${uc" and "${if eqi" need consideration too

Only if we go the ICU route and include normalization forms.  Which ...
is more bloat than I'm happy with in Exim's current architecture.

-Phil



Re: [exim-dev] UTF-8 and Exim string operations

2018-08-17 Thread Jeremy Harris via Exim-dev
On 08/17/2018 05:03 AM, Phil Pennock via Exim-dev wrote:
> Anyone have strong feelings on how Exim should handle UTF-8 with
> operators such as ${length_1:STR} ?
> 
> Document that the current operators work on bytes

This.

Add new operators, or options on current ones; don't
change how they currently work (barring bugs).
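
To illustrate why the two behaviours want separate operators, here is a
minimal Python sketch (not Exim code). length_bytes mimics the octet-based
truncation the thread describes for ${length_1:STR}; ulength is a
hypothetical code-point-aware counterpart that assumes well-formed UTF-8:

```python
def length_bytes(n: int, data: bytes) -> bytes:
    """Keep the first n octets, whatever they encode."""
    return data[:n]


def ulength(n: int, data: bytes) -> bytes:
    """Hypothetical: keep the first n code points of well-formed UTF-8."""
    seen = 0
    for i, b in enumerate(data):
        if (b & 0xC0) != 0x80:  # this octet starts a new code point
            if seen == n:
                return data[:i]
            seen += 1
    return data


word = "\u00e9\u00e9n".encode("utf-8")  # b'\xc3\xa9\xc3\xa9n'
print(length_bytes(1, word))  # b'\xc3' (a lone lead octet, not valid UTF-8)
print(ulength(1, word))       # b'\xc3\xa9'
```

Existing configurations that rely on octet counts keep working with the
first form; only callers that opt in to the second see code-point semantics.
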
-- 
Cheers,
  Jeremy



Re: [exim-dev] UTF-8 and Exim string operations

2018-08-17 Thread Jasen Betts via Exim-dev
On 2018-08-17, Phil Pennock via Exim-dev wrote:
> Anyone have strong feelings on how Exim should handle UTF-8 with
> operators such as ${length_1:STR} ?
>
> Document that the current operators work on bytes

Yeah, stay with treating strings as NUL-terminated arrays of octets,
the same unit the RFCs use to define email and SMTP.

> and add ulength_1 for being UTF-8 aware?

Would also need utf8-aware substr and strlen.
Is it going to count code-points or glyphs?

> Look at the top-bit being set and assume UTF-8, or
> will that break too much with all the places which are still ISO-8859-1?

Just looking at that bit won't tell you enough to count code-points or
glyphs. You then need to group the octets together, and you need to do
something when you hit a non-valid octet.
Parts of ${utf8clean can probably be re-used.

"${lc", "${uc" and "${if eqi" need consideration too.

-- 
 ت
