Hi all,

Thanks Ben for reading.

For those wanting a follow up, I've proposed this pull request:
https://github.com/pharo-project/pharo/pull/1980.
I'm still working on avoiding dependencies on UFFI and on fixing one other
test.
This is almost finished, however, and since I had to adapt the original
*abstract proposal* to fit the real system, here is an updated version:

API Proposal for OSEnvironment and friends
=========================

OSEnvironment is the common denominator for all platforms. Every platform
implementation should understand at least the following messages, with the
following semantics:

   - *at: aVariableName [ifAbsent:/ifAbsentPut:/ifPresent:ifAbsent:]*

Gets the String value of an environment variable called `aVariableName`.
It is the system's responsibility to manage the encoding of *both arguments
and return values*.

   - *at: aVariableName put: aValue*

Sets the environment variable called `aVariableName` to value `aValue`.
It is the system's responsibility to manage the encoding of *both arguments
and return values*.

   - *removeKey: aVariableName*

Removes the environment variable called `aVariableName`.
It is the system's responsibility to manage the encoding of *both arguments
and return values*.
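
As an illustration of the common API above, here is a minimal playground
sketch. It assumes the environment is reached through `OSEnvironment current`,
and `MY_APP_HOME` is a made-up variable name. The comments describe what the
semantics above imply, not verified output.

```smalltalk
"Sketch only: MY_APP_HOME is a hypothetical variable name."
| env |
env := OSEnvironment current.

"Set the variable; the system takes care of encoding name and value."
env at: 'MY_APP_HOME' put: '/tmp/my-app'.

"Read it back as a String, with a fallback block in case it is missing."
env at: 'MY_APP_HOME' ifAbsent: [ 'unset' ].

"Remove it; the same lookup should now fall through to the ifAbsent: block."
env removeKey: 'MY_APP_HOME'.
env at: 'MY_APP_HOME' ifAbsent: [ 'unset' ].
```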

API Extensions for *Nix Systems (OSX & Linux)
=========================

Since environment variables on *Nix systems are binary data that could be in
any encoding, the following methods give more flexibility: they access such
data in the encoding of the user's choice, or even in binary form.

   - *at: aVariableName encoding: anEncoding
   [ifAbsent:/ifAbsentPut:/ifPresent:ifAbsent:/put:]  /  removeKey: aVariableName
   encoding: anEncoding*

Variants of the common API from OSEnvironment.
The encoding used as argument will be used to encode/decode *both arguments
and return values*.

   - *rawAt: anEncodedVariableName encoding: anEncoding
   [ifAbsent:/ifAbsentPut:/ifPresent:ifAbsent:/put:]  /  removeRawKey:
   anEncodedVariableName*

Variants of the common API from OSEnvironment.
These methods assume arguments and return values are encoded/decoded by the
user, so the system performs no marshalling or decoding.
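
For illustration, a sketch of how the encoding-aware variants might be called.
It assumes a Linux or OSX image in which `OSEnvironment current` answers the
*Nix-specific environment, and that `anEncoding` is a Zinc encoder such as
`ZnCharacterEncoder utf8`; the proposal text does not confirm either point.
`MY_SHELL` is a made-up variable name.

```smalltalk
"Sketch only: assumes a *Nix image and that anEncoding is a Zinc encoder."
| env utf8 home shell |
env := OSEnvironment current.
utf8 := ZnCharacterEncoder utf8.

"Both the name 'HOME' and the answered value go through the given encoding."
home := env at: 'HOME' encoding: utf8.

"Same lookup with a fallback block, following the ifAbsent: variant listed above.
 MY_SHELL is hypothetical."
shell := env at: 'MY_SHELL' encoding: utf8 ifAbsent: [ '/bin/sh' ].
```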

Rationale
=========================

   - Encoding/decoding should be applied not only to values but to
   variable names too. In most cases ASCII overlaps with utf* and Latin*
   encodings, but this cannot simply be assumed.
   - Windows requires calling the right *Wide version of the functions from
   C, plus the correct encoding routine. This could be implemented as an FFI
   call or by modifying the VM to do it properly instead of calling the ASCII
   version.
   - Unix file systems and environment variables could mix strings in
   different encodings, hence the flexibility added by the low-level *Nix
   extensions.
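
To make the first point concrete, here is a small sketch showing that a
variable name is only encoding-independent while it stays within ASCII. It
assumes Zinc's `String>>encodeWith:` is available in the image and accepts the
encoding names 'utf-8' and 'iso-8859-1'. The accented name is invented for the
example.

```smalltalk
"Sketch only: relies on Zinc's String>>encodeWith: being present in the image."
| asciiName accentedName |
asciiName := 'PATH'.
accentedName := 'CAF' copyWith: (Character value: 201).  "CAFÉ: É is code point 201"

"A pure ASCII name encodes to the same bytes in both encodings."
(asciiName encodeWith: 'utf-8') = (asciiName encodeWith: 'iso-8859-1').

"A non-ASCII name does not: É takes two bytes in UTF-8 and one in Latin-1,
 so the two byte sequences have sizes 5 and 4 respectively."
(accentedName encodeWith: 'utf-8') size.
(accentedName encodeWith: 'iso-8859-1') size.
```

A lookup using the wrong encoding for such a name would therefore ask the
operating system for a different byte sequence and miss the variable.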

Other Implementation Details
=========================

   - VM primitives returning path Strings should be carefully managed to
   decode them, since they are actually C strings (so byte arrays) disguised
   as ByteStrings.
   - Similar changes had to be applied to correctly obtain the current
   working directory in case it is a wide string.


On Mon, Nov 12, 2018 at 1:31 PM Ben Coman <[email protected]> wrote:

>
>
> On Mon, 12 Nov 2018 at 18:02, Guillermo Polito <[email protected]>
> wrote:
>
>> Hi all,
>>
>> following the meeting we had here @Inria headquarters, I'll be
>> backporting some of the improvements we did in the launcher this last month
>> regarding the encoding of environment variables.
>>
>> I've opened for this issue https://pharo.fogbugz.com/f/cases/22658/
>>
>> We have already studied possible alternatives with Pablo and Christophe
>> and we have some conclusions and we propose some changes:
>>
>> API Proposal for OSEnvironment
>> =========================
>>
>>
>>    - *at: aVariableName*
>>
>> Gets the String value of an environment variable called `aVariableName`.
>> It is the system's responsibility to manage the encoding.
>> Rationale: A common denominator for all platforms providing an already
>> decoded string, because Windows does not (compared to *nix systems) provide
>> an encoded byte representation of the value. Windows has instead its own
>> wide string representation.
>>
>>    - *[optionally] rawAt: anEncodedVariableName*
>>
>> Gets the Byte value of an environment variable called
>> `anEncodedVariableName`.
>> It is the user's responsibility to encode and decode argument and return
>> values in the encoding of their preference.
>> Rationale: Some systems may want to have the liberty to use different
>> encodings, or even to put binary data in the variables.
>>
>>    - *[optionally] at: aVariableName encoding: anEncoding*
>>
>> Gets the value of an environment variable called `aVariableName` using
>> `anEncoding` to encode/decode arguments and return values.
>> Rationale: *xes could potentially use different encodings for their
>> environment variables or even use different encodings in different parts of
>> their file system.
>>
>> Other Implementation details
>> =========================
>>
>>    - VM primitives returning path Strings should be carefully managed to
>>    decode them, since they are actually C strings (so byte arrays) disguised
>>    as ByteStrings.
>>    - Windows requires calling the right *Wide version of the functions
>>    from C, plus the correct encoding routine. This could be implemented as an
>>    FFI call or by modifying the VM to do it properly instead of calling the
>>    Ascii version
>>
>>
> I haven't been using environment variables a lot so I don't have a strong
> technical opinion (although at a glance it makes reasonable sense).
> But I wanted to say I really like the way you've presented your offline
> discussion, conclusions and proposal.  Thanks.
>
> cheers -ben
>
>

-- 



Guille Polito

Research Engineer

Centre de Recherche en Informatique, Signal et Automatique de Lille

CRIStAL - UMR 9189

French National Center for Scientific Research - *http://www.cnrs.fr
<http://www.cnrs.fr>*


*Web:* *http://guillep.github.io* <http://guillep.github.io>

*Phone: *+33 06 52 70 66 13
