Great :)

Last question for now: Would you add the details of the error to error_info 
of the badarg error or should this be an entirely different error?

On Tuesday, April 12, 2022 at 4:11:45 PM UTC+8 José Valim wrote:

> I would personally send a PR with steps one and two, and also changing one 
> of the callsites for 3 so you have something to test. Then another PR to 
> migrate the remaining call sites!
>
> On Tue, Apr 12, 2022 at 9:38 AM Jonatan Männchen <jon...@maennchen.ch> 
> wrote:
>
>> Hi all,
>>
>> I dug through the code and now know how the parts fits together.
>>
>> Since it is a combination of Elixir & OTP, I would like to ask for your 
>> input before continuing:
>>
>> When using the following code, the key code paths are:
>>
>> *{:ok, io} = StringIO.open("", encoding: :unicode) # Starts a StringIO 
>> GenServer*
>> *IO.write(io, <<222>>)*
>>
>>
>> https://github.com/erlang/otp/blob/56240f084d80de80d5f1c7fef94db68a6f1b81cc/lib/stdlib/src/io.erl#L91
>> => :io.request(pid, {:put_chars, :unicode,<<222>>}, _ref)
>>
>>
>> https://github.com/elixir-lang/elixir/blob/a64d42f5d3cb6c32752af9d3312897e8cd5bb7ec/lib/elixir/lib/string_io.ex#L289
>> => {{:error, req}, state}
>>
>>
>> https://github.com/erlang/otp/blob/56240f084d80de80d5f1c7fef94db68a6f1b81cc/lib/stdlib/src/io.erl#L340
>> => badarg
>>
>>
>> https://github.com/erlang/otp/blob/56240f084d80de80d5f1c7fef94db68a6f1b81cc/lib/stdlib/src/io.erl#L99
>> => raises argument error without any additional info
>>
>> Based on this, I think multiple things are needed:
>>
>>
>>    1. `io:conv_reason/1` needs to accept a new tuple for the error
>>       1. I would return the error directly from 
>>       `unicode:characters_to_binary/3`: {:error, {:encoding, result}}
>>       2. where result: {error, binary(), RestData} | {incomplete, 
>>       binary(), binary()}
>>    2. `io:o_request/2` needs to handle that new error and raise a better 
>>    error
>>       1. new error kind?
>>       2. existing badarg with more detail data?
>>    3. All those sources need to produce the new error (does not produce 
>>    OTP incompatibility in Elixir sind everything unknown is converted to 
>>    badarg):
>>       1. `StringIO.put_chars/4`
>>       2. `file_io_server:put_chars/3`
>>       3. `group:io_request/5`
>>       4. `standard_error:wrap_characters_to_binary/3`
>>       5. `user_drv:io_command/1`
>>       6. `user:wrap_characters_to_binary/2`
>>       7. `ssh_cli:io_request/4`
>>    
>> Does that sound like I'm going into the right direction?
>>
>> If yes: Should that be one PR in OTP or several?
>>
>> Thanks for any input & Best,
>> Jonatan
>>
>>
>> On Thursday, April 7, 2022 at 2:14:39 PM UTC+8 José Valim wrote:
>>
>>> Make sure to also check out this doc then: 
>>> https://github.com/erlang/otp/blob/master/HOWTO/DEVELOPMENT.md Have fun!
>>>
>>> On Thu, Apr 7, 2022 at 8:12 AM Jonatan Männchen <jon...@maennchen.ch> 
>>> wrote:
>>>
>>>> I’ve never done a PR to OTP so far. I’ll check it out and ping back 
>>>> here for updates.
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On 7 Apr 2022, at 07:09, José Valim <jose....@dashbit.co> wrote:
>>>>
>>>> 
>>>> I would start with 2 by submitting a PR to Erlang/OTP and then access 
>>>> their appetite from there. :)
>>>>
>>>> On Thu, Apr 7, 2022 at 8:03 AM Jonatan Männchen <jon...@maennchen.ch> 
>>>> wrote:
>>>>
>>>>> I guess leaving nr. 3 out would be fine.
>>>>>
>>>>> What would he the proper path to do this in erlang? Just writing some 
>>>>> erlang example code and opening issues?
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>> On 7 Apr 2022, at 06:46, José Valim <jose....@dashbit.co> wrote:
>>>>>
>>>>> 
>>>>> All three of them would require changes to Erlang and I would love for 
>>>>> someone to chase this path. 2 sounds doable with EEP 54 but 1 would 
>>>>> definitely require more discussion.
>>>>>
>>>>> I am not sure about 3. binwrite pretty much means "I know what I am 
>>>>> doing" and we should allow the user to write whatever they want as is.
>>>>>
>>>>> On Thu, Apr 7, 2022 at 7:25 AM Jonatan Männchen <jon...@maennchen.ch> 
>>>>> wrote:
>>>>>
>>>>>> Thanks, that solved my specific issue. I still think that some 
>>>>>> improvements are needed.
>>>>>>
>>>>>> Looking at the code for CaptureIO, I think those changes should be 
>>>>>> made directly in the StringIO / IO modules and not specifically for 
>>>>>> CaptureIO.
>>>>>>
>>>>>> The following things are not great today IMHO:
>>>>>>
>>>>>>    1. The encoding that lets everything through is called latin1. I 
>>>>>>    think we should introduce & properly document a new encoding called 
>>>>>>    something like raw_binary. It would work exactly the same though.
>>>>>>    2. IO.write with invalid characters into an io device with any 
>>>>>>    encoding should have a better error message. (Something like "<<222>> 
>>>>>> is 
>>>>>>    not a valid unicode string. Provide `encoding: :raw_binary` when 
>>>>>> opening 
>>>>>>    the io device")
>>>>>>    3. IO.binwrite with invalid characters into an encoding: :unicode 
>>>>>>    io device should probably at least warn if invalid characters are 
>>>>>> passed.
>>>>>>
>>>>>> This would something like this in the form of a test: 
>>>>>> https://gist.github.com/maennchen/f428360a71d23a323538d9b7d51e638b
>>>>>>
>>>>>> On Wednesday, April 6, 2022 at 9:29:12 PM UTC+2 Wojtek Mach wrote:
>>>>>>
>>>>>>> > Additionally, it would be good if there was a proper error for 
>>>>>>> invalid characters instead of the currently raised ArgumentError.
>>>>>>>
>>>>>>> Yeah the error is pretty bad:
>>>>>>>
>>>>>>> IO.puts(<<222>>)
>>>>>>> ** (ArgumentError) errors were found at the given arguments: unknown 
>>>>>>> error: put_chars
>>>>>>>
>>>>>>>
>>>>>>> It is slightly more informative when used inside capture io though:
>>>>>>>
>>>>>>>     assert capture_io(fn ->
>>>>>>>              IO.puts(<<222>>)
>>>>>>>            end) == <<222>>
>>>>>>> ** (ArgumentError) errors were found at the given arguments: unknown 
>>>>>>> error: {put_chars,unicode,[<<"Þ">>,10]}
>>>>>>>
>>>>>>>
>>>>>>> We get error because stdio is with unicode encoding:
>>>>>>>
>>>>>>> iex> :io.getopts(:standard_io)[:encoding]
>>>>>>> :unicode
>>>>>>>
>>>>>>>
>>>>>>> and we're writing <<222>> which isn't unicode.
>>>>>>>
>>>>>>> For writing in raw mode, use IO.binwrite. This and the previously 
>>>>>>> mentioned :encoding option will make the following test succeed:
>>>>>>>
>>>>>>>     assert capture_io([encoding: :latin1], fn ->
>>>>>>>              IO.binwrite(<<222>>)
>>>>>>>            end) == <<222>>
>>>>>>>
>>>>>>>
>>>>>>> On April 6, 2022, "maennchen.ch" <jon...@maennchen.ch> wrote:
>>>>>>>
>>>>>>> That unfortunately gives me the same result.
>>>>>>>
>>>>>>> On Wednesday, April 6, 2022 at 6:57:35 PM UTC+1 Wojtek Mach wrote:
>>>>>>>>
>>>>>>>> I believe capture_io(encoding: :latin1, fun) should do the trick, 
>>>>>>>> can you check?
>>>>>>>>
>>>>>>>> On April 6, 2022, "maennchen.ch" <jon...@maennchen.ch> wrote:
>>>>>>>>
>>>>>>>> Hi everyone,
>>>>>>>>
>>>>>>>> *Background*
>>>>>>>>
>>>>>>>> While developing tests for a mix task, that returns non UTF8 
>>>>>>>> binaries into STDOUT (building block to be piped into a file / pipe), 
>>>>>>>> I 
>>>>>>>> found that ExUnit.CaptureIO can only handle UTF8 and Latin1.
>>>>>>>>
>>>>>>>> Example Test that does not work:
>>>>>>>> https://gist.github.com/maennchen/16d411eeda3255fa3d3152fe9d836a82
>>>>>>>>
>>>>>>>> *Proposal*
>>>>>>>>
>>>>>>>> For testing this use case, it would be good if any raw binary would 
>>>>>>>> also be passed through. (Maybe via option "encoding: :raw_binary")
>>>>>>>>
>>>>>>>> Additionally, it would be good if there was a proper error for 
>>>>>>>> invalid characters instead of the currently raised ArgumentError.
>>>>>>>>
>>>>>>>> *Real World Example*
>>>>>>>>
>>>>>>>> Here is a real test, that would be made possible by this change: 
>>>>>>>> https://github.com/elixir-gettext/expo/blob/9048fe242830614f6d4235cbd345de844693f28a/test/mix/tasks/expo.msgfmt_test.exs#L18
>>>>>>>>
>>>>>>>> *PR*
>>>>>>>>
>>>>>>>> I'm happy to provide a PR for this as well.
>>>>>>>>
>>>>>>>> Thanks & Kind Regards,
>>>>>>>> Jonatan
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "elixir-lang-core" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>> send an email to elixir-lang-co...@googlegroups.com.
>>>>>>>> To view this discussion on the web visit 
>>>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/e35e97cf-00d6-422e-b3c1-ec508ff1e36fn%40googlegroups.com
>>>>>>>>  
>>>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/e35e97cf-00d6-422e-b3c1-ec508ff1e36fn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>>
>>>>>>>>
>>>>>>> -- 
>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>> Groups "elixir-lang-core" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>> send an email to elixir-lang-co...@googlegroups.com.
>>>>>>>
>>>>>>> To view this discussion on the web visit 
>>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/9e817fc9-6f61-4476-bb27-c062ed6167fan%40googlegroups.com
>>>>>>>  
>>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/9e817fc9-6f61-4476-bb27-c062ed6167fan%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>> .
>>>>>>>
>>>>>>> -- 
>>>>>> You received this message because you are subscribed to the Google 
>>>>>> Groups "elixir-lang-core" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>> send an email to elixir-lang-co...@googlegroups.com.
>>>>>> To view this discussion on the web visit 
>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/f2816480-4d91-4576-9241-3bdc9351a920n%40googlegroups.com
>>>>>>  
>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/f2816480-4d91-4576-9241-3bdc9351a920n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>> -- 
>>>>> You received this message because you are subscribed to a topic in the 
>>>>> Google Groups "elixir-lang-core" group.
>>>>> To unsubscribe from this topic, visit 
>>>>> https://groups.google.com/d/topic/elixir-lang-core/RR7nbeHsluQ/unsubscribe
>>>>> .
>>>>> To unsubscribe from this group and all its topics, send an email to 
>>>>> elixir-lang-co...@googlegroups.com.
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4KdzYkX%3DL%3D3%3DYHSJk4R8iq1%3Dbe9262Fg54eX1yLrx-c9g%40mail.gmail.com
>>>>>  
>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4KdzYkX%3DL%3D3%3DYHSJk4R8iq1%3Dbe9262Fg54eX1yLrx-c9g%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "elixir-lang-core" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to elixir-lang-co...@googlegroups.com.
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/elixir-lang-core/F3C0BE5C-EC1A-4574-8586-7FC5E7AFD39A%40maennchen.ch
>>>>>  
>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/F3C0BE5C-EC1A-4574-8586-7FC5E7AFD39A%40maennchen.ch?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
>>>> -- 
>>>> You received this message because you are subscribed to a topic in the 
>>>> Google Groups "elixir-lang-core" group.
>>>> To unsubscribe from this topic, visit 
>>>> https://groups.google.com/d/topic/elixir-lang-core/RR7nbeHsluQ/unsubscribe
>>>> .
>>>> To unsubscribe from this group and all its topics, send an email to 
>>>> elixir-lang-co...@googlegroups.com.
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4%2BSqS4j-VRG_aBfSfs3uc3c0Q_m3svGTKS%2Bw2OR5aO_rg%40mail.gmail.com
>>>>  
>>>> <https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4%2BSqS4j-VRG_aBfSfs3uc3c0Q_m3svGTKS%2Bw2OR5aO_rg%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "elixir-lang-core" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to elixir-lang-co...@googlegroups.com.
>>>>
>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/elixir-lang-core/08729069-5CF7-4400-BAC3-A93E1DA43B1A%40maennchen.ch
>>>>  
>>>> <https://groups.google.com/d/msgid/elixir-lang-core/08729069-5CF7-4400-BAC3-A93E1DA43B1A%40maennchen.ch?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elixir-lang-core" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elixir-lang-co...@googlegroups.com.
>>
> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elixir-lang-core/d5886ad9-002e-48fd-948a-067a96770830n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/elixir-lang-core/d5886ad9-002e-48fd-948a-067a96770830n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elixir-lang-core+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elixir-lang-core/dec562c1-0f0c-4f9d-90f9-7865017a3b1en%40googlegroups.com.

Reply via email to