Great :) Last question for now: Would you add the details of the error to error_info of the badarg error or should this be an entirely different error?
On Tuesday, April 12, 2022 at 4:11:45 PM UTC+8 José Valim wrote: > I would personally send a PR with steps one and two, and also changing one > of the callsites for 3 so you have something to test. Then another PR to > migrate the remaining call sites! > > On Tue, Apr 12, 2022 at 9:38 AM Jonatan Männchen <jon...@maennchen.ch> > wrote: > >> Hi all, >> >> I dug through the code and now know how the parts fits together. >> >> Since it is a combination of Elixir & OTP, I would like to ask for your >> input before continuing: >> >> When using the following code, the key code paths are: >> >> *{:ok, io} = StringIO.open("", encoding: :unicode) # Starts a StringIO >> GenServer* >> *IO.write(io, <<222>>)* >> >> >> https://github.com/erlang/otp/blob/56240f084d80de80d5f1c7fef94db68a6f1b81cc/lib/stdlib/src/io.erl#L91 >> => :io.request(pid, {:put_chars, :unicode,<<222>>}, _ref) >> >> >> https://github.com/elixir-lang/elixir/blob/a64d42f5d3cb6c32752af9d3312897e8cd5bb7ec/lib/elixir/lib/string_io.ex#L289 >> => {{:error, req}, state} >> >> >> https://github.com/erlang/otp/blob/56240f084d80de80d5f1c7fef94db68a6f1b81cc/lib/stdlib/src/io.erl#L340 >> => badarg >> >> >> https://github.com/erlang/otp/blob/56240f084d80de80d5f1c7fef94db68a6f1b81cc/lib/stdlib/src/io.erl#L99 >> => raises argument error without any additional info >> >> Based on this, I think multiple things are needed: >> >> >> 1. `io:conv_reason/1` needs to accept a new tuple for the error >> 1. I would return the error directly from >> `unicode:characters_to_binary/3`: {:error, {:encoding, result}} >> 2. where result: {error, binary(), RestData} | {incomplete, >> binary(), binary()} >> 2. `io:o_request/2` needs to handle that new error and raise a better >> error >> 1. new error kind? >> 2. existing badarg with more detail data? >> 3. All those sources need to produce the new error (does not produce >> OTP incompatibility in Elixir sind everything unknown is converted to >> badarg): >> 1. `StringIO.put_chars/4` >> 2. `file_io_server:put_chars/3` >> 3. `group:io_request/5` >> 4. `standard_error:wrap_characters_to_binary/3` >> 5. `user_drv:io_command/1` >> 6. `user:wrap_characters_to_binary/2` >> 7. `ssh_cli:io_request/4` >> >> Does that sound like I'm going into the right direction? >> >> If yes: Should that be one PR in OTP or several? >> >> Thanks for any input & Best, >> Jonatan >> >> >> On Thursday, April 7, 2022 at 2:14:39 PM UTC+8 José Valim wrote: >> >>> Make sure to also check out this doc then: >>> https://github.com/erlang/otp/blob/master/HOWTO/DEVELOPMENT.md Have fun! >>> >>> On Thu, Apr 7, 2022 at 8:12 AM Jonatan Männchen <jon...@maennchen.ch> >>> wrote: >>> >>>> I’ve never done a PR to OTP so far. I’ll check it out and ping back >>>> here for updates. >>>> >>>> Sent from my iPhone >>>> >>>> On 7 Apr 2022, at 07:09, José Valim <jose....@dashbit.co> wrote: >>>> >>>> >>>> I would start with 2 by submitting a PR to Erlang/OTP and then access >>>> their appetite from there. :) >>>> >>>> On Thu, Apr 7, 2022 at 8:03 AM Jonatan Männchen <jon...@maennchen.ch> >>>> wrote: >>>> >>>>> I guess leaving nr. 3 out would be fine. >>>>> >>>>> What would he the proper path to do this in erlang? Just writing some >>>>> erlang example code and opening issues? >>>>> >>>>> Sent from my iPhone >>>>> >>>>> On 7 Apr 2022, at 06:46, José Valim <jose....@dashbit.co> wrote: >>>>> >>>>> >>>>> All three of them would require changes to Erlang and I would love for >>>>> someone to chase this path. 2 sounds doable with EEP 54 but 1 would >>>>> definitely require more discussion. >>>>> >>>>> I am not sure about 3. binwrite pretty much means "I know what I am >>>>> doing" and we should allow the user to write whatever they want as is. >>>>> >>>>> On Thu, Apr 7, 2022 at 7:25 AM Jonatan Männchen <jon...@maennchen.ch> >>>>> wrote: >>>>> >>>>>> Thanks, that solved my specific issue. I still think that some >>>>>> improvements are needed. >>>>>> >>>>>> Looking at the code for CaptureIO, I think those changes should be >>>>>> made directly in the StringIO / IO modules and not specifically for >>>>>> CaptureIO. >>>>>> >>>>>> The following things are not great today IMHO: >>>>>> >>>>>> 1. The encoding that lets everything through is called latin1. I >>>>>> think we should introduce & properly document a new encoding called >>>>>> something like raw_binary. It would work exactly the same though. >>>>>> 2. IO.write with invalid characters into an io device with any >>>>>> encoding should have a better error message. (Something like "<<222>> >>>>>> is >>>>>> not a valid unicode string. Provide `encoding: :raw_binary` when >>>>>> opening >>>>>> the io device") >>>>>> 3. IO.binwrite with invalid characters into an encoding: :unicode >>>>>> io device should probably at least warn if invalid characters are >>>>>> passed. >>>>>> >>>>>> This would something like this in the form of a test: >>>>>> https://gist.github.com/maennchen/f428360a71d23a323538d9b7d51e638b >>>>>> >>>>>> On Wednesday, April 6, 2022 at 9:29:12 PM UTC+2 Wojtek Mach wrote: >>>>>> >>>>>>> > Additionally, it would be good if there was a proper error for >>>>>>> invalid characters instead of the currently raised ArgumentError. >>>>>>> >>>>>>> Yeah the error is pretty bad: >>>>>>> >>>>>>> IO.puts(<<222>>) >>>>>>> ** (ArgumentError) errors were found at the given arguments: unknown >>>>>>> error: put_chars >>>>>>> >>>>>>> >>>>>>> It is slightly more informative when used inside capture io though: >>>>>>> >>>>>>> assert capture_io(fn -> >>>>>>> IO.puts(<<222>>) >>>>>>> end) == <<222>> >>>>>>> ** (ArgumentError) errors were found at the given arguments: unknown >>>>>>> error: {put_chars,unicode,[<<"Þ">>,10]} >>>>>>> >>>>>>> >>>>>>> We get error because stdio is with unicode encoding: >>>>>>> >>>>>>> iex> :io.getopts(:standard_io)[:encoding] >>>>>>> :unicode >>>>>>> >>>>>>> >>>>>>> and we're writing <<222>> which isn't unicode. >>>>>>> >>>>>>> For writing in raw mode, use IO.binwrite. This and the previously >>>>>>> mentioned :encoding option will make the following test succeed: >>>>>>> >>>>>>> assert capture_io([encoding: :latin1], fn -> >>>>>>> IO.binwrite(<<222>>) >>>>>>> end) == <<222>> >>>>>>> >>>>>>> >>>>>>> On April 6, 2022, "maennchen.ch" <jon...@maennchen.ch> wrote: >>>>>>> >>>>>>> That unfortunately gives me the same result. >>>>>>> >>>>>>> On Wednesday, April 6, 2022 at 6:57:35 PM UTC+1 Wojtek Mach wrote: >>>>>>>> >>>>>>>> I believe capture_io(encoding: :latin1, fun) should do the trick, >>>>>>>> can you check? >>>>>>>> >>>>>>>> On April 6, 2022, "maennchen.ch" <jon...@maennchen.ch> wrote: >>>>>>>> >>>>>>>> Hi everyone, >>>>>>>> >>>>>>>> *Background* >>>>>>>> >>>>>>>> While developing tests for a mix task, that returns non UTF8 >>>>>>>> binaries into STDOUT (building block to be piped into a file / pipe), >>>>>>>> I >>>>>>>> found that ExUnit.CaptureIO can only handle UTF8 and Latin1. >>>>>>>> >>>>>>>> Example Test that does not work: >>>>>>>> https://gist.github.com/maennchen/16d411eeda3255fa3d3152fe9d836a82 >>>>>>>> >>>>>>>> *Proposal* >>>>>>>> >>>>>>>> For testing this use case, it would be good if any raw binary would >>>>>>>> also be passed through. (Maybe via option "encoding: :raw_binary") >>>>>>>> >>>>>>>> Additionally, it would be good if there was a proper error for >>>>>>>> invalid characters instead of the currently raised ArgumentError. >>>>>>>> >>>>>>>> *Real World Example* >>>>>>>> >>>>>>>> Here is a real test, that would be made possible by this change: >>>>>>>> https://github.com/elixir-gettext/expo/blob/9048fe242830614f6d4235cbd345de844693f28a/test/mix/tasks/expo.msgfmt_test.exs#L18 >>>>>>>> >>>>>>>> *PR* >>>>>>>> >>>>>>>> I'm happy to provide a PR for this as well. >>>>>>>> >>>>>>>> Thanks & Kind Regards, >>>>>>>> Jonatan >>>>>>>> >>>>>>>> -- >>>>>>>> You received this message because you are subscribed to the Google >>>>>>>> Groups "elixir-lang-core" group. >>>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>>> send an email to elixir-lang-co...@googlegroups.com. >>>>>>>> To view this discussion on the web visit >>>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/e35e97cf-00d6-422e-b3c1-ec508ff1e36fn%40googlegroups.com >>>>>>>> >>>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/e35e97cf-00d6-422e-b3c1-ec508ff1e36fn%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>>>> . >>>>>>>> >>>>>>>> >>>>>>> -- >>>>>>> You received this message because you are subscribed to the Google >>>>>>> Groups "elixir-lang-core" group. >>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>> send an email to elixir-lang-co...@googlegroups.com. >>>>>>> >>>>>>> To view this discussion on the web visit >>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/9e817fc9-6f61-4476-bb27-c062ed6167fan%40googlegroups.com >>>>>>> >>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/9e817fc9-6f61-4476-bb27-c062ed6167fan%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>>> . >>>>>>> >>>>>>> -- >>>>>> You received this message because you are subscribed to the Google >>>>>> Groups "elixir-lang-core" group. >>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>> send an email to elixir-lang-co...@googlegroups.com. >>>>>> To view this discussion on the web visit >>>>>> https://groups.google.com/d/msgid/elixir-lang-core/f2816480-4d91-4576-9241-3bdc9351a920n%40googlegroups.com >>>>>> >>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/f2816480-4d91-4576-9241-3bdc9351a920n%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>> . >>>>>> >>>>> -- >>>>> You received this message because you are subscribed to a topic in the >>>>> Google Groups "elixir-lang-core" group. >>>>> To unsubscribe from this topic, visit >>>>> https://groups.google.com/d/topic/elixir-lang-core/RR7nbeHsluQ/unsubscribe >>>>> . >>>>> To unsubscribe from this group and all its topics, send an email to >>>>> elixir-lang-co...@googlegroups.com. >>>>> To view this discussion on the web visit >>>>> https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4KdzYkX%3DL%3D3%3DYHSJk4R8iq1%3Dbe9262Fg54eX1yLrx-c9g%40mail.gmail.com >>>>> >>>>> <https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4KdzYkX%3DL%3D3%3DYHSJk4R8iq1%3Dbe9262Fg54eX1yLrx-c9g%40mail.gmail.com?utm_medium=email&utm_source=footer> >>>>> . >>>>> >>>>> -- >>>>> You received this message because you are subscribed to the Google >>>>> Groups "elixir-lang-core" group. >>>>> To unsubscribe from this group and stop receiving emails from it, send >>>>> an email to elixir-lang-co...@googlegroups.com. >>>>> To view this discussion on the web visit >>>>> https://groups.google.com/d/msgid/elixir-lang-core/F3C0BE5C-EC1A-4574-8586-7FC5E7AFD39A%40maennchen.ch >>>>> >>>>> <https://groups.google.com/d/msgid/elixir-lang-core/F3C0BE5C-EC1A-4574-8586-7FC5E7AFD39A%40maennchen.ch?utm_medium=email&utm_source=footer> >>>>> . >>>>> >>>> -- >>>> You received this message because you are subscribed to a topic in the >>>> Google Groups "elixir-lang-core" group. >>>> To unsubscribe from this topic, visit >>>> https://groups.google.com/d/topic/elixir-lang-core/RR7nbeHsluQ/unsubscribe >>>> . >>>> To unsubscribe from this group and all its topics, send an email to >>>> elixir-lang-co...@googlegroups.com. >>>> To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4%2BSqS4j-VRG_aBfSfs3uc3c0Q_m3svGTKS%2Bw2OR5aO_rg%40mail.gmail.com >>>> >>>> <https://groups.google.com/d/msgid/elixir-lang-core/CAGnRm4%2BSqS4j-VRG_aBfSfs3uc3c0Q_m3svGTKS%2Bw2OR5aO_rg%40mail.gmail.com?utm_medium=email&utm_source=footer> >>>> . >>>> >>>> -- >>>> You received this message because you are subscribed to the Google >>>> Groups "elixir-lang-core" group. >>>> To unsubscribe from this group and stop receiving emails from it, send >>>> an email to elixir-lang-co...@googlegroups.com. >>>> >>> To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/elixir-lang-core/08729069-5CF7-4400-BAC3-A93E1DA43B1A%40maennchen.ch >>>> >>>> <https://groups.google.com/d/msgid/elixir-lang-core/08729069-5CF7-4400-BAC3-A93E1DA43B1A%40maennchen.ch?utm_medium=email&utm_source=footer> >>>> . >>>> >>> -- >> You received this message because you are subscribed to the Google Groups >> "elixir-lang-core" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to elixir-lang-co...@googlegroups.com. >> > To view this discussion on the web visit >> https://groups.google.com/d/msgid/elixir-lang-core/d5886ad9-002e-48fd-948a-067a96770830n%40googlegroups.com >> >> <https://groups.google.com/d/msgid/elixir-lang-core/d5886ad9-002e-48fd-948a-067a96770830n%40googlegroups.com?utm_medium=email&utm_source=footer> >> . >> > -- You received this message because you are subscribed to the Google Groups "elixir-lang-core" group. To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/dec562c1-0f0c-4f9d-90f9-7865017a3b1en%40googlegroups.com.