Re: gawk 4.1.4: CR separate char for CRLF files
Vermessung AVT - Wolfgang Rieger writes:
> 5) You can always find a better way to do things, of course, I won't
> argue about that. Sometimes we thought about switching to Java or php
> or python or whatever. Maybe, we should. But we have a lot of running
> scripts, massive batch and parallel processing, and cmd.exe with
> minimum Cygwin (no X subsystem, no pile of tools, just a tiny
> installation) has worked great for many years - so why not use it?
> Just because it is not intended to use it that way?

Just because it is not intended to use it that way, yes, that is the
reason not to do it. Just because it works now doesn't mean that it will
continue to work, and you put yourself in jeopardy if you ever update
your software. Your use of cmd.exe instead of a Cygwin shell also puts
you at risk of not being able to execute your scripts. While Cygwin
doesn't intentionally prevent its binaries from executing outside of
Cygwin, problems with those binaries are only supported if they exist
within the Cygwin shell as well. So if an executable provides expected
results in bash but not in cmd, you lose.

--
cyg Simple

P.S.: You need to learn how to use a proper mail client and respond to
this list appropriately. I had to "edit as new" and hand edit the mail
just to get proper quoting.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Re: gawk 4.1.4: CR separate char for CRLF files
On 08/16/2017 07:09 AM, Vermessung AVT - Wolfgang Rieger wrote:
> Achim Gratz wrote:
> Vermessung AVT - Wolfgang Rieger writes:
>> Another solution which we have been using for many years now, though
>> it might not be feasible for you:
> Cygwin is, like it or not, a rolling distribution.

Your quoting is horrible; you repeated Achim's comments without adding
any '>' nesting,

> Regards,
> Achim.
> --
> +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

then used a '-- ' line, which sane mail clients treat as the end of the
email and start of the signature, and only then added your content.
Thunderbird, in particular, refuses to include signatures by default
when replying to a message; and hitting 'ctrl-a' and then reply, to at
least paste all of your text, loses the formatting of your message,
making it very difficult to reply to you, as shown here:

> SD adaptations for KORG EX-800 and Poly-800MkII V0.9:
> http://Synth.Stromeko.net/Downloads.html#KorgSDada Dear Achim, I fully
> agree to most of what you say. But: 1) As well as Cygwin is a rolling
> distrib my work is a "rolling work". And that is why I deal with it in

At any rate, in answer to your question:

> Anyway, thanks for the suggestion to contact the upstream developers. I was
> not aware of that. Can you give me a hint where to go?

awk --help | tail -n10

points you to the manual for how to report upstream bugs; if you don't
like info, the same data can be found here:

https://www.gnu.org/software/gawk/manual/html_node/Bugs.html#Bugs

(In general, ANY good program will include instructions for how to reach
upstream in its --help output. Of course, not all programs are at the
same level of goodness in this regard.)

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
Re: gawk 4.1.4: CR separate char for CRLF files
Achim Gratz wrote:
Vermessung AVT - Wolfgang Rieger writes:
> Another solution which we have been using for many years now, though
> it might not be feasible for you:

- snip

Jannick,

another idea I had thought of previously might help: awk can include
source code with the

@include "myfile.awk"

syntax. I was sometimes thinking of providing a general awk script that
deals with oddities of any kind and that could easily be changed, just
in myfile.awk, when necessary, e.g. due to updates. You could even think
of an optional environment variable to control which script to include.
It should be easy to add such an @include line to all gawk scripts
automatically. Did you think of something like that?

Kind regards,
Wolfgang
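A minimal sketch of the shared fix-up file idea, under some assumptions. As far as I know, @include expects a literal file name, so an environment variable cannot choose the file from inside the script. The sketch therefore passes the fix-up file as an extra -f option, since POSIX awk concatenates several -f program files in the order given. The file names crfix.awk and lastfield.awk, the variable AWK_FIXUPS and the function run_awk are all invented for illustration.

```shell
#!/bin/sh
# Hypothetical shared fix-up file: strips one trailing CR from every record
# before any rule of the main script sees it.
workdir=$(mktemp -d)
cat > "$workdir/crfix.awk" <<'EOF'
{ sub(/\r$/, "") }
EOF

# Hypothetical main script, left unchanged: prints the last field of each record.
cat > "$workdir/lastfield.awk" <<'EOF'
{ print $NF }
EOF

# AWK_FIXUPS (an invented name) selects the fix-up file; the default is the
# CR-stripping file created above. The first argument is the main script,
# any further arguments are passed on to awk as usual.
run_awk() {
    awk -f "${AWK_FIXUPS:-$workdir/crfix.awk}" -f "$@"
}

# Mixed CRLF and LF input is handled by the same unchanged script.
printf 'a b\r\nc d\n' | run_awk "$workdir/lastfield.awk"
```

The point of the design is that the main scripts stay untouched; only the one fix-up file has to change when the behaviour of the tools changes.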
Re: gawk 4.1.4: CR separate char for CRLF files
Vermessung AVT - Wolfgang Rieger writes:
> Another solution which we have been using for many years now, though
> it might not be feasible for you:

Cygwin is, like it or not, a rolling distribution.

> We very rarely update Cygwin. We have been using Cygwin for some 15+
> years now. We use tools like gawk (hundreds of scripts), head, tail,
> sort, etc. that we are using in shell scripts running under cmd.exe
> (no Unix shells involved). I soon realized that upgrades of Cygwin may
> cause troubles with existing scripts, so we only update if we really
> need to (e.g.: New functionality that would be important, 32 to 64 bit
> shift, eventually new Windows versions, bugs we needed to be fixed).

Hopefully the machine(s) running those scripts are isolated. In your
particular case you might be better off using MSys2 or GNUwin32 tools,
although you'd still need a better way to deal with updates. Also, audit
your scripts for non-portable constructs, since those are the parts that
are most likely to break. CMD scripting is a tough nut to crack if it's
of any complexity, and there are lots of things that are poorly or not
officially documented. I don't quite understand why you use POSIX tools,
but specifically shun POSIX scripting.

> I have followed the discussions about the CR/LF behaviour changes in
> the past attentively and decided not to update in near future, because
> that would lead to a massive problem with many hundreds of scripts -
> hoping that sometimes there will be a change in gawk again.

You'd better replace that hope with a feature request at gawk upstream.

> What is Unix-like or OS-like or Posix-like behaviour in that context?
> You could argue that gawk interprets line endings like the underlying
> OS does (i. e., gawk reads LF in Unix and CR/LF in Win), or it
> interprets line endings in a Unix-style no matter of the underlying OS
> used. That's a developer's decision in my opinion.

Cygwin uses LF line endings (yes, there are still text mounts, but you'd
be better off pretending they don't exist). When you're trying to use it
for CRLF files, you need to wrap those invocations to do an explicit
conversion.

https://cygwin.com/cygwin-ug-net/using-textbinary.html

> But since with pipes or output redirection gawk used to write no CRs
> even in previous versions, we already had the problem that gawk had to
> accept *both* inputs, LF with or without CR. That worked widely fine
> so far, since most Windows and other application SW we use accept both
> record formats, fortunately (we had issues with SW upgrades of other
> vendors no longer accepting pure LF, but that only concerned a very
> small number of scripts). With the new approach in Cygwin that seems
> to be broken, so we did not upgrade Cygwin since then (we currently
> use gawk 4.1.3).

Again, your attempt to freeze your system at some arbitrary point in
time is misguided. It'll never quite work out, and chances are that when
it breaks it will do so in ways that create more work and force you to
do it in emergency mode, which is never a good thing.

> Of course the reason for that really annoying CR/LF thing is the
> arrogance and ignorance of MS, which caused innumerable of useless
> developers' hours when I think of the endless discussions and changes
> in Cygwin; but MS is the one who defines the standards because of its
> very market power, so we have to deal with it, if we like or not.

You really can't blame them for CRLF; they weren't and aren't the only
ones using it, and it had been in use long before Microsoft entered the
scene.

> I'd definitely prefer to use Unix for its powerful tools, but most of
> the SW we use is simply not available for Unix, and MS does not
> provide gawk etc. So we have to deal with that CR/LF issue in a
> pragmatic rather than in a more, say, philosophical approach: We need
> to run our scripts with as little changes as possible. So that's why
> we upgrade Cygwin as seldom as possible. It is a "living system", yes,
> which is great on the one side - but can be annoying in everyday
> practice.

Again, you'd better figure out how to transform your input (and possibly
output) so it'll conform to the conventions of the tool(s) you use,
perhaps by providing a handful of wrapper scripts. Alternatively, only
use tools that adhere to the same set of conventions.

> In my opinion there should be at least an option for gawk to accept
> both LF and CR/LF line endings equally, preferably with a system
> variable so that there is no need to change the command line call of
> gawk at all. That's what I vote for.

Yes, but please cast that vote with the upstream developers. I reckon
it'd be a generally useful function, so there's no point in providing it
only on Cygwin.

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptations for KORG EX-800 and Poly-800MkII V0.9:
http://Synth.Stromeko.net/Downloads.html#KorgSDada
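The wrapper approach suggested above could look roughly like the following. This is a sketch rather than a tested recipe for any particular installation: the function name crlf_awk is invented, and it assumes the awk program reads standard input only. Conversion tools such as the d2u/u2d pair mentioned elsewhere in this thread would do the same job; plain awk is used here only to avoid depending on them.

```shell
#!/bin/sh
# crlf_awk (hypothetical name): run an awk program on CRLF or LF input from
# stdin and write CRLF output.
#   stage 1: strip a trailing CR from every input line
#   stage 2: the unmodified user program, which only ever sees LF records
#   stage 3: add the CR back for consumers that expect CRLF
crlf_awk() {
    awk '{ sub(/\r$/, ""); print }' |
        awk "$@" |
        awk '{ printf "%s\r\n", $0 }'
}

# Example: the comparison with "yes" only succeeds because the CR was
# removed before the user program ran.
printf 'x yes\r\ny no\r\n' | crlf_awk '$2 == "yes" { print $1 }'
```

Whether stage 3 is wanted depends on what consumes the output; dropping it gives LF output, which matches what the tools produce by default.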
RE: gawk 4.1.4: CR separate char for CRLF files
Hi Wolfgang,

First of all, many thanks for your interesting experience report and the
constructive remarks.

On Mon, 14 Aug 2017 10:36:23 +, Vermessung AVT - Wolfgang Rieger wrote:
> Another solution which we have been using for many years now, though it
> might not be feasible for you:

Yes, you are right, unfortunately: we make extensive use of gawk
extensions, which have to be upgraded in tandem with gawk. Thus we will
move forward with the ongoing gawk development.

> We very rarely update Cygwin. We have been using Cygwin for some 15+
> years now. We use tools like gawk (hundreds of scripts), head, tail, sort, etc.
> that we are using in shell scripts running under cmd.exe (no Unix shells
> involved). I soon realized that upgrades of Cygwin may cause troubles with
> existing scripts, so we only update if we really need to (e.g.: New
> functionality that would be important, 32 to 64 bit shift, eventually new
> Windows versions, bugs we needed to be fixed).
>
> I have followed the discussions about the CR/LF behaviour changes in the
> past attentively and decided not to update in near future, because that
> would lead to a massive problem with many hundreds of scripts - hoping
> that sometimes there will be a change in gawk again.

Agree - this is the same setting here. Furthermore, we run our heavy
processes on a semi-annual basis within a more than tight time frame. So
Cygwin's update came pretty much out of the blue at the last minute,
because we had not used gawk since the last reporting cycle. It was an
unpleasant surprise, with potentially heavy time issues had we not
decided how to deal with the changed situation. And as you are saying
below ...

> What is Unix-like or OS-like or Posix-like behaviour in that context? You could
> argue that gawk interprets line endings like the underlying OS does (i. e.,
> gawk reads LF in Unix and CR/LF in Win), or it interprets line endings in a
> Unix-style no matter of the underlying OS used. That's a developer's decision
> in my opinion.

True. And the developers of gawk opted - with a heavy heart, I believe -
to have gawk swallow CRs.

> But since with pipes or output redirection gawk used to write no CRs even in
> previous versions, we already had the problem that gawk had to accept
> *both* inputs, LF with or without CR. That worked widely fine so far, since
> most Windows and other application SW we use accept both record formats,
> fortunately (we had issues with SW upgrades of other vendors no longer
> accepting pure LF, but that only concerned a very small number of scripts).
> With the new approach in Cygwin that seems to be broken, so we did not
> upgrade Cygwin since then (we currently use gawk 4.1.3).

Yes, this is the basis of our SW selection process as well, but we keep
pace with gawk as it develops, so we need a gawk version that reads
files and pipes with either LF or CRLF line endings out of the box.

> Of course the reason for that really annoying CR/LF thing is the arrogance
> and ignorance of MS, which caused innumerable of useless developers'
> hours when I think of the endless discussions and changes in Cygwin; but MS
> is the one who defines the standards because of its very market power, so
> we have to deal with it, if we like or not. I'd definitely prefer to use Unix for
> its powerful tools, but most of the SW we use is simply not available for Unix,
> and MS does not provide gawk etc. So we have to deal with that CR/LF issue
> in a pragmatic rather than in a more, say, philosophical approach: We need
> to run our scripts with as little changes as possible. So that's why we upgrade
> Cygwin as seldom as possible. It is a "living system", yes, which is great on
> the one side - but can be annoying in everyday practice.

We are locked into the Windows world as well, so there's no way out of
that. So far I was more than happy that the gawk code comes with the
feature to silently swallow CRs (cf. the code reference with the exact
code line in my previous posting), and that feature was in use until the
last update. Now that things have, from our point of view, changed
tremendously, we were forced to run a decision process looking at the
alternatives (listed in my first email). The evaluation over the past
days led us to the decision to use another source of bilingual versions
of gawk and friends (i.e. they read CRLF and LF without any additional
hint). This is what the user can opt for.

> In my opinion there should be at least an option for gawk to accept both LF
> and CR/LF line endings equally, preferably with a system variable so that
> there is no need to change the command line call of gawk at all. That's what I
> vote for.

Fully agree - I would have been very much in favor of this as well. I
had something close to it in mind in my first posting.

> Kind regards,
> Wolfgang

Best regards,
J.
RE: gawk 4.1.4: CR separate char for CRLF files
On Wed, 9 Aug 2017 10:38 +, Jannick wrote:

--- snip ---

> Now I can see the following *easy* solutions to the very situation here
> (input only for now):
>
> 1 - Inserting the BEGIN section as you suggested into more than 1k scripts
> (not feasible due to additional regression test workload)
>
> 2 - Calling 'gawk -vRS=\r\n -vORS=\r\n' instead of 'gawk' (hack to turn back
> the additional the latest gawk's complexity, wrapper needed)
>
> 3 - Wrapping a d2u/u2d pipe solution (additional app and wrapper needed again)
>
> 4 - Using another compiled version of gawk which does *not* disable the
> out-of-the-box gawk feature to swallow CRs (cf., e.g.,
> http://git.savannah.gnu.org/cgit/gawk.git/tree/awkgram.y#n3543), i.e.
> without the artificial obstacle to now know the EOL type of the input file
> ahead of running gawk.
>
>> It works in all my cases. The only disadvantage: you have to know what kind
>
> ... plus the disadvantage to systematically amend all the scripts instead of
> having an external solution
>
>> of files you want to handle in the awk script. The same awk script
>> will not
>> work for DOS files as well as for linux files.
>
> ... another issue originated by the change and which didn't exist before.
>
>> Best
>>
>> Roger
>
> Please don't get me wrong, but this raises a real issue here and I am not
> sure which rationale other than 'let's get more of the Linux-feel' drove the
> decision.
>
> All the best,
> J.

--- snip ---

Another solution which we have been using for many years now, though it
might not be feasible for you:

We very rarely update Cygwin. We have been using Cygwin for some 15+
years now. We use tools like gawk (hundreds of scripts), head, tail,
sort, etc. that we are using in shell scripts running under cmd.exe (no
Unix shells involved). I soon realized that upgrades of Cygwin may cause
troubles with existing scripts, so we only update if we really need to
(e.g.: New functionality that would be important, 32 to 64 bit shift,
eventually new Windows versions, bugs we needed to be fixed).

I have followed the discussions about the CR/LF behaviour changes in the
past attentively and decided not to update in near future, because that
would lead to a massive problem with many hundreds of scripts - hoping
that sometimes there will be a change in gawk again.

What is Unix-like or OS-like or Posix-like behaviour in that context?
You could argue that gawk interprets line endings like the underlying OS
does (i. e., gawk reads LF in Unix and CR/LF in Win), or it interprets
line endings in a Unix-style no matter of the underlying OS used. That's
a developer's decision in my opinion.

But since with pipes or output redirection gawk used to write no CRs
even in previous versions, we already had the problem that gawk had to
accept *both* inputs, LF with or without CR. That worked widely fine so
far, since most Windows and other application SW we use accept both
record formats, fortunately (we had issues with SW upgrades of other
vendors no longer accepting pure LF, but that only concerned a very
small number of scripts). With the new approach in Cygwin that seems to
be broken, so we did not upgrade Cygwin since then (we currently use
gawk 4.1.3).

Of course the reason for that really annoying CR/LF thing is the
arrogance and ignorance of MS, which caused innumerable of useless
developers' hours when I think of the endless discussions and changes in
Cygwin; but MS is the one who defines the standards because of its very
market power, so we have to deal with it, if we like or not. I'd
definitely prefer to use Unix for its powerful tools, but most of the SW
we use is simply not available for Unix, and MS does not provide gawk
etc. So we have to deal with that CR/LF issue in a pragmatic rather than
in a more, say, philosophical approach: We need to run our scripts with
as little changes as possible. So that's why we upgrade Cygwin as seldom
as possible. It is a "living system", yes, which is great on the one
side - but can be annoying in everyday practice.

In my opinion there should be at least an option for gawk to accept both
LF and CR/LF line endings equally, preferably with a system variable so
that there is no need to change the command line call of gawk at all.
That's what I vote for.

Kind regards,
Wolfgang
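To make the problem behind this thread concrete: when awk reads a CRLF file without any translation, the CR stays attached as the last character of the last field. The sketch below shows the effect and the in-script fix discussed in the thread. It assumes an awk that does not strip CRs on its own (as is the case on Linux); the helper name sample is invented.

```shell
#!/bin/sh
# One CRLF-terminated record with two fields.
sample() { printf 'key value\r\n'; }

# No CR handling: the second field is "value" plus a CR, so it is one
# character longer than expected and does not compare equal to "value".
sample | awk '{ print length($2); print (($2 == "value") ? "equal" : "different") }'

# Stripping a trailing CR in a first rule restores the expected behaviour
# for both CRLF and LF input.
sample | awk '{ sub(/\r$/, "") }
              { print length($2); print (($2 == "value") ? "equal" : "different") }'
```

Quoting matters when options such as RS are passed from a POSIX shell: an unquoted -vRS=\r\n loses its backslashes before awk sees it, so in bash the value would need quotes, for example -v 'RS=\r\n'.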
Re: gawk 4.1.4: CR separate char for CRLF files
On 8/11/2017 12:54 PM, Brian Inglis wrote:
> On 2017-08-11 06:47, cyg Simple wrote:
>> On 8/10/2017 6:49 PM, Brian Inglis wrote:
>>> On 2017-08-10 15:49, cyg Simple wrote:
>>>> On 8/10/2017 5:34 PM, Brian Inglis wrote:
>>>>>> http://cygwin.com/ml/cygwin/2017-08/msg00104.html
>>>>>
>>>>> It is flowed format with quoted breaks, which I see reassembled and
>>>>> wrapped in the window by Thunderbird with no issues:
>>>>
>>>> So what setting do I have that is causing me to not see it. Every mail
>>>> David sends displays as empty for me.
>>>
>>> Enable/set wrap settings in config editor, and in Tools/Options/Composition
>>> tab/Send Options... button check Text format Send message as plain text if
>>> possible checkbox/select Convert the message to plain text dropdown; add
>>> cygwin.com and sourceware.org in Plain text domains tab.
>>
>> That is for sending mail, not reading it. What causes you to be able to
>> read David's mail and not me?
>
> First part:
>>> Enable/set wrap settings in config editor,
> search for wrap and set toggles to true and values to 80/72/...

Great, thanks for that. I can now read David's mail.

--
cyg Simple
Re: gawk 4.1.4: CR separate char for CRLF files
On 2017-08-11 06:47, cyg Simple wrote:
> On 8/10/2017 6:49 PM, Brian Inglis wrote:
>> On 2017-08-10 15:49, cyg Simple wrote:
>>> On 8/10/2017 5:34 PM, Brian Inglis wrote:
>>>>> http://cygwin.com/ml/cygwin/2017-08/msg00104.html
>>>>
>>>> It is flowed format with quoted breaks, which I see reassembled and
>>>> wrapped in the window by Thunderbird with no issues:
>>>
>>> So what setting do I have that is causing me to not see it. Every mail
>>> David sends displays as empty for me.
>>
>> Enable/set wrap settings in config editor, and in Tools/Options/Composition
>> tab/Send Options... button check Text format Send message as plain text if
>> possible checkbox/select Convert the message to plain text dropdown; add
>> cygwin.com and sourceware.org in Plain text domains tab.
>
> That is for sending mail, not reading it. What causes you to be able to
> read David's mail and not me?

First part:
>> Enable/set wrap settings in config editor,
search for wrap and set toggles to true and values to 80/72/...

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
Re: gawk 4.1.4: CR separate char for CRLF files
On 8/10/2017 6:49 PM, Brian Inglis wrote:
> On 2017-08-10 15:49, cyg Simple wrote:
>> On 8/10/2017 5:34 PM, Brian Inglis wrote:
>>>> http://cygwin.com/ml/cygwin/2017-08/msg00104.html
>>>
>>> It is flowed format with quoted breaks, which I see reassembled and
>>> wrapped in the window by Thunderbird with no issues:
>>
>> So what setting do I have that is causing me to not see it. Every mail
>> David sends displays as empty for me.
>
> Enable/set wrap settings in config editor, and in Tools/Options/Composition
> tab/Send Options... button check Text format Send message as plain text if
> possible checkbox/select Convert the message to plain text dropdown; add
> cygwin.com and sourceware.org in Plain text domains tab.

That is for sending mail, not reading it. What causes you to be able to
read David's mail and not me?

--
cyg Simple
Re: gawk 4.1.4: CR separate char for CRLF files
On Thu, 10 Aug 2017 16:48:47, Brian Inglis wrote:
> Many archives and sites display lines off the right margin instead of
> allowing them to wrap as normal in HTML. Possibly using pre format style
> without horizontal scrollbars instead of just specifying a monospace
> font style. That makes it a site or converter design issue!

Nope. Wrong. David has been doing this for over 2 years:

http://cygwin.com/ml/cygwin/2015-01/msg00232.html

So it is a user issue. The user must hard wrap because Cygwin site does
not. When he knowingly disregards this he does it to the detriment of
all users of the archives.
Re: gawk 4.1.4: CR separate char for CRLF files
On 2017-08-10 15:49, cyg Simple wrote:
> On 8/10/2017 5:34 PM, Brian Inglis wrote:
>>> http://cygwin.com/ml/cygwin/2017-08/msg00104.html
>>
>> It is flowed format with quoted breaks, which I see reassembled and
>> wrapped in the window by Thunderbird with no issues:
>
> So what setting do I have that is causing me to not see it. Every mail
> David sends displays as empty for me.

Enable/set wrap settings in config editor, and in
Tools/Options/Composition tab/Send Options... button check Text format
Send message as plain text if possible checkbox/select Convert the
message to plain text dropdown; add cygwin.com and sourceware.org in
Plain text domains tab.

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
Re: gawk 4.1.4: CR separate char for CRLF files
On 2017-08-10 16:22, Steven Penny wrote:
> On Thu, 10 Aug 2017 15:34:11, Brian Inglis wrote:
>> It is flowed format with quoted breaks, which I see reassembled and
>> wrapped in the window by Thunderbird with no issues:
>
> Thats great, but it doesnt do that with Firefox, and it doesnt do that with
> Internet Explorer. So for people reading the mailing list via the archives
> (read: me), each line just scrolls off the page until OP decides to break
> for a paragraph.

Many archives and sites display lines off the right margin instead of
allowing them to wrap as normal in HTML. Possibly using pre format style
without horizontal scrollbars instead of just specifying a monospace
font style. That makes it a site or converter design issue!

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
Re: gawk 4.1.4: CR separate char for CRLF files
On Thu, 10 Aug 2017 15:34:11, Brian Inglis wrote:
> It is flowed format with quoted breaks, which I see reassembled and
> wrapped in the window by Thunderbird with no issues:

Thats great, but it doesnt do that with Firefox, and it doesnt do that
with Internet Explorer. So for people reading the mailing list via the
archives (read: me), each line just scrolls off the page until OP
decides to break for a paragraph.
Re: gawk 4.1.4: CR separate char for CRLF files
On 8/10/2017 5:34 PM, Brian Inglis wrote:
>> http://cygwin.com/ml/cygwin/2017-08/msg00104.html
>
> It is flowed format with quoted breaks, which I see reassembled and
> wrapped in the window by Thunderbird with no issues:

So what setting do I have that is causing me to not see it. Every mail
David sends displays as empty for me.

--
cyg Simple
Re: gawk 4.1.4: CR separate char for CRLF files
On 2017-08-10 12:35, Steven Penny wrote:
> On Thu, 10 Aug 2017 10:45:34, cyg Simple wrote:
>> David, I don't know what it is about your email that my thunderbird
>> client doesn't like but I can't read your email except from reviewing
>> the message source.
>
> Hes using quoted-printable, but he is not actually breaking on 80, so it just
> comes out as one long line. Really annoying, and I usually wont even reads
> posts from people who do that. Here is an example:
>
> http://cygwin.com/ml/cygwin/2017-08/msg00104.html

It is flowed format with quoted breaks, which I see reassembled and
wrapped in the window by Thunderbird with no issues:

> Content-Type: text/plain; charset=utf-8; format=flowed
> Content-Language: en-US
> Content-Transfer-Encoding: quoted-printable
...
> I feel the need to correct you slightly. Although Linux is a good model, C=
> ygwin primarily strives to be a good *POSIX* platform, so there may be case=
> s where the two intentionally differ.

displays as:

I feel the need to correct you slightly. Although Linux is a good model,
Cygwin primarily strives to be a good *POSIX* platform, so there may be
cases where the two intentionally differ.

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
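The reassembly shown in this message can be sketched in a few lines. In quoted-printable, a "=" at the very end of an encoded line is a soft line break, meaning the next line continues the same text. The sketch below only joins soft breaks; it does not decode =XX hex escapes and does not implement format=flowed, so it is an illustration rather than a decoder. The function name join_soft_breaks is invented.

```shell
#!/bin/sh
# join_soft_breaks (hypothetical name): remove quoted-printable soft line
# breaks, i.e. a trailing "=" followed by the line ending.
join_soft_breaks() {
    awk '{
        sub(/\r$/, "")          # tolerate CRLF transport line endings
        if (sub(/=$/, ""))      # soft break: emit the text without a newline
            printf "%s", $0
        else
            print
    }'
}

# The two encoded lines come out as one line of text.
printf 'Although Linux is a good model, C=\nygwin primarily strives\n' |
    join_soft_breaks
```

A client that skips this step shows the raw encoded lines, and an archive that performs it but never re-wraps shows each paragraph as one very long line, which is the complaint raised in the replies.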
Re: gawk 4.1.4: CR separate char for CRLF files
On Thu, 10 Aug 2017 10:45:34, cyg Simple wrote:
> David, I don't know what it is about your email that my thunderbird
> client doesn't like but I can't read your email except from reviewing
> the message source.

He's using quoted-printable, but he is not actually breaking on 80, so it just comes out as one long line. Really annoying, and I usually won't even read posts from people who do that. Here is an example:

http://cygwin.com/ml/cygwin/2017-08/msg00104.html
Re: gawk 4.1.4: CR separate char for CRLF files
On 8/10/2017 8:31 AM, David Macek wrote:

David, I don't know what it is about your email that my thunderbird client doesn't like but I can't read your email except from reviewing the message source.

Your assumption that Cygwin strives to be a good *POSIX* platform also applies to Linux. If you find a difference then there is a discrepancy that should be documented or resolved. However, you should determine if you need to resolve the difference if you're on differing platforms. So my statement of "if it doesn't work on Linux but does on Cygwin" still needs to be considered because of portability issues.

--
cyg Simple
Re: gawk 4.1.4: CR separate char for CRLF files
On 10. 8. 2017 14:04, cyg Simple wrote:
> The clue here is, does it only work for this type of OS? If yes then it
> isn't portable anyway but should it be? And does it only work on this
> type of OS because of an issue that could change as a result of a fix.
> Cygwin has always been and will always be a work in progress. The rule
> of thumb "does it work on Linux" should be applied to all that you do
> with Cygwin. If it only works on Cygwin and not on Linux then the
> chances are, something will change.

I feel the need to correct you slightly. Although Linux is a good model, Cygwin primarily strives to be a good *POSIX* platform, so there may be cases where the two intentionally differ.

--
David Macek
Re: gawk 4.1.4: CR separate char for CRLF files
On 8/9/2017 3:09 PM, Eric Blake wrote:
> On 08/09/2017 06:03 AM, Eric Blake wrote:
>> On 08/09/2017 03:37 AM, Jannick wrote:
>>
>>> Which is pretty much a pain when there is no easy fallback solution
>>> provided in case a major change is applied.
> ...
>>> This is - to say the least - unpleasant in the light of what Cygwin claims
>>> to be, namely 'a large collection of GNU and Open Source tools which provide
>>> functionality similar to a Linux distribution on Windows' (from the top of
>>> the start website www.cygwin.com).
>>
>> On Linux, nothing strips CR automatically. So on Cygwin, we behave the
>> same - nothing strips CR automatically on binary mounted data.
>>
>> And the fact that the change was made AND ANNOUNCED back in February,
>> but you are now only 6 months later complaining about it, is telling.
>
> It was pointed out to me off-list that my reply can easily be mis-read
> in a much more negative tone than I intended, so I'm apologizing for
> coming across as mean (yes, I know, https://cygwin.com/acronyms/#WJM).
> I think I was trying to emphasize that complaints about the behavior
> change at the time of the change were expected (and there was indeed a
> reaction, although I was pleasantly surprised at the time that it was
> limited to just a few threads, so apparently not many people were
> negatively impacted - and that's a good thing). But complaints about
> the behavior after six months are a bit unexpected. But I guess not
> everyone keeps their software up-to-date on quite as frequent a
> schedule, so I shouldn't have been as surprised or reacted as harshly.
>

I don't think you need to apologize; in fact your post stopped me from posting similarly.

> At any rate, my advice continues to be the same: how would you deal with
> CRLF on a Linux system? That's the ideal way to also deal with it on
> Cygwin (we used to have gratuitous incompatibilities between the systems
> where the same command line on Linux did not have the same result as on
> Cygwin; but the change back in February was to get rid of those
> incompatibilities, even if it breaks scripts that were unwisely relying
> on the incompatibilities).
>

The clue here is, does it only work for this type of OS? If yes then it isn't portable anyway but should it be? And does it only work on this type of OS because of an issue that could change as a result of a fix. Cygwin has always been and will always be a work in progress. The rule of thumb "does it work on Linux" should be applied to all that you do with Cygwin. If it only works on Cygwin and not on Linux then the chances are, something will change.

--
cyg Simple
Re: gawk 4.1.4: CR separate char for CRLF files
On 08/09/2017 06:03 AM, Eric Blake wrote:
> On 08/09/2017 03:37 AM, Jannick wrote:
>
>> Which is pretty much a pain when there is no easy fallback solution
>> provided in case a major change is applied.
...
>> This is - to say the least - unpleasant in the light of what Cygwin claims
>> to be, namely 'a large collection of GNU and Open Source tools which provide
>> functionality similar to a Linux distribution on Windows' (from the top of
>> the start website www.cygwin.com).
>
> On Linux, nothing strips CR automatically. So on Cygwin, we behave the
> same - nothing strips CR automatically on binary mounted data.
>
> And the fact that the change was made AND ANNOUNCED back in February,
> but you are now only 6 months later complaining about it, is telling.

It was pointed out to me off-list that my reply can easily be mis-read in a much more negative tone than I intended, so I'm apologizing for coming across as mean (yes, I know, https://cygwin.com/acronyms/#WJM). I think I was trying to emphasize that complaints about the behavior change at the time of the change were expected (and there was indeed a reaction, although I was pleasantly surprised at the time that it was limited to just a few threads, so apparently not many people were negatively impacted - and that's a good thing). But complaints about the behavior after six months are a bit unexpected. But I guess not everyone keeps their software up-to-date on quite as frequent a schedule, so I shouldn't have been as surprised or reacted as harshly.

At any rate, my advice continues to be the same: how would you deal with CRLF on a Linux system? That's the ideal way to also deal with it on Cygwin (we used to have gratuitous incompatibilities between the systems where the same command line on Linux did not have the same result as on Cygwin; but the change back in February was to get rid of those incompatibilities, even if it breaks scripts that were unwisely relying on the incompatibilities).

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
Re: gawk 4.1.4: CR separate char for CRLF files
On 08/09/2017 03:37 AM, Jannick wrote:
> Which is pretty much a pain when there is no easy fallback solution
> provided in case a major change is applied. E.g. for sed - if I understand
> the reference to sed in https://cygwin.com/ml/cygwin/2017-08/msg00033.html
> correctly - a separate switch '-b' is added.

Incorrect. 'sed -b' has always existed, but did NOT do what you wanted (it forced CR to be treated as a separate character; where what you want is to ignore CR if it appears before LF). In fact, the coordinated change made back in February to all of grep, sed, and awk, was that all three programs now default to what used to be possible only through 'sed -b', because silently stripping CR can corrupt data when you are not expecting it, while requiring the user to explicitly strip CR when they know they are working with CRLF line endings is less magic (fewer downstream patches, and more obvious in looking at a script that the script knows what it is doing).

If your data lives on a text mount (instead of a binary mount), then you still get CR stripping for free. If your data comes from a pipeline rather than the file system, then you can add a d2u or other CR-stripping tool in the pipeline.

> This is - to say the least - unpleasant in the light of what Cygwin claims
> to be, namely 'a large collection of GNU and Open Source tools which provide
> functionality similar to a Linux distribution on Windows' (from the top of
> the start website www.cygwin.com).

On Linux, nothing strips CR automatically. So on Cygwin, we behave the same - nothing strips CR automatically on binary mounted data.

And the fact that the change was made AND ANNOUNCED back in February, but you are now only 6 months later complaining about it, is telling.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
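[Editor's sketch] The "CR-stripping tool in the pipeline" suggestion above could look roughly like the following. This is only an illustration under stated assumptions: `tr -d '\r'` stands in for d2u so the example does not depend on the dos2unix package, and the function names `strip_cr` and `last_field_demo` are hypothetical, not anything from the thread.

```shell
# Sketch only: remove CRs in the pipeline before awk sees the data.
# `tr -d '\r'` is a stand-in for d2u (an assumption); note that it deletes
# every CR in the stream, not only those directly before LF.
strip_cr() {
    tr -d '\r'
}

# Hypothetical demo: two CRLF-terminated records, print the last field of each.
last_field_demo() {
    printf 'a b\r\nc d\r\n' | strip_cr | awk '{ print $NF }'
}

last_field_demo
```

Without the `strip_cr` stage the last field of each record would still carry the CR, which is the behaviour the thread is about.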
RE: gawk 4.1.4: CR separate char for CRLF files
Hi Roger,

On Wed, 9 Aug 2017 07:03:24 +, Roger Krebs wrote:
> I've added a BEGIN section at the beginning of the awk script file setting the record
> separator explicitly for the input file (RS) as well as for the output file (ORS):
>
> BEGIN {
>     RS="\r\n"
>     ORS="\r\n"
> }
> {
>     ... your script
> }
>
> Especially the RS parameter wasn't necessary in the past but now it is.

Which is pretty much a pain when there is no easy fallback solution provided in case a major change is applied. E.g. for sed - if I understand the reference to sed in https://cygwin.com/ml/cygwin/2017-08/msg00033.html correctly - a separate switch '-b' is added. For the latest gawk version I cannot see anything like that, which means that all of our awk scripts run against cygwin's gawk break without a tweak, unless I am missing something here.

This is - to say the least - unpleasant in the light of what Cygwin claims to be, namely 'a large collection of GNU and Open Source tools which provide functionality similar to a Linux distribution on Windows' (from the top of the start website www.cygwin.com).

Again, admittedly I did not dive into the discussion and the substance of the reasoning to make this move to gawk | sed | grep.

Now I can see the following *easy* solutions to the very situation here (input only for now):

1 - Inserting the BEGIN section as you suggested into more than 1k scripts (not feasible due to additional regression test workload)
2 - Calling 'gawk -vRS=\r\n -vORS=\r\n' instead of 'gawk' (hack to roll back the latest gawk's additional complexity, wrapper needed)
3 - Wrapping a d2u/u2d pipe solution (additional app and wrapper needed again)
4 - Using another compiled version of gawk which does *not* disable the out-of-the-box gawk feature to swallow CRs (cf., e.g., http://git.savannah.gnu.org/cgit/gawk.git/tree/awkgram.y#n3543), i.e. without the artificial obstacle of having to know the EOL type of the input file ahead of running gawk.

> It works in all my cases. The only disadvantage: you have to know what kind

... plus the disadvantage to systematically amend all the scripts instead of having an external solution

> of files you want to handle in the awk script. The same awk script will not
> work for DOS files as well as for linux files.

... another issue originated by the change and which didn't exist before.

> Best
>
> Roger

Please don't get me wrong, but this raises a real issue here and I am not sure which rationale other than 'let's get more of the Linux-feel' drove the decision.

All the best,
J.

> -Original Message-
> From: cygwin-ow...@cygwin.com [mailto:cygwin-ow...@cygwin.com] On
> Behalf Of Jannick
> Sent: Wednesday, 9 August 2017 02:48
> To: cygwin@cygwin.com
> Subject: RE: gawk 4.1.4: CR separate char for CRLF files
>
> On Tue, 08 Aug 2017 16:23:40 -0700 (PDT), Steven Penny wrote:
> > On Wed, 9 Aug 2017 01:15:08, "Jannick" wrote:
> > > the current version 4.1.4 of gawk appears to unpleasantly treat CR
> > > for CRLF files, i.e. CR is not gracefully swallowed, but is a
> > > separate character.
> > >
> > > This makes some, if not all, of the scripts we are working with here
> > > useless, unless the input files are converted to LF which certainly
> > > is not feasible. IIRC the issue did not show up some versions back.
> > >
> > > Is this a bug - or am I missing something here?
> >
> > Learn to read:
> >
> > http://cygwin.com/ml/cygwin/2017-08/msg00033.html
>
> Thanks - quickly done.
>
> The link reveals that CRLF/LF conversion is now mandatory to work with
> cygwin's gawk on DOS machines. As far as I can see there is no legacy
> solution like for, e.g., sed (-b switch) to have an easy solution for the issue,
> especially when invoking gawk from makefiles (piping).
>
> I consider this bad news while admittedly not fully understanding the whole
> background of the move which is not necessary for now.
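[Editor's sketch] On the point that a script with RS="\r\n" hard-coded will not handle both DOS and Linux files: one possible alternative, not proposed by anyone in the thread, is to leave RS alone and drop an optional trailing CR from each record inside the script. The helper name `last_field` below is hypothetical, and the output is written with plain LF endings (ORS would have to be set separately if CRLF output is wanted).

```shell
# Hypothetical helper: drop one trailing CR from each record, if present,
# then print the last field. Assigning to $0 (via sub) makes awk re-split
# the fields, so $NF no longer carries the CR.
last_field() {
    awk '{ sub(/\r$/, ""); print $NF }'
}

# Mixed input: the first record ends in CRLF, the second in LF.
printf 'a b\r\nc d\n' | last_field
```

This still means touching every script, so it does not answer the objection about amending more than 1k scripts; it only avoids having to know the EOL type of the input ahead of time.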
RE: gawk 4.1.4: CR separate char for CRLF files
On Tue, 08 Aug 2017 16:23:40 -0700 (PDT), Steven Penny wrote:
> On Wed, 9 Aug 2017 01:15:08, "Jannick" wrote:
> > the current version 4.1.4 of gawk appears to unpleasantly treat CR for
> > CRLF files, i.e. CR is not gracefully swallowed, but is a separate character.
> >
> > This makes some, if not all, of the scripts we are working with here
> > useless, unless the input files are converted to LF which certainly is
> > not feasible. IIRC the issue did not show up some versions back.
> >
> > Is this a bug - or am I missing something here?
>
> Learn to read:
>
> http://cygwin.com/ml/cygwin/2017-08/msg00033.html

Thanks - quickly done.

The link reveals that CRLF/LF conversion is now mandatory to work with cygwin's gawk on DOS machines. As far as I can see there is no legacy solution like for, e.g., sed (-b switch) to have an easy solution for the issue, especially when invoking gawk from makefiles (piping).

I consider this bad news while admittedly not fully understanding the whole background of the move which is not necessary for now.
Re: gawk 4.1.4: CR separate char for CRLF files
On Wed, 9 Aug 2017 01:15:08, "Jannick" wrote:
> the current version 4.1.4 of gawk appears to unpleasantly treat CR for
> CRLF files, i.e. CR is not gracefully swallowed, but is a separate character.
>
> This makes some, if not all, of the scripts we are working with here
> useless, unless the input files are converted to LF which certainly is
> not feasible. IIRC the issue did not show up some versions back.
>
> Is this a bug - or am I missing something here?

Learn to read:

http://cygwin.com/ml/cygwin/2017-08/msg00033.html