Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-12 Thread Joris Meys
I can confirm that Excel does all kinds of strange things when opening a csv file and saving it from Excel, including adding another, unnecessary set of quotes around already quoted text fields. But I never had problems with Excel not handling Linux-type line endings correctly. I'll see if I can

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-09 Thread Dirk Eddelbuettel
On 9 May 2018 at 10:37, Tomas Kalibera wrote: | And for that reason the behavior should be as intuitive as possible when | designed. What was intuitive 15-20 years ago may not be intuitive now, | but that should probably not be a justification for a change in | documented behavior. Time for

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-09 Thread peter dalgaard
There was a hint in the Twitterverse that Excel has issues with line endings in .csv. Can anyone elaborate on that? Then again, Excel goes belly-up on comma separators in central European locales anyway... -pd > On 8 May 2018, at 22:47 , Hadley Wickham wrote: > > > Also

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-09 Thread Duncan Murdoch
On 08/05/2018 4:47 PM, Hadley Wickham wrote: On Tue, May 8, 2018 at 8:15 AM, Hadley Wickham wrote: On Thu, May 3, 2018 at 11:34 PM, Tomas Kalibera wrote: On 05/03/2018 11:14 PM, Henrik Bengtsson wrote: Also, as mentioned in my

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-09 Thread Tomas Kalibera
On 05/08/2018 05:15 PM, Hadley Wickham wrote: On Thu, May 3, 2018 at 11:34 PM, Tomas Kalibera wrote: On 05/03/2018 11:14 PM, Henrik Bengtsson wrote: Also, as mentioned in my https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when not specifying the mode

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-08 Thread Hadley Wickham
On Tue, May 8, 2018 at 8:15 AM, Hadley Wickham wrote: > On Thu, May 3, 2018 at 11:34 PM, Tomas Kalibera > wrote: >> On 05/03/2018 11:14 PM, Henrik Bengtsson wrote: >>> >>> Also, as mentioned in my >>>

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-08 Thread Hadley Wickham
On Thu, May 3, 2018 at 11:34 PM, Tomas Kalibera wrote: > On 05/03/2018 11:14 PM, Henrik Bengtsson wrote: >> >> Also, as mentioned in my >> https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when >> not specifying the mode argument, the default on Windows is

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-07 Thread Gabe Becker
Hey all, I don't have a strong opinion about whether the default should eventually change or not. Many people who use Windows (a set which does not include me) seem to think it would be better. I will say that like Hugh, I'm strongly against making the argument mandatory as an interim

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-07 Thread Hugh Parsonage
I'd add my support for mode = "wb" to (eventually) become the default, though I respect Tomas's comments about backwards-compatibility. Instead of making the argument mandatory (which would immediately break scripts -- even ones that won't be helped by changing to mode = 'wb') or otherwise

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-07 Thread Joris Meys
Martin, also from me a heartfelt thank you for taking care of this. Some thoughts on Henrik's response: On Mon, May 7, 2018 at 2:28 AM, Henrik Bengtsson wrote: > > I still argue that the current behavior causes more harm than good. > I agree with your analysis

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-06 Thread Henrik Bengtsson
Thanks for the comments, feedback, and improvements. I still argue that the current behavior causes more harm than good. First of all, it increases the risk of code that does not work on all platforms, and cross-platform support is, I'd say, one of the strengths and design goals of R. To write cross-platform

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-04 Thread Martin Maechler
> Joris Meys > on Fri, 4 May 2018 10:00:07 +0200 writes: > On Fri, May 4, 2018 at 8:34 AM, Tomas Kalibera > wrote: >> The current heuristic/hack is in line with the >> compatibility approach: it detects files that are

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-04 Thread Joris Meys
On Fri, May 4, 2018 at 8:34 AM, Tomas Kalibera wrote: > The current heuristic/hack is in line with the compatibility approach: it > detects files that are obviously binary, so it changes the default behavior > only for cases when it would obviously cause damage. > >

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-04 Thread Martin Maechler
> Tomas Kalibera > on Fri, 4 May 2018 08:34:03 +0200 writes: > On 05/03/2018 11:14 PM, Henrik Bengtsson wrote: >> Also, as mentioned in my >> https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, >> when not specifying the mode

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-04 Thread Tomas Kalibera
On 05/03/2018 11:14 PM, Henrik Bengtsson wrote: Also, as mentioned in my https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when not specifying the mode argument, the default on Windows is mode = "w" *except* for certain, case-sensitive, filename extensions: if(missing(mode)

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Henrik Bengtsson
Also, as mentioned in my https://stat.ethz.ch/pipermail/r-devel/2012-August/064739.html, when not specifying the mode argument, the default on Windows is mode = "w" *except* for certain, case-sensitive, filename extensions: if(missing(mode) && length(grep("\\.(gz|bz2|xz|tgz|zip|rda|RData)$",
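A minimal sketch of how the case-sensitive extension pattern quoted above behaves. The pattern is copied from the message; the helper `guess_mode()` and the example file names are hypothetical, and the sketch assumes the pattern is tested against the URL string when `mode` is not supplied.

```r
# Pattern quoted in the message above; the match is case-sensitive.
binary_ext <- "\\.(gz|bz2|xz|tgz|zip|rda|RData)$"

# Hypothetical helper (not part of R): the mode this pattern would imply
# for a URL when the caller does not pass `mode` explicitly.
guess_mode <- function(url) if (grepl(binary_ext, url)) "wb" else "w"

# Made-up example names, including upper/lower-case variants.
urls <- c("data.csv.gz", "DATA.CSV.GZ", "archive.zip",
          "model.rds", "workspace.RData", "workspace.rdata")
modes <- vapply(urls, guess_mode, character(1), USE.NAMES = FALSE)

print(data.frame(url = urls, mode = modes))
```

Names such as `DATA.CSV.GZ`, `model.rds` and `workspace.rdata` do not match the pattern, so under this assumption they would fall back to text mode unless `mode = "wb"` is passed explicitly.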

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Joris Meys
Thank you Henrik and Martin for explaining what was going on. Very insightful! On Thu, May 3, 2018 at 4:21 PM, Jeroen Ooms wrote: > On Thu, May 3, 2018 at 2:42 PM, Henrik Bengtsson > wrote: > > Use mode="wb" when you download the file. See > >

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Jeroen Ooms
On Thu, May 3, 2018 at 2:42 PM, Henrik Bengtsson wrote: > Use mode="wb" when you download the file. See > https://github.com/HenrikBengtsson/Wishlist-for-R/issues/30. > > R core, and others, is there a good argument for why we are not making this > the default download

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Martin Morgan
On 05/03/2018 05:48 AM, Joris Meys wrote: Dear all, I've been diving a bit deeper into this at the request of Tomas Kalibera, and found the following: - the lock on the file appears only after trying to read it using oligo, so that's not an R problem in itself. The problem is independent of external

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Duncan Murdoch
On 03/05/2018 8:42 AM, Henrik Bengtsson wrote: Use mode="wb" when you download the file. See https://github.com/HenrikBengtsson/Wishlist-for-R/issues/30. R core, and others, is there a good argument for why we are not making this the default download mode? It seems like such a simple fix to

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Joris Meys
Dear all, I've been diving a bit deeper into this at the request of Tomas Kalibera, and found the following: - the lock on the file appears only after trying to read it using oligo, so that's not an R problem in itself. The problem is independent of external packages. - using Windows' fc utility and
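For readers without access to fc, a byte-level comparison can also be sketched in R. The helper `compare_files()` is hypothetical, and the two small files are made-up bytes that mimic an LF being expanded to CR LF, the kind of change a text-mode write on Windows can introduce; they are not the files discussed in the thread.

```r
# Hypothetical helper: compare two files byte by byte.
compare_files <- function(a, b) {
  ra <- readBin(a, what = "raw", n = file.size(a))
  rb <- readBin(b, what = "raw", n = file.size(b))
  n <- min(length(ra), length(rb))
  diff_at <- which(ra[seq_len(n)] != rb[seq_len(n)])
  list(size_a = length(ra),
       size_b = length(rb),
       identical = identical(ra, rb),
       first_diff = if (length(diff_at)) diff_at[1] else NA_integer_)
}

# Illustrative data only: the second file has 0x0a replaced by 0x0d 0x0a.
good <- tempfile()
bad <- tempfile()
writeBin(as.raw(c(0x1f, 0x8b, 0x0a, 0x00)), good)
writeBin(as.raw(c(0x1f, 0x8b, 0x0d, 0x0a, 0x00)), bad)

res <- compare_files(good, bad)
str(res)
```

Comparing the file sizes and the position of the first differing byte is often enough to tell whether a download was altered in transit or on write.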

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Martin Morgan
On 05/02/2018 03:21 PM, Joris Meys wrote: Dear all, I've noticed a problem when trying to download gz files from here: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811 At the bottom one can download GSM907811.CEL.gz. If I download this manually and try

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Joris Meys
Using the correct mode absolutely solves it. Apologies for not trying the obvious. Cheers Joris On Thu, May 3, 2018 at 2:10 PM, Martin Morgan wrote: > > > On 05/02/2018 03:21 PM, Joris Meys wrote: > >> Dear all, >> >> I've noticed by trying to download gz files

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Henrik Bengtsson
Use mode="wb" when you download the file. See https://github.com/HenrikBengtsson/Wishlist-for-R/issues/30. R core, and others, is there a good argument for why we are not making this the default download mode? It seems like such a simple fix to such a common "mistake". Henrik On Thu, May 3,
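The advice above can be sketched as follows. To keep the example self-contained, a small local gzip file reached through a file:// URL stands in for a real remote download; the file contents and variable names are assumptions made for illustration.

```r
# Create a small gzip-compressed file to stand in for a remote .gz file.
src <- tempfile(fileext = ".gz")
con <- gzfile(src, "w")
writeLines(c("a,b", "1,2", "3,4"), con)
close(con)

# Build a file:// URL for the local file (stand-in for an http(s) URL).
p <- normalizePath(src, winslash = "/")
url <- paste0("file://", if (startsWith(p, "/")) "" else "/", p)

# mode = "wb" requests a binary write, so the bytes are stored unchanged
# regardless of platform.
dest <- tempfile(fileext = ".gz")
status <- download.file(url, destfile = dest, mode = "wb", quiet = TRUE)

# Verify that the downloaded copy is byte-identical to the source.
same <- identical(unname(tools::md5sum(src)), unname(tools::md5sum(dest)))
print(c(status = status, identical = same))
```

Passing `mode = "wb"` explicitly makes the call independent of the platform default and of the file name's extension.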

[Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Joris Meys
Dear all, I've noticed a problem when trying to download gz files from here: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811 At the bottom one can download GSM907811.CEL.gz. If I download this manually and try oligo::read.celfiles("GSM907811.CEL.gz") everything works fine. (oligo is a