Thank you, Avraham. Files with the .xlsx extension are zipped XML, so
mode = "wb" is required on Windows for the download to be readable.
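
To illustrate why, here is a minimal sketch. It is not R's actual download
code: simulate_text_mode() is a hypothetical helper that mimics what a
text-mode write does on Windows (each LF byte is written as CR LF), and the
payload is a few made-up bytes behind the ZIP signature, not a real .xlsx file.

```r
# ZIP local-file-header signature: the bytes "PK\003\004".
zip_sig <- as.raw(c(0x50, 0x4b, 0x03, 0x04))

# TRUE when a raw vector begins with the ZIP signature.
starts_with_zip_sig <- function(bytes) {
  length(bytes) >= 4L && identical(bytes[1:4], zip_sig)
}

# Hypothetical helper: mimic a text-mode write on Windows by inserting
# a CR (0x0d) in front of every LF (0x0a).
simulate_text_mode <- function(bytes) {
  is_lf <- bytes == as.raw(0x0a)
  out <- raw(length(bytes) + sum(is_lf))
  pos <- seq_along(bytes) + cumsum(is_lf)  # new position of each original byte
  out[pos] <- bytes
  out[pos[is_lf] - 1L] <- as.raw(0x0d)
  out
}

# Made-up payload: ZIP signature followed by arbitrary bytes, one of them 0x0a.
payload <- c(zip_sig, as.raw(c(0x14, 0x00, 0x0a, 0x00, 0x08)))
mangled <- simulate_text_mode(payload)

starts_with_zip_sig(payload)   # the container is ZIP, not plain XML
length(payload)                # 9
length(mangled)                # 10: one extra byte was inserted
identical(payload, mangled)    # FALSE: the archive no longer matches
```

The signature itself survives, but every byte after the inserted CR is
shifted, which is enough to break the offsets a ZIP reader relies on.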

Kind regards
Hernando

________________________________
From: R-devel <r-devel-boun...@r-project.org> on behalf of Avraham Adler 
<avraham.ad...@gmail.com>
Sent: Sunday, August 10, 2025 2:52:49 PM
To: Paul McQuesten <mcques...@gmail.com>
Cc: R-devel <r-devel@r-project.org>; Hernando Cortina <h...@alum.mit.edu>
Subject: Re: [Rd] Including mode='wb' in download.file() for .xlsx files on 
Windows ?

If I recall correctly, xlsx files are XML. It is the xls/xlsb files which are 
binary.

https://learn.microsoft.com/en-us/openspecs/office_standards/ms-xlsx/2c5dee00-eff2-4b22-92b6-0738acd4475e

Sent from my iPhone

> On Aug 10, 2025, at 2:38 PM, Paul McQuesten <mcques...@gmail.com> wrote:
>
> Perhaps it would be simpler, and more future-proof, for R to always
> download as binary.
> Are there any modern consumers of text files that are bothered by '\r\n'?
> Or even Macintosh '\r' line terminators?
>
>> On Sun, Aug 10, 2025 at 1:22 PM Hernando Cortina <h...@alum.mit.edu> wrote:
>>
>> Yes, .docx and .pptx are part of the same specification.
>>
>> Kind regards
>> Hernando
>>
>> *From: *Paul McQuesten <mcques...@gmail.com>
>> *Date: *Sunday, August 10, 2025 at 1:34 PM
>> *To: *Hernando Cortina <h...@alum.mit.edu>
>> *Subject: *Re: [Rd] Including mode='wb' in download.file() for .xlsx
>> files on Windows ?
>>
>> IIUC, '.docx' files are also binary?
>>
>> On Sun, Aug 10, 2025 at 11:29 AM Hernando Cortina <hcortin...@gmail.com>
>> wrote:
>>
>> Hello all, regarding download.file():
>>
>> On Windows, if mode is not supplied (missing()) and url ends in one of
>> ‘.gz’, ‘.bz2’, ‘.xz’, ‘.tgz’, ‘.zip’, ‘.jar’, ‘.rda’, ‘.rds’, ‘.RData’
>> or ‘.pdf’, mode = "wb" is set so that a binary transfer is done to help
>> unwary users.
>>
>> May I suggest adding .xlsx to the list of extensions that get this
>> treatment?
>>
>> Downloading such files is probably quite common in the R community,
>> and having to add mode = "wb" manually may well catch Windows users
>> unaware, particularly if they are coming from Linux or Mac, where this
>> is not necessary.
>>
>> I understand that it’s hard to know when to stop when adding
>> additional extensions.  That said, .xlsx is quite ubiquitous in the
>> wild and standardized under ECMA-376.
>>
>> I hope this might be helpful to others, and thank you for your
>> consideration.
>> Hernando
>> ---------------
>>
>> The change in src/library/utils/R/Windows/download.file.R would be:
>>
>> …
>>
>> if(missing(mode) &&
>>    length(grep("\\.(gz|bz2|xz|tgz|zip|jar|rd[as]|RData|xlsx)$",
>>                URLdecode(url))))
>>     mode <- "wb"
>>
>> …
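
For anyone who wants to see which URLs the proposed pattern would catch, here
is a small sketch. The regular expression is copied from the proposal above;
wants_binary() and the example.com URLs are hypothetical and only serve the
illustration.

```r
# Pattern as proposed in this thread (the existing extensions plus xlsx).
binary_ext <- "\\.(gz|bz2|xz|tgz|zip|jar|rd[as]|RData|xlsx)$"

# Hypothetical helper mirroring the check: TRUE when mode would default to "wb".
wants_binary <- function(url) {
  grepl(binary_ext, utils::URLdecode(url))
}

# Made-up URLs for illustration only.
urls <- c(
  "https://example.com/data/report.xlsx",
  "https://example.com/data/report.xls",
  "https://example.com/data/my%20file.xlsx",
  "https://example.com/download?file=report.xlsx&v=2"
)

# Apply one URL at a time, since the helper is written for a single string.
result <- vapply(urls, wants_binary, logical(1), USE.NAMES = FALSE)
result  # TRUE FALSE TRUE FALSE
```

Because the pattern is anchored at the end of the URL, a link with a trailing
query string is not matched, so passing mode = "wb" explicitly is still the
safer habit for such links.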
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>
>    [[alternative HTML version deleted]]
>
