Re: [R] readxl, read_excel: how colon (:) is read ?

2022-04-01 Thread Jeff Newmiller
Why not simply let readxl read the timestamp as it will naturally and use

dta$Arrival_time_c <- as.character( dta$Arrival_time, format = "%H:%M" )

... though time data in such a form is remarkably useless for data analysis.

On April 1, 2022 12:09:02 AM PDT, Rui Barradas  wrote:
>Hello,
>
>The following function is probably not very solid, to work with dates 
>and times is not trivial, but it converts those Excel numbers correctly.
>
>I don't know with what numbers it fails.
>
>
>xl_fracday_to_time <- function(x, digits = 0L) {
>   old_opts <- options(digits = 20)
>   on.exit(options(old_opts))
>   if(is.character(x)) {
> x <- as.numeric(x)
>   }
>   y <- x * 24
>   hours <- floor(y + .Machine$double.eps^0.5)
>   y <- (y - hours) * 60
>   mins <- floor(y + .Machine$double.eps^0.5)
>   y <- (y - mins) * 60
>   secs <- round(y, digits = digits)
>   sprintf("%02d:%02d:%02d", hours, mins, secs)
>}
>
>x <- c("0.568750009", "0.577083328")
>xl_fracday_to_time(x)
>#[1] "13:39:00" "13:51:00"
>
>
>Hope this helps,
>
>Rui Barradas
>
>Às 07:12 de 01/04/2022, Patrick Giraudoux escreveu:
>> I have a unexpected behaviour reading times with colon from an Excel
>> file, using the package readxl.
>> 
>> In an Excel sheet, I have a column with times in hours:minutes, e.g:
>> 
>> Arrival_time
>> 13:39
>> 13:51
>> 
>> When read from R with readxl::read_excel, this gives a tibble column
>> with full date by defaut being the last day of 1899. OK. Why not, I know
>> that POSIX variables are starting in 1900 after R doc (however I wonder
>> why here the defaut is one day before January 1, 1900
>> 
>>> tmp$Arrival_time  [1] "1899-12-31 13:39:00 UTC" "1899-12-31 13:51:00 UTC"
>> 
>> Well, this is not exactly what I want to. I do not care about the year
>> and the day... Therefore I decided to import this column as "text"
>> explicitely (in order to manage it within R then). And this is what I
>> get now:
>> 
>>> read_excel("saisie_data_durban_rapaces_LPO.xlsx",sheet=2,col_types="text")  
>>> > tmp$Arrival_time [1] "0.568750009" "0.577083328"
>> 
>> Can someone tell me what happens ?
>> 
>> I would really appreciate to understand the trick...
>> 
>> 
>>  [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readxl, read_excel: how colon (:) is read ?

2022-04-01 Thread Rui Barradas

Hello,

The following function is probably not very solid, to work with dates 
and times is not trivial, but it converts those Excel numbers correctly.


I don't know with what numbers it fails.


xl_fracday_to_time <- function(x, digits = 0L) {
  old_opts <- options(digits = 20)
  on.exit(options(old_opts))
  if(is.character(x)) {
x <- as.numeric(x)
  }
  y <- x * 24
  hours <- floor(y + .Machine$double.eps^0.5)
  y <- (y - hours) * 60
  mins <- floor(y + .Machine$double.eps^0.5)
  y <- (y - mins) * 60
  secs <- round(y, digits = digits)
  sprintf("%02d:%02d:%02d", hours, mins, secs)
}

x <- c("0.568750009", "0.577083328")
xl_fracday_to_time(x)
#[1] "13:39:00" "13:51:00"


Hope this helps,

Rui Barradas

Às 07:12 de 01/04/2022, Patrick Giraudoux escreveu:

I have a unexpected behaviour reading times with colon from an Excel
file, using the package readxl.

In an Excel sheet, I have a column with times in hours:minutes, e.g:

Arrival_time
13:39
13:51

When read from R with readxl::read_excel, this gives a tibble column
with full date by defaut being the last day of 1899. OK. Why not, I know
that POSIX variables are starting in 1900 after R doc (however I wonder
why here the defaut is one day before January 1, 1900


tmp$Arrival_time  [1] "1899-12-31 13:39:00 UTC" "1899-12-31 13:51:00 UTC"


Well, this is not exactly what I want to. I do not care about the year
and the day... Therefore I decided to import this column as "text"
explicitely (in order to manage it within R then). And this is what I
get now:


read_excel("saisie_data_durban_rapaces_LPO.xlsx",sheet=2,col_types="text")  > tmp$Arrival_time [1] 
"0.568750009" "0.577083328"


Can someone tell me what happens ?

I would really appreciate to understand the trick...


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readxl, read_excel: how colon (:) is read ?

2022-04-01 Thread Patrick Giraudoux
This can be made using the TEXT (TEXTE in the French version) function 
of Excel, hence:

TEXT(M2;"HH:MM")

Changes the time into text, and it can be imported from R as wanted.



Le 01/04/2022 à 08:34, Patrick Giraudoux a écrit :
> Absolutely correct ! I checked in Excel and when I change the format 
> to "text", then I get in Excel the same fractional numbers as those 
> obtained importing text from R... Hence the issue comes from Excel 
> itself. Will find a way to change this format to text in Excel without 
> avoiding such conversion...
> Thanks Andrew !
>
> Le 01/04/2022 à 08:26, Andrew Simmons a écrit :
>> Probably (but not entirely sure), Excel is storing your text as a 
>> number of days, so 13:38 is a little more than half a day. Open your 
>> spreadsheet in excel and save those columns as text instead of times, 
>> that (should) fix your issue.
>>
>> On Fri, Apr 1, 2022, 02:12 Patrick Giraudoux 
>>  wrote:
>>
>> I have a unexpected behaviour reading times with colon from an Excel
>> file, using the package readxl.
>>
>> In an Excel sheet, I have a column with times in hours:minutes, e.g:
>>
>> Arrival_time
>> 13:39
>> 13:51
>>
>> When read from R with readxl::read_excel, this gives a tibble column
>> with full date by defaut being the last day of 1899. OK. Why not,
>> I know
>> that POSIX variables are starting in 1900 after R doc (however I
>> wonder
>> why here the defaut is one day before January 1, 1900
>>
>> > tmp$Arrival_time  [1] "1899-12-31 13:39:00 UTC" "1899-12-31
>> 13:51:00 UTC"
>>
>> Well, this is not exactly what I want to. I do not care about the
>> year
>> and the day... Therefore I decided to import this column as "text"
>> explicitely (in order to manage it within R then). And this is
>> what I
>> get now:
>>
>> >
>> 
>> read_excel("saisie_data_durban_rapaces_LPO.xlsx",sheet=2,col_types="text")
>> > tmp$Arrival_time [1] "0.568750009" "0.577083328"
>>
>> Can someone tell me what happens ?
>>
>> I would really appreciate to understand the trick...
>>
>>
>>         [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> 
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readxl, read_excel: how colon (:) is read ?

2022-04-01 Thread Patrick Giraudoux
Le 01/04/2022 à 08:40, Jeff Newmiller a écrit :
> Both R and Excel assume a date is associated with every time object. In 
> Excel, when you show a date it is an integer number of days since 1899-12-31 
> (due to a mistake made early in programming it). Whenever you show a time, it 
> it merely displaying the time portion (fraction of a day) of a date/time. The 
> date part of that value may or may not be 1899-12-31.
>
> With this in mind, you are tilting at windmills hoping to import a "pure 
> time" because no such thing exists in either program. You can choose to 
> render a `POSIXct` as showing only the time portion when you convert it to 
> character if you so choose.

Thanks for the infos. Yes, this is exactly what  I did yesterday with 
POSIXct > POSIXlt to go ahead. However I wanted to understand fully what 
happened, hence the call to the list. Jeff and Andrew, now eveything is 
clear to me thanks to you...


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readxl/lifecycle/rlang

2021-12-02 Thread Ivan Krylov
On Wed, 1 Dec 2021 16:38:37 -0500
Dennis Weygand  wrote:

> Installing package into
> ‘C:/Users/dennisweygand/Documents/R/win-library/3.5’ (as ‘lib’ is
> unspecified)

It could be time to install a newer version of R. CRAN doesn't provide
binary builds of packages for R 3.5...

>   binary source needs_compilation
> rlang  0.4.5 0.4.12  TRUE
> lifecycle  0.2.0  1.0.1 FALSE

...so the latest version of rlang you can currently install is 0.4.5,
but lifecycle wants rlang (≥ 0.4.10). You can either follow the link
https://cran.rstudio.com/bin/windows/Rtools/ and install Rtools to
compile rlang 0.4.12 from source, or install a later version of R (≥
4.0) that CRAN provides Windows binaries for.

You could also try the XLConnect package (it works fine for me on
older installations of R), but it depends on Java. 

>   [[alternative HTML version deleted]]

P.S. Please compose your messages in plain text to make sure that we
see them exactly as you write them.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readxl/lifecycle/rlang

2021-12-01 Thread Jeff Newmiller
This

>package ‘rlang’ successfully unpacked and MD5 sums checked
>Warning in install.packages :
>  cannot remove prior installation of package ‘rlang’

indicates that rlang didn't get re-installed. It is a warning because R seems 
to think it could still be working, but it isn't verifying that status.

Below that,

>ERROR: dependency 'rlang' is not available for package 'lifecycle'

indicates that there is still a problem with rlang.

The "cannot remove prior installation" often means that the installed package 
is open, or you have some other program (File Manager?) holding one of the 
directories open. Be sure to shut down all but one instance of R before 
modifying your package library.

It is also possible that anti-virus software is being overly protective.

And DONT keep bashing on the package that depends on rlang... install the one 
that was at the root of the problem: rlang. Only after that succeeds should you 
go back and try lifecycle. Once that one works, try installing/updating readxl 
again.


On December 1, 2021 1:38:37 PM PST, Dennis Weygand  wrote:
>I have been using read_excel for years... suddenly today I have this
>problem:
>
>> T1 <- read_excel(fname,sheet="T1")
>Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]])
>:
>  there is no package called ‘lifecycle’
>
>So I install lifecycle:
>
>install.packages("lifecycle")
>WARNING: Rtools is required to build R packages but is not currently
>installed. Please download and install the appropriate version of Rtools
>before proceeding:
>
>https://cran.rstudio.com/bin/windows/Rtools/
>Installing package into ‘C:/Users/dennisweygand/Documents/R/win-library/3.5’
>(as ‘lib’ is unspecified)
>also installing the dependency ‘rlang’
>
>
>  There are binary versions available but the source versions are later:
>  binary source needs_compilation
>rlang  0.4.5 0.4.12  TRUE
>lifecycle  0.2.0  1.0.1 FALSE
>
>  Binaries will be installed
>trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.5/rlang_0.4.5.zip
>'
>Content type 'application/zip' length 1115838 bytes (1.1 MB)
>downloaded 1.1 MB
>
>package ‘rlang’ successfully unpacked and MD5 sums checked
>Warning in install.packages :
>  cannot remove prior installation of package ‘rlang’
>
>The downloaded binary packages are in
>C:\Users\dennisweygand\AppData\Local\Temp\RtmpKSler3\downloaded_packages
>installing the source package ‘lifecycle’
>
>trying URL 'https://cran.rstudio.com/src/contrib/lifecycle_1.0.1.tar.gz'
>Content type 'application/x-gzip' length 99716 bytes (97 KB)
>downloaded 97 KB
>
>ERROR: dependency 'rlang' is not available for package 'lifecycle'
>* removing 'C:/Users/dennisweygand/Documents/R/win-library/3.5/lifecycle'
>In R CMD INSTALL
>Warning in install.packages :
>  installation of package ‘lifecycle’ had non-zero exit status
>
>The downloaded source packages are in
>‘C:\Users\dennisweygand\AppData\Local\Temp\RtmpKSler3\downloaded_packages’
>
>
>Did it work?
>Try it again:
>
>> T1 <- read_excel(fname,sheet="T1")
>Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]])
>:
>  there is no package called ‘lifecycle’
>
>I am clueless: what am I doing wrong?
>Can anyone help?
>D.P. Weygand
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] readxl/lifecycle/rlang

2021-12-01 Thread Dennis Weygand
I have been using read_excel for years... suddenly today I have this
problem:

> T1 <- read_excel(fname,sheet="T1")
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]])
:
  there is no package called ‘lifecycle’

So I install lifecycle:

install.packages("lifecycle")
WARNING: Rtools is required to build R packages but is not currently
installed. Please download and install the appropriate version of Rtools
before proceeding:

https://cran.rstudio.com/bin/windows/Rtools/
Installing package into ‘C:/Users/dennisweygand/Documents/R/win-library/3.5’
(as ‘lib’ is unspecified)
also installing the dependency ‘rlang’


  There are binary versions available but the source versions are later:
  binary source needs_compilation
rlang  0.4.5 0.4.12  TRUE
lifecycle  0.2.0  1.0.1 FALSE

  Binaries will be installed
trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.5/rlang_0.4.5.zip
'
Content type 'application/zip' length 1115838 bytes (1.1 MB)
downloaded 1.1 MB

package ‘rlang’ successfully unpacked and MD5 sums checked
Warning in install.packages :
  cannot remove prior installation of package ‘rlang’

The downloaded binary packages are in
C:\Users\dennisweygand\AppData\Local\Temp\RtmpKSler3\downloaded_packages
installing the source package ‘lifecycle’

trying URL 'https://cran.rstudio.com/src/contrib/lifecycle_1.0.1.tar.gz'
Content type 'application/x-gzip' length 99716 bytes (97 KB)
downloaded 97 KB

ERROR: dependency 'rlang' is not available for package 'lifecycle'
* removing 'C:/Users/dennisweygand/Documents/R/win-library/3.5/lifecycle'
In R CMD INSTALL
Warning in install.packages :
  installation of package ‘lifecycle’ had non-zero exit status

The downloaded source packages are in
‘C:\Users\dennisweygand\AppData\Local\Temp\RtmpKSler3\downloaded_packages’


Did it work?
Try it again:

> T1 <- read_excel(fname,sheet="T1")
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]])
:
  there is no package called ‘lifecycle’

I am clueless: what am I doing wrong?
Can anyone help?
D.P. Weygand

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readxl

2021-12-01 Thread Jeff Newmiller
Probably some kind of failure occurred during package updating and one or more 
of the packages readxl depends on was removed but not successfully re-installed.

Run

  install.packages( "lifecycle" )

and then

  update.packages()

and try using your function again. If a similar error occurs, repeat for the 
next missing package.

On December 1, 2021 7:30:39 AM PST, Dennis Weygand  wrote:
>I have been using readxl function for years...
>Suddenly, today, I get an error message:
>
>> T1 <- read_excel(fname,sheet="T1")
>Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]])
>:
>  there is no package called ‘lifecycle’
>
>Does anyone know what has happened?
>D.P. Weygand
>Old Dominion University
>Norfolk, VA
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] readxl

2021-12-01 Thread Dennis Weygand
I have been using readxl function for years...
Suddenly, today, I get an error message:

> T1 <- read_excel(fname,sheet="T1")
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]])
:
  there is no package called ‘lifecycle’

Does anyone know what has happened?
D.P. Weygand
Old Dominion University
Norfolk, VA

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Readxl on Mac

2020-12-01 Thread Marc Schwartz via R-help


> On Dec 1, 2020, at 1:51 PM, Dennis Weygand  wrote:
> 
> From the following lines of code on an iMac running Catalina 10.15.7
> 
> 
> library(readxl)
> fname <-  "/Volumes/SD/LabData/Lorentz/Lorentz.xlsx"
> 
> Helm <- read_excel(fname,sheet="Helmholtz")
> 
> I get the following annoying error:
> Error: Evaluation error: zip file
> 
> To me this makes no sense: the file in question is not a zip file. The
> excel file in question, however, is either open on excel, or recently
> opened in excel.
> Does anyone know about this error, and what to do about it?
> I am running R version 3.5.1
> 
> Thanks for any assistance you can provide.


Hi,

Current generation MS Office files (ending with an 'x' in the file extension) 
are ZIP compressed XML files. Thus, the zip error is relevant.

If the file that you are referencing actually exists, since the error can 
indicate a non-existent file (error in name and/or path), it is possible that 
the file is still open, and read_excel() presumably requires an exclusive read 
on the file, thus cannot open it in that manner.

You might be sure that the file is not in use by Excel or another application, 
precluding an exclusive lock on the file. If so, close out the other 
application.

Another possibility is that the Excel file is password protected, and thus 
cannot be read directly, without saving it in an unprotected fashion.

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Readxl on Mac

2020-12-01 Thread Duncan Murdoch

On 01/12/2020 1:51 p.m., Dennis Weygand wrote:

 From the following lines of code on an iMac running Catalina 10.15.7


library(readxl)
fname <-  "/Volumes/SD/LabData/Lorentz/Lorentz.xlsx"

Helm <- read_excel(fname,sheet="Helmholtz")

I get the following annoying error:
Error: Evaluation error: zip file

To me this makes no sense: the file in question is not a zip file. The
excel file in question, however, is either open on excel, or recently
opened in excel.
Does anyone know about this error, and what to do about it?
I am running R version 3.5.1

Thanks for any assistance you can provide.


The .xlsx file format is actually a zip file containing a number of XML 
files.  I'd guess that Excel locks it when it's open, to stop two 
spreadsheets from opening it at once.


So the error message could be more informative, but it's probably 
accurately reporting the immediate problem.


Duncan Murdoch

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Readxl on Mac

2020-12-01 Thread Dennis Weygand
>From the following lines of code on an iMac running Catalina 10.15.7


library(readxl)
fname <-  "/Volumes/SD/LabData/Lorentz/Lorentz.xlsx"

Helm <- read_excel(fname,sheet="Helmholtz")

I get the following annoying error:
Error: Evaluation error: zip file

To me this makes no sense: the file in question is not a zip file. The
excel file in question, however, is either open on excel, or recently
opened in excel.
Does anyone know about this error, and what to do about it?
I am running R version 3.5.1

Thanks for any assistance you can provide.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readxl question

2020-08-27 Thread Ulrik Stervbo via R-help
I clearly didn't read well enough. As Petr pointed out, there is also 
the col_names argument.


```
# Solution 4a

map_dfr(files, function(cur_file, ranges){
  map_dfc(ranges, function(cur_range, df){
read_excel(cur_file, sheet = 1, col_names = cur_range, range = 
cur_range)

  }, df = df)
}, ranges = ranges, .id = "filename")

```

On 2020-08-27 17:33, Ulrik Stervbo via R-help wrote:

Hi Thomas,

I am not familiar with the use of the range argument, but it seems to
me that the cell value becomes the column name. This might be fine,
but you might get into trouble if you have repeated cell values since
as.data.frame() will fix these.

I am also not sure about what you want, but this seems to capture your
example (reading the same cells in a number of files):

```
library(readxl)

# Create test set
path <- readxl_example("geometry.xls")

read_xls(path) # See the content

example_file1 <- tempfile(fileext = ".xls")
example_file2 <- tempfile(fileext = ".xls")

file.copy(path, example_file1, overwrite = TRUE)
file.copy(path, example_file2, overwrite = TRUE)

# Solve the problem using loops
files <- c(example_file1, example_file2)
ranges <- c("B4", "C5", "D6")

fr <- lapply(ranges, function(cur_range, files){
  x <- lapply(files, read_xls, sheet = 1, range = cur_range)
  t(as.data.frame(x))
}, files = files)

# Loop over fr and save content if needed
```

A couple of variations over the theme, where the cell content is
accessed after reading the file. This will not work well if the data
in the excel files does not start at A1, but if you can adjust for
this it should work just fine

```
# Solution #2

# Read the whole excel file, and access just the column - row
# This will give really unexpected results if the data does not start 
in the
# cell A1 as is the case for geometry.xls. Also, it does not work with 
ranges

# spaning more than a single cell
files <- rep(readxl_example("datasets.xlsx"), 3)
ranges <- c("B4", "C5", "D6")

# Loop over the files to avoid re-reading
fr <- lapply(files, function(cur_file, ranges){
  df <- read_excel(cur_file, sheet = 1)
  x <- lapply(ranges, function(cur_range, df){
cr <- cellranger::as.cell_addr(cur_range, strict = FALSE)
df[cr$row, cr$col][[1]]
  }, df = df)
  as.data.frame(setNames(x, ranges))

}, ranges = ranges)

# Solution 3
# Like solution 2 but using purr

library(purrr)

files <- rep(readxl_example("datasets.xlsx"), 3)
ranges <- c("B4", "C5", "D6")

map_dfr(files, function(cur_file, ranges){
  map_dfc(ranges, function(cur_range, df){
df <- read_excel(cur_file, sheet = 1)
cr <- cellranger::as.cell_addr(cur_range, strict = FALSE)
setNames(df[cr$row, cr$col], cur_range)
  }, df = df)

}, ranges = ranges)

# Solution 4
# Like solution 3, but with the addition of the file name and producing 
a single

# data.frame at the end

library(purrr)

path <- readxl_example("datasets.xls")
example_file1 <- tempfile(fileext = "_1.xls")
example_file2 <- tempfile(fileext = "_2.xls")
example_file3 <- tempfile(fileext = "_3.xls")

file.copy(path, example_file1, overwrite = TRUE)
file.copy(path, example_file2, overwrite = TRUE)
file.copy(path, example_file3, overwrite = TRUE)

files <- c(example_file1, example_file2, example_file3)

# Name the file paths with the file names. We can them make use of the 
.id

# argument to map_dfr()
files <- setNames(files, basename(files))
ranges <- c("B4", "C5", "D6")

map_dfr(files, function(cur_file, ranges){
  map_dfc(ranges, function(cur_range, df){
df <- read_excel(cur_file, sheet = 1)
cr <- cellranger::as.cell_addr(cur_range, strict = FALSE)
setNames(df[cr$row, cr$col], cur_range)
  }, df = df)
}, ranges = ranges, .id = "filename")
```

HTH
Ulrik

On 2020-08-26 15:38, PIKAL Petr wrote:

Hi

As OP has only about 250 files and in read_excel you cannot specify 
several

ranges at once, reading those values separately and concatenating them
together in one step seems to be the most efficient way. One probably 
could
design such function, but time spent on the function performing the 
task

only once is probably bigger than performing 250*3 reads.

I see inefficiency in writing each column into separate text file and
coppying it back to Excel file.

Cheers
Petr


-Original Message-
From: Upton, Stephen (Steve) (CIV) 
Sent: Wednesday, August 26, 2020 2:44 PM
To: PIKAL Petr ; Thomas Subia 


Cc: r-help@r-project.org
Subject: RE: [R] readxl question

From your example, it appears you are reading in the same excel file 
for
each function to get a value. I would look at creating a function 
that
extracts what you need from each file all at once, rather than 
separate

reads.

Stephen C. Upton
SEED (Simulation 

Re: [R] readxl question

2020-08-27 Thread Ulrik Stervbo via R-help

Hi Thomas,

I am not familiar with the use of the range argument, but it seems to me 
that the cell value becomes the column name. This might be fine, but you 
might get into trouble if you have repeated cell values since 
as.data.frame() will fix these.


I am also not sure about what you want, but this seems to capture your 
example (reading the same cells in a number of files):


```
library(readxl)

# Create test set
path <- readxl_example("geometry.xls")

read_xls(path) # See the content

example_file1 <- tempfile(fileext = ".xls")
example_file2 <- tempfile(fileext = ".xls")

file.copy(path, example_file1, overwrite = TRUE)
file.copy(path, example_file2, overwrite = TRUE)

# Solve the problem using loops
files <- c(example_file1, example_file2)
ranges <- c("B4", "C5", "D6")

fr <- lapply(ranges, function(cur_range, files){
  x <- lapply(files, read_xls, sheet = 1, range = cur_range)
  t(as.data.frame(x))
}, files = files)

# Loop over fr and save content if needed
```

A couple of variations over the theme, where the cell content is 
accessed after reading the file. This will not work well if the data in 
the excel files does not start at A1, but if you can adjust for this it 
should work just fine


```
# Solution #2

# Read the whole excel file, and access just the column - row
# This will give really unexpected results if the data does not start in 
the
# cell A1 as is the case for geometry.xls. Also, it does not work with 
ranges

# spaning more than a single cell
files <- rep(readxl_example("datasets.xlsx"), 3)
ranges <- c("B4", "C5", "D6")

# Loop over the files to avoid re-reading
fr <- lapply(files, function(cur_file, ranges){
  df <- read_excel(cur_file, sheet = 1)
  x <- lapply(ranges, function(cur_range, df){
cr <- cellranger::as.cell_addr(cur_range, strict = FALSE)
df[cr$row, cr$col][[1]]
  }, df = df)
  as.data.frame(setNames(x, ranges))

}, ranges = ranges)

# Solution 3
# Like solution 2 but using purr

library(purrr)

files <- rep(readxl_example("datasets.xlsx"), 3)
ranges <- c("B4", "C5", "D6")

map_dfr(files, function(cur_file, ranges){
  map_dfc(ranges, function(cur_range, df){
df <- read_excel(cur_file, sheet = 1)
cr <- cellranger::as.cell_addr(cur_range, strict = FALSE)
setNames(df[cr$row, cr$col], cur_range)
  }, df = df)

}, ranges = ranges)

# Solution 4
# Like solution 3, but with the addition of the file name and producing 
a single

# data.frame at the end

library(purrr)

path <- readxl_example("datasets.xls")
example_file1 <- tempfile(fileext = "_1.xls")
example_file2 <- tempfile(fileext = "_2.xls")
example_file3 <- tempfile(fileext = "_3.xls")

file.copy(path, example_file1, overwrite = TRUE)
file.copy(path, example_file2, overwrite = TRUE)
file.copy(path, example_file3, overwrite = TRUE)

files <- c(example_file1, example_file2, example_file3)

# Name the file paths with the file names. We can them make use of the 
.id

# argument to map_dfr()
files <- setNames(files, basename(files))
ranges <- c("B4", "C5", "D6")

map_dfr(files, function(cur_file, ranges){
  map_dfc(ranges, function(cur_range, df){
df <- read_excel(cur_file, sheet = 1)
cr <- cellranger::as.cell_addr(cur_range, strict = FALSE)
setNames(df[cr$row, cr$col], cur_range)
  }, df = df)
}, ranges = ranges, .id = "filename")
```

HTH
Ulrik

On 2020-08-26 15:38, PIKAL Petr wrote:

Hi

As OP has only about 250 files and in read_excel you cannot specify 
several

ranges at once, reading those values separately and concatenating them
together in one step seems to be the most efficient way. One probably 
could
design such function, but time spent on the function performing the 
task

only once is probably bigger than performing 250*3 reads.

I see inefficiency in writing each column into separate text file and
coppying it back to Excel file.

Cheers
Petr


-Original Message-
From: Upton, Stephen (Steve) (CIV) 
Sent: Wednesday, August 26, 2020 2:44 PM
To: PIKAL Petr ; Thomas Subia 


Cc: r-help@r-project.org
Subject: RE: [R] readxl question

From your example, it appears you are reading in the same excel file 
for

each function to get a value. I would look at creating a function that
extracts what you need from each file all at once, rather than 
separate

reads.

Stephen C. Upton
SEED (Simulation Experiments & Efficient Designs) Center for Data 
Farming

SEED Center website: https://harvest.nps.edu

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of PIKAL 
Petr

Sent: Wednesday, August 26, 2020 3:50 AM
To: Thomas Subia 
Cc: r-help@r-project.org
Subject: Re: [R] readxl question

NPS WARNING: *external sender* verify before acting.


Hi


A

Re: [R] readxl question

2020-08-26 Thread PIKAL Petr
Hi

As OP has only about 250 files and in read_excel you cannot specify several
ranges at once, reading those values separately and concatenating them
together in one step seems to be the most efficient way. One probably could
design such function, but time spent on the function performing the task
only once is probably bigger than performing 250*3 reads.

I see inefficiency in writing each column into separate text file and
coppying it back to Excel file.

Cheers
Petr

> -Original Message-
> From: Upton, Stephen (Steve) (CIV) 
> Sent: Wednesday, August 26, 2020 2:44 PM
> To: PIKAL Petr ; Thomas Subia 
> Cc: r-help@r-project.org
> Subject: RE: [R] readxl question
> 
> From your example, it appears you are reading in the same excel file for
> each function to get a value. I would look at creating a function that
> extracts what you need from each file all at once, rather than separate
> reads.
> 
> Stephen C. Upton
> SEED (Simulation Experiments & Efficient Designs) Center for Data Farming
> SEED Center website: https://harvest.nps.edu
> 
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of PIKAL Petr
> Sent: Wednesday, August 26, 2020 3:50 AM
> To: Thomas Subia 
> Cc: r-help@r-project.org
> Subject: Re: [R] readxl question
> 
> NPS WARNING: *external sender* verify before acting.
> 
> 
> Hi
> 
> 
> Are you sure that your command read values from respective cells?
> 
> I tried it and got empty data frame with names
> > WO <- lapply(files, read_excel, sheet=1, range=("B3"))
> > as.data.frame(WO)
> [1] ano TP303   X96
> [4] X0  X3.7518 X26.7
> <0 rows> (or 0-length row.names)
> 
> To get data, col_names argument should be set to FALSE WO <- lapply(files,
> read_excel, sheet=1, range=("B3"), col_names=FALSE)
> WO2 <- lapply(files, read_excel, sheet=1, range=("B5"), col_names=FALSE)
> 
> After that unlist and one rbind together with t should be enough to give
you
> one table WO <- unlist(WO)
> WO2 <- unlist(WO2)
> result <- t(rbind(WO, WO2))
> result
>  WO  WO2
> ...1 "ano"   "ano"
> ...1 "TP303" "261119/2"
> ...1 "96""288"
> ...1 "0" "192"
> ...1 "3.752" "25.92094"
> ...1 "26.7"  "38.6"
> >
> 
> And instead txt document you could do
> 
> write.table(result, "result.xls", sep = "\t", row.names = F)
> 
> And now "result.xls" is directly readable with Excel
> 
> Cheers
> Petr
> 
> >
> > -Original Message-
> > From: R-help  On Behalf Of Thomas Subia
> > via R-help
> > Sent: Saturday, August 22, 2020 6:25 AM
> > To: r-help@r-project.org
> > Subject: [R] readxl question
> >
> > Colleagues,
> >
> >
> >
> > I have 250 Excel files in a directory. Each of those files has the
> > same
> layout.
> > The problem is that the data in each Excel data is not in rectangular
> form. I've
> > been using readxl to extract the data which I need.
> > Each of my metrics are stored in a particular cell. For each metric, I
> create text
> > files which stores my metrics.
> >
> >
> >
> > library(plyr)
> >
> > library(readxl)
> >
> >
> >
> > files <- list.files(pattern="*.xls", full.names = FALSE)
> >
> >
> >
> > # Extract Work Order
> >
> > WO <- lapply(files, read_excel, sheet="Sheet1", range=("B9")) WO_list
> > <-
> > as.data.frame(WO) trans_WO <- t(WO_list) write.table(trans_WO
> > ,"WO.txt")
> >
> >
> >
> > # Extract bubble 14_1
> >
> > BUBBLE_14_1 <- lapply(files, read_excel, sheet="Sheet1",
> > range=("c46")) BUBBLE_14_1_list <- as.data.frame(BUBBLE_14_1)
> >
> > trans_BUBBLE_14_1 <- t(BUBBLE_14_1_list)
> >
> >
> >
> > write.table(trans_BUBBLE_14_1,"BUBBLE_14_1.txt")
> >
> >
> >
> >
> >
> > # Extract bubble 14_2
> >
> > BUBBLE_14_2 <- lapply(files, read_excel, sheet="Sheet1",
> > range=("c62")) BUBBLE_14_2_list <- as.data.frame(BUBBLE_14_2)
> >
> > trans_BUBBLE_14_2 <- t(BUBBLE_14_2_list)
> >
> > write.table(trans_BUBBLE_14_2,"BUBBLE_14_2.txt")
> >
> >
> >
> > After the text files have been created, I cut and paste the contents
> > of
> each
> > text

Re: [R] readxl question

2020-08-26 Thread Upton, Stephen (Steve) (CIV)
>From your example, it appears you are reading in the same excel file for
each function to get a value. I would look at creating a function that
extracts what you need from each file all at once, rather than separate
reads.

Stephen C. Upton
SEED (Simulation Experiments & Efficient Designs) Center for Data Farming
SEED Center website: https://harvest.nps.edu

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of PIKAL Petr
Sent: Wednesday, August 26, 2020 3:50 AM
To: Thomas Subia 
Cc: r-help@r-project.org
Subject: Re: [R] readxl question

NPS WARNING: *external sender* verify before acting.


Hi


Are you sure that your command read values from respective cells?

I tried it and got empty data frame with names
> WO <- lapply(files, read_excel, sheet=1, range=("B3"))
> as.data.frame(WO)
[1] ano TP303   X96
[4] X0  X3.7518 X26.7
<0 rows> (or 0-length row.names)

To get data, col_names argument should be set to FALSE WO <- lapply(files,
read_excel, sheet=1, range=("B3"), col_names=FALSE)
WO2 <- lapply(files, read_excel, sheet=1, range=("B5"), col_names=FALSE)

After that unlist and one rbind together with t should be enough to give you
one table WO <- unlist(WO)
WO2 <- unlist(WO2)
result <- t(rbind(WO, WO2))
result
 WO  WO2
...1 "ano"   "ano"
...1 "TP303" "261119/2"
...1 "96""288"
...1 "0" "192"
...1 "3.752" "25.92094"
...1 "26.7"  "38.6"
>

And instead txt document you could do

write.table(result, "result.xls", sep = "\t", row.names = F)

And now "result.xls" is directly readable with Excel

Cheers
Petr

>
> -Original Message-
> From: R-help  On Behalf Of Thomas Subia 
> via R-help
> Sent: Saturday, August 22, 2020 6:25 AM
> To: r-help@r-project.org
> Subject: [R] readxl question
>
> Colleagues,
>
>
>
> I have 250 Excel files in a directory. Each of those files has the 
> same
layout.
> The problem is that the data in each Excel data is not in rectangular
form. I've
> been using readxl to extract the data which I need.
> Each of my metrics are stored in a particular cell. For each metric, I
create text
> files which stores my metrics.
>
>
>
> library(plyr)
>
> library(readxl)
>
>
>
> files <- list.files(pattern="*.xls", full.names = FALSE)
>
>
>
> # Extract Work Order
>
> WO <- lapply(files, read_excel, sheet="Sheet1", range=("B9")) WO_list 
> <-
> as.data.frame(WO) trans_WO <- t(WO_list) write.table(trans_WO 
> ,"WO.txt")
>
>
>
> # Extract bubble 14_1
>
> BUBBLE_14_1 <- lapply(files, read_excel, sheet="Sheet1", 
> range=("c46")) BUBBLE_14_1_list <- as.data.frame(BUBBLE_14_1)
>
> trans_BUBBLE_14_1 <- t(BUBBLE_14_1_list)
>
>
>
> write.table(trans_BUBBLE_14_1,"BUBBLE_14_1.txt")
>
>
>
>
>
> # Extract bubble 14_2
>
> BUBBLE_14_2 <- lapply(files, read_excel, sheet="Sheet1", 
> range=("c62")) BUBBLE_14_2_list <- as.data.frame(BUBBLE_14_2)
>
> trans_BUBBLE_14_2 <- t(BUBBLE_14_2_list)
>
> write.table(trans_BUBBLE_14_2,"BUBBLE_14_2.txt")
>
>
>
> After the text files have been created, I cut and paste the contents 
> of
each
> text file to Excel.
>
> This has worked fine if the number of cells I am extracting from a 
> file is
small.
>
> If the number gets larger, this method is inefficient.
>
>
>
> Any advice on how to do this would be appreciated.
>
>
>
> All the best,
>
>
>
> Thomas Subia
>
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting- 
> guide.html and provide commented, minimal, self-contained, 
> reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readxl question

2020-08-26 Thread PIKAL Petr
Hi


Are you sure that your command read values from respective cells?

I tried it and got empty data frame with names 
> WO <- lapply(files, read_excel, sheet=1, range=("B3")) 
> as.data.frame(WO)
[1] ano TP303   X96
[4] X0  X3.7518 X26.7  
<0 rows> (or 0-length row.names)

To get data, col_names argument should be set to FALSE
WO <- lapply(files, read_excel, sheet=1, range=("B3"), col_names=FALSE)
WO2 <- lapply(files, read_excel, sheet=1, range=("B5"), col_names=FALSE)

After that unlist and one rbind together with t should be enough to give you
one table
WO <- unlist(WO)
WO2 <- unlist(WO2)
result <- t(rbind(WO, WO2))
result
 WO  WO2   
...1 "ano"   "ano" 
...1 "TP303" "261119/2"
...1 "96""288" 
...1 "0" "192" 
...1 "3.752" "25.92094"
...1 "26.7"  "38.6"
>

And instead txt document you could do

write.table(result, "result.xls", sep = "\t", row.names = F)

And now "result.xls" is directly readable with Excel

Cheers
Petr

>
> -Original Message-
> From: R-help  On Behalf Of Thomas Subia via
> R-help
> Sent: Saturday, August 22, 2020 6:25 AM
> To: r-help@r-project.org
> Subject: [R] readxl question
> 
> Colleagues,
> 
> 
> 
> I have 250 Excel files in a directory. Each of those files has the same
layout.
> The problem is that the data in each Excel data is not in rectangular
form. I've
> been using readxl to extract the data which I need.
> Each of my metrics are stored in a particular cell. For each metric, I
create text
> files which stores my metrics.
> 
> 
> 
> library(plyr)
> 
> library(readxl)
> 
> 
> 
> files <- list.files(pattern="*.xls", full.names = FALSE)
> 
> 
> 
> # Extract Work Order
> 
> WO <- lapply(files, read_excel, sheet="Sheet1", range=("B9")) WO_list <-
> as.data.frame(WO) trans_WO <- t(WO_list) write.table(trans_WO ,"WO.txt")
> 
> 
> 
> # Extract bubble 14_1
> 
> BUBBLE_14_1 <- lapply(files, read_excel, sheet="Sheet1", range=("c46"))
> BUBBLE_14_1_list <- as.data.frame(BUBBLE_14_1)
> 
> trans_BUBBLE_14_1 <- t(BUBBLE_14_1_list)
> 
> 
> 
> write.table(trans_BUBBLE_14_1,"BUBBLE_14_1.txt")
> 
> 
> 
> 
> 
> # Extract bubble 14_2
> 
> BUBBLE_14_2 <- lapply(files, read_excel, sheet="Sheet1", range=("c62"))
> BUBBLE_14_2_list <- as.data.frame(BUBBLE_14_2)
> 
> trans_BUBBLE_14_2 <- t(BUBBLE_14_2_list)
> 
> write.table(trans_BUBBLE_14_2,"BUBBLE_14_2.txt")
> 
> 
> 
> After the text files have been created, I cut and paste the contents of
each
> text file to Excel.
> 
> This has worked fine if the number of cells I am extracting from a file is
small.
> 
> If the number gets larger, this method is inefficient.
> 
> 
> 
> Any advice on how to do this would be appreciated.
> 
> 
> 
> All the best,
> 
> 
> 
> Thomas Subia
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] readxl question

2020-08-21 Thread Thomas Subia via R-help
Colleagues,

 

I have 250 Excel files in a directory. Each of those files has the same
layout. The problem is that the data in each Excel data is not in
rectangular form. I've been using readxl to extract the data which I need.
Each of my metrics are stored in a particular cell. For each metric, I
create text files which stores my metrics.

 

library(plyr)

library(readxl)

 

files <- list.files(pattern="*.xls", full.names = FALSE)

 

# Extract Work Order

WO <- lapply(files, read_excel, sheet="Sheet1", range=("B9")) WO_list <-
as.data.frame(WO)

trans_WO <- t(WO_list)

 

write.table(trans_WO ,"WO.txt")

 

# Extract bubble 14_1

BUBBLE_14_1 <- lapply(files, read_excel, sheet="Sheet1", range=("c46"))
BUBBLE_14_1_list <- as.data.frame(BUBBLE_14_1)

trans_BUBBLE_14_1 <- t(BUBBLE_14_1_list)

 

write.table(trans_BUBBLE_14_1,"BUBBLE_14_1.txt")

 

 

# Extract bubble 14_2

BUBBLE_14_2 <- lapply(files, read_excel, sheet="Sheet1", range=("c62"))
BUBBLE_14_2_list <- as.data.frame(BUBBLE_14_2)

trans_BUBBLE_14_2 <- t(BUBBLE_14_2_list)

write.table(trans_BUBBLE_14_2,"BUBBLE_14_2.txt")

 

After the text files have been created, I cut and paste each column of data
to Excel.

This has worked fine if the number of cells I am extracting from a file is
small.

If the number gets larger, this method is inefficient.

 

Any advice on how to do this would be appreciated.

 

All the best,

 

Thomas Subia


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] readxl question

2020-08-21 Thread Thomas Subia via R-help
Colleagues,

 

I have 250 Excel files in a directory. Each of those files has the same
layout. The problem is that the data in each Excel data is not in
rectangular form. I've been using readxl to extract the data which I need.
Each of my metrics are stored in a particular cell. For each metric, I
create text files which stores my metrics.

 

library(plyr)

library(readxl)

 

files <- list.files(pattern="*.xls", full.names = FALSE)

 

# Extract Work Order

WO <- lapply(files, read_excel, sheet="Sheet1", range=("B9")) WO_list <-
as.data.frame(WO) trans_WO <- t(WO_list) write.table(trans_WO ,"WO.txt")

 

# Extract bubble 14_1

BUBBLE_14_1 <- lapply(files, read_excel, sheet="Sheet1", range=("c46"))
BUBBLE_14_1_list <- as.data.frame(BUBBLE_14_1)

trans_BUBBLE_14_1 <- t(BUBBLE_14_1_list)

 

write.table(trans_BUBBLE_14_1,"BUBBLE_14_1.txt")

 

 

# Extract bubble 14_2

BUBBLE_14_2 <- lapply(files, read_excel, sheet="Sheet1", range=("c62"))
BUBBLE_14_2_list <- as.data.frame(BUBBLE_14_2)

trans_BUBBLE_14_2 <- t(BUBBLE_14_2_list)

write.table(trans_BUBBLE_14_2,"BUBBLE_14_2.txt")

 

After the text files have been created, I cut and paste the contents of each
text file to Excel.

This has worked fine if the number of cells I am extracting from a file is
small.

If the number gets larger, this method is inefficient.

 

Any advice on how to do this would be appreciated.

 

All the best,

 

Thomas Subia


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readxl issue

2020-02-05 Thread Abby Spurdle
> I'm using readxl and dplyr to extract a specific cell from all worksheets in 
> a directory.
> All of these worksheets have the same physical layout.

I don't have access to Excel, so can't test this.

I have a suspicion that the problem is not readxl.
But rather the "as.data.frame" step.

Try reading one file at a time.
And inspect (1) your immediate object after reading the Excel file,
and then (2) the data.frame after the as.data.frame call.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readxl issue

2020-02-05 Thread Thomas Subia
Jeff,

You wrote: " Pay attention to whether the read_csv call is configured to expect 
first line as header."

Here is the code I'm using to extract one cell from a series of Excel files 
having the same physical format.

library(plyr)
library(readxl)
files <- list.files(pattern="*.xls", full.names = TRUE)
# Extract part average from cell c6 for all Excel files
avgs <- lapply(files, read_excel, sheet="Flow Data", range=("c9"))
# Write data to text file
write.table(avgs ,"avgs.txt",sep="\t")

I'm not sure where read_csv applies here.

Thanks for your help!

Thomas Subia

-Original Message-
From: Jeff Newmiller  
Sent: Wednesday, February 05, 2020 11:42 AM
To: r-help@r-project.org; Thomas Subia ; 
r-help@r-project.org
Subject: Re: [R] readxl issue

Pay attention to whether the read_csv call is configured to expect first line 
as header.

On February 5, 2020 11:09:01 AM PST, Thomas Subia  
wrote:
>Colleagues,
>
>I'm using readxl and dplyr to extract a specific cell from all 
>worksheets in a directory.
>All of these worksheets have the same physical layout.
>
>Issue 1: Minus sign replaced by an X after data extraction.
>
>library(plyr)
>library(readxl)
>
>files <- list.files(pattern="*.xls", full.names = FALSE) avgs <- 
>lapply(files, read_excel, sheet="Flow Data", range=("c9")) avg_list <- 
>as.data.frame(avgs) trans_avgs <- t(avg_list)
>write.table(trans_avgs,"avgs.txt")
>
>Here are the first lines of the avgs.txt file.
>"X.0.51571428571428557" 
>"X.0.5349795918367346" 
>"X.0.4895714285714286" 
>"X.0.5112448979591836"
>
>The original Excel file contains
>-0.516
>-0.535
>-0.490
>-0.511
>
>It appears that readxl is changing the - sign to an X.
>Is there any feature in readxl which I can change so that readxl 
>extracts the minus sign?
>
>Issue 2: Duplicate dates contain additional characters
>
>dates <- lapply(files, read_excel, sheet="Flow Data", range=("h14")) 
>dates_list <- as.data.frame(dates) trans_dates <- t(dates_list) 
>write.table(trans_dates ,"dates.txt")
>
>Here are the first lines of the dates.txt file.
>
>"X43859" 
>"X43859.1" 
>"X43859.2" 
>"X43859.3" 
>"X43833"
>
>In Excel, this is what is recorded.
>
>1/29/2020
>1/29/2020
>1/29/2020
>1/29/2020
>1/3/2020
>
>It appears that readxl is adding additional characters which are 
>signaling duplicate dates.
>Is there any feature in readxl which can I can change to eliminate 
>these additional characters?
>
>Some advice would be appreciated.
>
>
>Thomas Subia
>
>Statistician / Senior Quality Engineer
>IMG Companies
>225 Mountain Vista Parkway
>Livermore, CA 94551
>T. (925) 273-1106
>F. (925) 273-
>E. tsu...@imgprecision.com
>
>
>Precision Manufacturing for Emerging Technologies imgprecision.com
>
>The contents of this message, together with any attachments, are 
>intended only for the use of the individual or entity to which they are 
>addressed and may contain information that is legally privileged, 
>confidential and exempt from disclosure. If you are not the intended 
>recipient, you are hereby notified that any dissemination, 
>distribution, or copying of this message, or any attachment, is 
>strictly prohibited. If you have received this message in error, please 
>notify the original sender or IMG Companies, LLC at Tel: 925-273-1100 
>immediately by telephone or by return E-mail and delete this message, 
>along with any attachments, from your computer. Thank you.
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

--
Sent from my phone. Please excuse my brevity.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readxl issue

2020-02-05 Thread Jeff Newmiller
Pay attention to whether the read_csv call is configured to expect first line 
as header.

On February 5, 2020 11:09:01 AM PST, Thomas Subia  
wrote:
>Colleagues,
>
>I'm using readxl and dplyr to extract a specific cell from all
>worksheets in a directory.
>All of these worksheets have the same physical layout.
>
>Issue 1: Minus sign replaced by an X after data extraction.
>
>library(plyr)
>library(readxl)
>
>files <- list.files(pattern="*.xls", full.names = FALSE)
>avgs <- lapply(files, read_excel, sheet="Flow Data", range=("c9"))
>avg_list <- as.data.frame(avgs)
>trans_avgs <- t(avg_list)
>write.table(trans_avgs,"avgs.txt")
>
>Here are the first lines of the avgs.txt file.
>"X.0.51571428571428557" 
>"X.0.5349795918367346" 
>"X.0.4895714285714286" 
>"X.0.5112448979591836"
>
>The original Excel file contains
>-0.516
>-0.535
>-0.490
>-0.511
>
>It appears that readxl is changing the - sign to an X.
>Is there any feature in readxl which I can change so that readxl
>extracts the minus sign?
>
>Issue 2: Duplicate dates contain additional characters
>
>dates <- lapply(files, read_excel, sheet="Flow Data", range=("h14"))
>dates_list <- as.data.frame(dates)
>trans_dates <- t(dates_list)
>write.table(trans_dates ,"dates.txt")
>
>Here are the first lines of the dates.txt file.
>
>"X43859" 
>"X43859.1" 
>"X43859.2" 
>"X43859.3" 
>"X43833"
>
>In Excel, this is what is recorded.
>
>1/29/2020
>1/29/2020
>1/29/2020
>1/29/2020
>1/3/2020
>
>It appears that readxl is adding additional characters which are
>signaling duplicate dates.
>Is there any feature in readxl which can I can change to eliminate
>these additional characters?
>
>Some advice would be appreciated.
>
>
>Thomas Subia 
>
>Statistician / Senior Quality Engineer
>IMG Companies 
>225 Mountain Vista Parkway
>Livermore, CA 94551
>T. (925) 273-1106
>F. (925) 273-
>E. tsu...@imgprecision.com
>
>
>Precision Manufacturing for Emerging Technologies
>imgprecision.com 
>
>The contents of this message, together with any attachments, are
>intended only for the use of the individual or entity to which they are
>addressed and may contain information that is legally privileged,
>confidential and exempt from disclosure. If you are not the intended
>recipient, you are hereby notified that any dissemination,
>distribution, or copying of this message, or any attachment, is
>strictly prohibited. If you have received this message in error, please
>notify the original sender or IMG Companies, LLC at Tel: 925-273-1100
>immediately by telephone or by return E-mail and delete this message,
>along with any attachments, from your computer. Thank you.
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] readxl issue

2020-02-05 Thread Thomas Subia
Colleagues,

I'm using readxl and dplyr to extract a specific cell from all worksheets in a 
directory.
All of these worksheets have the same physical layout.

Issue 1: Minus sign replaced by an X after data extraction.

library(plyr)
library(readxl)

files <- list.files(pattern="*.xls", full.names = FALSE)
avgs <- lapply(files, read_excel, sheet="Flow Data", range=("c9"))
avg_list <- as.data.frame(avgs)
trans_avgs <- t(avg_list)
write.table(trans_avgs,"avgs.txt")

Here are the first lines of the avgs.txt file.
"X.0.51571428571428557" 
"X.0.5349795918367346" 
"X.0.4895714285714286" 
"X.0.5112448979591836"

The original Excel file contains
-0.516
-0.535
-0.490
-0.511

It appears that readxl is changing the - sign to an X.
Is there any feature in readxl which I can change so that readxl extracts the 
minus sign?

Issue 2: Duplicate dates contain additional characters

dates <- lapply(files, read_excel, sheet="Flow Data", range=("h14"))
dates_list <- as.data.frame(dates)
trans_dates <- t(dates_list)
write.table(trans_dates ,"dates.txt")

Here are the first lines of the dates.txt file.

"X43859" 
"X43859.1" 
"X43859.2" 
"X43859.3" 
"X43833"

In Excel, this is what is recorded.

1/29/2020
1/29/2020
1/29/2020
1/29/2020
1/3/2020

It appears that readxl is adding additional characters which are signaling 
duplicate dates.
Is there any feature in readxl which can I can change to eliminate these 
additional characters?

Some advice would be appreciated.


Thomas Subia 

Statistician / Senior Quality Engineer
IMG Companies 
225 Mountain Vista Parkway
Livermore, CA 94551
T. (925) 273-1106
F. (925) 273-
E. tsu...@imgprecision.com


Precision Manufacturing for Emerging Technologies
imgprecision.com 

The contents of this message, together with any attachments, are intended only 
for the use of the individual or entity to which they are addressed and may 
contain information that is legally privileged, confidential and exempt from 
disclosure. If you are not the intended recipient, you are hereby notified that 
any dissemination, distribution, or copying of this message, or any attachment, 
is strictly prohibited. If you have received this message in error, please 
notify the original sender or IMG Companies, LLC at Tel: 925-273-1100 
immediately by telephone or by return E-mail and delete this message, along 
with any attachments, from your computer. Thank you.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Readxl Question

2019-12-20 Thread Jim Lemon
Hi Thomas,
Perhaps this is what you are seeking:

my_read_excel<-function(filename) {
 serials<-read_excel(filename,sheet="Flow Data",range=("c6"))
 flow.data<-read_excel(filename,sheet="Flow Data",range=("c22:c70"))
 dates<-read_excel(filename,sheet="Flow Data",range=("h14"))
 return(data.frame(Serial=rep(serials,49),Date=rep(dates,49),
  Flow=flow.data),stringsAsFactors=FALSE)
}
lapply(files,my_read_excel)

Should return a list of data frames in the format you want. If you
want just one big data frame, rbind the results. You should also be
able to convert the dates from character to date type without striking
the factor problem.

Jim

On Sat, Dec 21, 2019 at 4:16 AM Thomas Subia  wrote:
>
> Colleagues,
>
> The objective is to create a text file having this format:
>
> Serial  DateFlow
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Readxl Question

2019-12-20 Thread Thomas Subia
Colleagues,

I am using readxl to extract a serial number and its associated data using the 
following code.

library(readxl)
files <- list.files(pattern="*.xls", full.names = FALSE)
serials <- lapply(files, read_excel, sheet="Flow Data", range=("c6"))
flow.datum <- lapply(files, read_excel, sheet="Flow Data", range=("c22:c70"))
dates <- lapply(files, read_excel, sheet="Flow Data", range=("h14"))

Here each serial and date is associated with 49 data points in flow.datum.

Now I write the serials, flow data and dates into a text file using:

datesdf <- as.data.frame(dates)
mydates <- list(datesdf)
write.table(mydates,"dates.txt",sep="\t")

serialdf <- as.data.frame(serials)
myserials <- list(serialdf)
write.table(myserials,"serials.txt",sep="\t")

flowdf <-as.data.frame(flow.datum)
myflow <- list(flowdf)
write.table(myflow,"myflow.txt",sep=",")

The problem with the dates.txt and the serials.txt is that they need to 
associated with its 49 corresponding values in myflow.txt.

The objective is to create a text file having this format:

Serial  DateFlow

I'm not sure how to do this. Any suggestions would be appreciated.

Thomas Subia 
Statistician / Senior Quality Engineer
ASQ CQE

IMG Companies 
225 Mountain Vista Parkway
Livermore, CA 94551
T. (925) 273-1106
F. (925) 273-
E. tsu...@imgprecision.com


Precision Manufacturing for Emerging Technologies
imgprecision.com 

The contents of this message, together with any attachments, are intended only 
for the use of the individual or entity to which they are addressed and may 
contain information that is legally privileged, confidential and exempt from 
disclosure. If you are not the intended recipient, you are hereby notified that 
any dissemination, distribution, or copying of this message, or any attachment, 
is strictly prohibited. If you have received this message in error, please 
notify the original sender or IMG Companies, LLC at Tel: 925-273-1100 
immediately by telephone or by return E-mail and delete this message, along 
with any attachments, from your computer. Thank you.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readxl question

2019-12-05 Thread Ivan Krylov
On Thu, 5 Dec 2019 15:39:56 +
Thomas Subia  wrote:

> date <- lapply(files, read_excel, sheet="Sheet1", range=("B5"))
> date_df <- as.data.frame(date)
> trans_date <-t(date_df)
> mydates <- list(trans_date)

This feels a bit excessive for what looks like a one-dimensional string
vector. Why is it needed? Can you get better results with sapply or
vapply (which return vectors, not lists)?

In particular, as.data.frame might be responsible for the name
mangling. Also, your data seems to end up inside the row names. Try
using str() on every step of the transformation to check if that is the
case.

Also check out the .name_repair argument of the read_excel function,
but I think that as.data.frame is part of the problem.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] readxl question

2019-12-05 Thread Thomas Subia
Colleagues,

I'm trying to extract a cell from all Excel files in a directory.

library(readxl)
files <- list.files(pattern="*.xls", full.names = FALSE)

date <- lapply(files, read_excel, sheet="Sheet1", range=("B5"))

date_df <- as.data.frame(date)
trans_date <-t(date_df)
mydates <- list(trans_date)
write.table(mydates,"mydates.txt",sep="\t")

Looking at mydates.txt shows:

""
"Saturday..June.09..2018"   
"Saturday..June.09..2018.1" 
"Saturday..June.09..2018.2"

But the original Excel contents are:

Saturday, June 09, 2018
Saturday, June 09, 2018
Saturday, June 09, 2018

I get a similar problem with my serial numbers

serial <-lapply(files, read_excel, sheet="Sheet1", range=("B9"))

serial_df <- as.data.frame(serial)
trans_serial <-t(serial_df)
myserials <- list(trans_serial)
write.table(myserials,"myserials.txt",sep="\t")

R Output

""
"X96739.0027.1" 
"X96739.0041.1" 
"X96739.0044.1" 

Original Excel Content
96739-0027/1
96739-0041/1
96739-0044/1

How can I amend my script so that the output matches the original Excel content?

Thomas Subia 
Statistician / Senior Quality Engineer
IMG Precision

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] readxl::excel_sheets in tryCatch() doesn't catch error

2019-02-06 Thread Phillip-Jan van Zyl via R-help
Hi R programmers

I am reading multiple .xls and .xlsx files from a directory using readxl from 
tidyverse. When reading fails, the code should continue on to the next file.

However, when I call the custom function readExcelSheets (in a loop and with 
the tryCatch function) I get an error for some files and the code then stops 
executing. How can I force my code to continue on to the next files?

Here is the function:

readExcelSheets <- function(curPath) {
  out <- tryCatch(
{
  message("This is the 'try' part")
  dat <- excel_sheets(curPath)
},
error=function(cond) {
  message(paste("Error in opening Excel file with readxl read sheets:", 
curPath))
  message("Here's the original error message:")
  message(cond)
},
warning=function(cond) {
  message(paste("readxl caused a warning en reading sheets:", curPath))
  message("Here's the original warning message:")
  message(cond)
},
finally={
  message(paste("Processed file for sheets:", curPath))
  message("End of processing file for sheets.")
}
  )
  return(out)
}

The loop looks like this:

listLength <- length(excelList)
for (excel_file in excelList) {
  curPath <- excel_file
  sheetNames <- NULL
  sheetNames <- withTimeout({readExcelSheets(curPath)}, timeout = 5, 
onTimeout="silent")
  if(is.null(sheetNames)){next}
  for (sheetName in sheetNames){
# do something
  }
}

The problem is that I get an error:

Error: Evaluation error: zip file '' cannot be opened.

And then execution of the loop stops without progressing to the next Excel 
file. Note that for the first n=+-20 files the code works as expected. I think 
that there may be an error in the full path name (such as a text encoding 
error), but my point is that it should exit silently and progress to the next 
Excel file even if the path is not found.

Best regards
Phillip
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.