Re: [R] Is there a hash data structure for R

2021-11-03 Thread Abby Spurdle
Here's an interesting article:
Collections in R: Review and Proposal
Timothy Barry
The R Journal
doi: 10.32614/RJ-2018-037
https://journal.r-project.org/archive/2018/RJ-2018-037/RJ-2018-037.pdf
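
For a quick base-R answer alongside the article: an environment behaves much like a Python dict, since its keys are unique and lookups are hashed. A minimal sketch, in which the variable names (`h`, `has_x`, `keys`) and the stored values are only illustrative:

```r
# An environment used as a hash map: keys are unique, unlike list names
h <- new.env(hash = TRUE)

h[["x"]] <- 1:5          # insert
h[["y"]] <- month.name
h[["x"]] <- 3:7          # same key again: the old value is replaced

has_x <- exists("x", envir = h, inherits = FALSE)  # membership test
keys  <- sort(ls(h))                               # all keys, sorted
rm("y", envir = h)                                 # delete a key
```

Unlike a list, an environment has reference semantics, so a function that modifies it also changes what the caller sees.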

On Tue, Nov 2, 2021 at 10:48 PM Yonghua Peng  wrote:
>
> I know this is a newbie question. But how do I implement the hash structure
> which is available in other languages (in python it's dict)?
>
> I know there is the list, but list's names can be duplicated here.
>
> > x <- list(x=1:5,y=month.name,x=3:7)
> > x
> $x
> [1] 1 2 3 4 5
>
> $y
>  [1] "January"   "February"  "March"     "April"     "May"       "June"
>  [7] "July"      "August"    "September" "October"   "November"  "December"
>
> $x
> [1] 3 4 5 6 7
>
> Thanks a lot.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] sink() not working as expected [RESOLVED]

2021-11-03 Thread Rich Shepard

On Wed, 3 Nov 2021, Rui Barradas wrote:


You do not assign the pipe output, so put the print statement as the last
instruction of the pipe. The following works.

# file: rhelp.R
library(dplyr)

mtcars %>%
 select(mpg, cyl, disp, hp, am) %>%
 mutate(
   sampdt = c("automatic", "manual")[am + 1L]
 ) %>%
 print()


Rui,

Thank you very much. It works here, too.

Much appreciated,

Rich



Re: [R] Fwd: Merging multiple csv files to new file

2021-11-03 Thread Jim Lemon
Hi Gabrielle,
I get the feeling that you are trying to merge data in which each file
contains different variables, but the same subjects have contributed
the data. This is a very wild guess, but it may provide some insight.

# assume that subjects are identified by a variable named "subjectID"
# create a vector of all your filenames
my_filenames<-c("dailyActivity_merged.csv",
 "dailyCalories_merged.csv", "dailyIntensities_merged.csv"
 # , ... (add the remaining filenames here)
 )
# step through the filenames, reading each one and merging it into the
# final data frame
for(filename in my_filenames) {
 # exists() needs the name as a string; start from a clean workspace
 if(!exists("my_df")) my_df<-read.csv(filename)
 else {
  next_df<-read.csv(filename)
  # all=TRUE keeps subjects that are missing from one of the files
  my_df<-merge(my_df,next_df,by="subjectID",all=TRUE)
 }
}

I doubt that this will work first time, but it will be a lot easier to
debug than throwing it all into a black box and seeing what comes out.
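
The same idea can be tried without any files first. The sketch below assumes, as above, a shared key column called "subjectID"; the three small data frames and their values are made up for illustration and stand in for the results of read.csv():

```r
# Toy stand-ins for three of the CSV files (hypothetical data)
activity <- data.frame(subjectID = c(1, 2, 3), steps = c(1000, 2000, 3000))
calories <- data.frame(subjectID = c(1, 2), calories = c(1800, 2100))
sleep    <- data.frame(subjectID = c(2, 3, 4), hours = c(7, 8, 6))

# Fold merge() over the list; all = TRUE keeps subjects missing from a file
merged <- Reduce(
  function(a, b) merge(a, b, by = "subjectID", all = TRUE),
  list(activity, calories, sleep)
)
print(merged)
```

Subjects that do not appear in a given file get NA in that file's columns, which makes gaps easy to spot.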

Jim

On Thu, Nov 4, 2021 at 2:36 AM gabrielle aban steinberg
 wrote:
>
> Hello, I would like to merge 18 csv files into a master data csv file, but
> each file has a different number of columns (mostly found in one or more of
> the other csv files) and different number of rows.
>
> I have tried something like the following in R Studio (cloud):
>
> all_data_fit_files <- rbind("dailyActivity_merged.csv",
> "dailyCalories_merged.csv", "dailyIntensities_merged.csv",
> "dailySteps_merged.csv", "heartrate_seconds_merged.csv",
> "hourlyCalories_merged.csv", "hourlyIntensities_merged.csv",
> "hourlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv",
> "minuteCaloriesWide_merged.csv", "minuteIntensitiesNarrow_merged.csv",
> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv",
> "minuteSleep_merged.csv", "minuteStepsNarrow_merged.csv",
> “minuteStepsWide_merged.csv", "sleepDay_merged.csv",
> "minuteStepsWide_merged.csv", "sleepDay_merged.csv",
> "weightLogInfo_merged.csv")
>
>
>
> But I am getting the following error:
>
> Error: unexpected input in "rlySteps_merged.csv",
> "minuteCaloriesNarrow_merged.csv", "minuteCaloriesWide_merged.csv",
> "minuteIntensitiesNarrow_merged.csv",
> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv"
>
>
> (Maybe the R Studio free trial/usage is underpowered for my project?)
>



Re: [R] names.data.frame?

2021-11-03 Thread Leonard Mada via R-help
Thank you very much.


Indeed, NextMethod() is the correct way and is working fine.


There are some alternatives (as pointed out), although I am still trying
to figure out the best design strategy for such a class.


Note:

- I wanted to exclude the "coeff" column from the returned names;
names.pm = function(p) {
     nms = NextMethod();
     # excludes the Coefficients:
     id = match("coeff", nms);
     return(nms[ - id]);
}
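
A slightly more defensive variant of the method above, as a sketch: setdiff() is swapped in for match() so that nothing odd happens if no "coeff" column is present. The names `visible_names` and `all_names` are only illustrative.

```r
# Multi-variable polynomial stored as a data frame with class "pm"
p <- data.frame(x = 1:3, coeff = 1)
class(p) <- c("pm", class(p))

names.pm <- function(x) {
  nms <- NextMethod()      # falls through to the default names()
  setdiff(nms, "coeff")    # drop the coefficient column, if any
}

visible_names <- names(p)           # dispatches to names.pm
all_names     <- names(unclass(p))  # bypasses the method
```

Since many data-frame functions call names() internally, hiding a column this way may confuse them, which is one argument for a separately named accessor instead.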


This is why I also hesitate regarding what to use: dimnames or names.


Sincerely,


Leonard


On 11/3/2021 8:54 PM, Andrew Simmons wrote:
> First, your signature for names.pm is wrong. It should look something
> more like:
>
>
> names.pm  <- function (x)
> {
> }
>
>
> As for the body of the function, you might do something like:
>
>
> names.pm  <- function (x)
> {
>     NextMethod()
> }
>
>
> but you don't need to define a names method if you're just going to 
> call the next method. I would suggest not defining a names method at all.
>
>
> As a side note, I would suggest making your class through the methods 
> package, with methods::setClass("pm", ...)
> See the documentation for setClass for more details, it's the 
> recommended way to define classes in R.
>
> On Wed, Nov 3, 2021 at 2:36 PM Leonard Mada via R-help 
>  wrote:
>
> Dear List members,
>
>
> Is there a way to access the default names() function?
>
>
> I tried the following:
>
> # Multi-variable polynomial
>
> p = data.frame(x=1:3, coeff=1)
>
> class(p) = c("pm", class(p));
>
>
> names.pm  = function(p) {
> # .Primitive("names")(p) # does NOT function
> # .Internal("names")(p) # does NOT function
> # nms = names.default(p) # does NOT exist
> # nms = names.data.frame(p) # does NOT exist
> # nms = names(p); # obvious infinite recursion;
> nms = names(unclass(p));
> }
>
>
> Alternatively:
>
> Would it be better to use dimnames.pm instead of names.pm?
>
> I am not fully aware of the advantages and disadvantages of dimnames vs
> names.
>
>
> Sincerely,
>
>
> Leonard
>



Re: [R] Fwd: Merging multiple csv files to new file

2021-11-03 Thread Duncan Murdoch

On 02/11/2021 6:30 p.m., gabrielle aban steinberg wrote:

Hello, I would like to merge 18 csv files into a master data csv file, but
each file has a different number of columns (mostly found in one or more of
the other csv files) and different number of rows.

I have tried something like the following in R Studio (cloud):

all_data_fit_files <- rbind("dailyActivity_merged.csv",
"dailyCalories_merged.csv", "dailyIntensities_merged.csv",
"dailySteps_merged.csv", "heartrate_seconds_merged.csv",
"hourlyCalories_merged.csv", "hourlyIntensities_merged.csv",
"hourlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv",
"minuteCaloriesWide_merged.csv", "minuteIntensitiesNarrow_merged.csv",
"minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv",
"minuteSleep_merged.csv", "minuteStepsNarrow_merged.csv",
“minuteStepsWide_merged.csv", "sleepDay_merged.csv",
"minuteStepsWide_merged.csv", "sleepDay_merged.csv",
"weightLogInfo_merged.csv")



That just puts the names together.  You need to read each file, figure 
out how the resulting dataframes get merged (they have different 
columns, so that will take some thinking), and then do it and write it out.
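
Those steps can be sketched roughly as follows. The two small files are written to a temporary directory purely so the example is self-contained; their names, the "Id" key and the output name "master.csv" are assumptions, not the real data. Incidentally, the quoted rbind() call contains a typographic opening quote (“) before one file name, and R's parser rejects such characters in code, which is a likely source of the "unexpected input" error.

```r
# Create two toy CSV files (hypothetical stand-ins for the real ones)
tmp <- tempfile("csvdemo")
dir.create(tmp)
write.csv(data.frame(Id = 1:2, Steps = c(10, 20)),
          file.path(tmp, "a.csv"), row.names = FALSE)
write.csv(data.frame(Id = 2:3, Calories = c(5, 6)),
          file.path(tmp, "b.csv"), row.names = FALSE)

# 1. Read every file into a list of data frames
files  <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
frames <- lapply(files, read.csv)
names(frames) <- basename(files)

# 2. Inspect which columns the files have in common
shared <- Reduce(intersect, lapply(frames, names))

# 3. Merge on the shared column(s) and write the result out
merged   <- merge(frames[[1]], frames[[2]], by = shared, all = TRUE)
out_file <- file.path(tmp, "master.csv")
write.csv(merged, out_file, row.names = FALSE)
```

Step 2 is where the thinking happens: if the files share no sensible key, merging them into one table may not be meaningful at all.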


Duncan Murdoch



Re: [R] sink() not working as expected

2021-11-03 Thread Rui Barradas

Hello,

You do not assign the pipe output, so put the print statement as the 
last instruction of the pipe. The following works.


# file: rhelp.R
library(dplyr)

mtcars %>%
  select(mpg, cyl, disp, hp, am) %>%
  mutate(
    sampdt = c("automatic", "manual")[am + 1L]
  ) %>%
  print()


Then, I've just tested it,


source("rhelp.R")
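
The difference between auto-printing and an explicit print() can be seen without dplyr. This is a small base-R sketch; the throwaway script and the names `script`, `out` and `out_eval` are illustrative only:

```r
# Write a tiny script: one bare expression, one explicit print()
script <- tempfile(fileext = ".R")
writeLines(c(
  "x <- head(mtcars[, c('mpg', 'cyl')], 3)",
  "x                 # auto-printed at the console, silent under source()",
  "print(nrow(x))    # always printed"
), script)

# Default source(): only the explicit print() produces output
out <- capture.output(source(script))

# print.eval = TRUE restores console-like printing of bare expressions
out_eval <- capture.output(source(script, print.eval = TRUE))
```

Here `out` holds just the line from print(), while `out_eval` also contains the three-row data frame.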


Hope this helps,

Rui Barradas

At 19:21 on 03/11/21, Rich Shepard wrote:

On Wed, 3 Nov 2021, Ivan Krylov wrote:


instead. When you source() a script, auto-printing is not performed. This
is explained in the first paragraph of ?source, but not ?sink. If you want
to source() scripts and rely on their output (including sink()), you'll
need to print() results explicitly.


Ivan,

I've read ?source and still do not understand where to put either auto-print
or an explicit print statement. For example,
cor_disc %>%
     select(site_nbr,year, mon, day, hr, min, tz, cfs) %>%
     mutate(
     sampdt = make_datetime(year, mon, day, hr, min)
     )
print(cor_disc)

throws an error. I also get an error when the print statement follows the
sampdt assignment.

I need to understand how to modify a tibble in a sourced file (here, by
combining the date and time columns into a datetime column) and save the
tibble with the new appended column.

Rich





Re: [R] names.data.frame?

2021-11-03 Thread Duncan Murdoch

On 03/11/2021 2:54 p.m., Andrew Simmons wrote:

First, your signature for names.pm is wrong. It should look something more
like:


names.pm <- function (x)
{
}


As for the body of the function, you might do something like:


names.pm <- function (x)
{
 NextMethod()
}


but you don't need to define a names method if you're just going to call
the next method. I would suggest not defining a names method at all.


As a side note, I would suggest making your class through the methods
package, with methods::setClass("pm", ...)
See the documentation for setClass for more details, it's the recommended
way to define classes in R.


That's incorrect.  It is *a* recommended way to define classes in R, but 
there are other recommended ways as well, for doing other kinds of 
things, and many people stick with the S3 system without formal classes 
at all.


If you're writing a Bioconductor package you should probably use the 
formal methods.  If you're writing code for other purposes, you should 
think about whether you need formal classes at all, and if so, whether 
the methods package formalism is a match for what you're doing.
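
To make the contrast concrete, here is a rough sketch of both styles. The constructor `new_pm`, the generic `n_terms`, the S4 class name "PM4" and its slot `terms` are all hypothetical names chosen for illustration:

```r
# S3: no formal definition; the class is just an attribute
new_pm <- function(df) {
  stopifnot(is.data.frame(df), "coeff" %in% names(df))
  structure(df, class = c("pm", class(df)))
}
n_terms    <- function(p, ...) UseMethod("n_terms")
n_terms.pm <- function(p, ...) nrow(p)

p3 <- new_pm(data.frame(x = 1:3, coeff = 1))

# S4 via the methods package: slots are declared and type-checked
methods::setClass("PM4", representation(terms = "data.frame"))
p4 <- methods::new("PM4", terms = data.frame(x = 1:3, coeff = 1))
```

The S3 object is still a data frame and works with existing data-frame code; the S4 object wraps one in a slot and is validated on creation.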


Duncan Murdoch



Re: [R] names.data.frame?

2021-11-03 Thread Rui Barradas

Hello,

I too would have expected NextMethod to work, but the following seems to
be simpler.

"names" is an attribute, so it should be accessible with attr.



names.pm = function(p) {
  attr(p, "names")
}

p = data.frame(x=1:3, coeff=1)
class(p) = c("pm", class(p));

names(p)
#[1] "x"     "coeff"



Hope this helps,

Rui Barradas

At 18:35 on 03/11/21, Leonard Mada via R-help wrote:

Dear List members,


Is there a way to access the default names() function?


I tried the following:

# Multi-variable polynomial

p = data.frame(x=1:3, coeff=1)

class(p) = c("pm", class(p));


names.pm = function(p) {
# .Primitive("names")(p) # does NOT function
# .Internal("names")(p) # does NOT function
# nms = names.default(p) # does NOT exist
# nms = names.data.frame(p) # does NOT exist
# nms = names(p); # obvious infinite recursion;
nms = names(unclass(p));
}


Alternatively:

Would it be better to use dimnames.pm instead of names.pm?

I am not fully aware of the advantages and disadvantages of dimnames vs 
names.



Sincerely,


Leonard





Re: [R] sink() not working as expected

2021-11-03 Thread Rich Shepard

On Wed, 3 Nov 2021, Ivan Krylov wrote:


instead. When you source() a script, auto-printing is not performed. This
is explained in the first paragraph of ?source, but not ?sink. If you want
to source() scripts and rely on their output (including sink()), you'll
need to print() results explicitly.


Ivan,

I've read ?source and still do not understand where to put either auto-print
or an explicit print statement. For example,
cor_disc %>%
select(site_nbr,year, mon, day, hr, min, tz, cfs) %>%
mutate(
sampdt = make_datetime(year, mon, day, hr, min)
)
print(cor_disc)

throws an error. I also get an error when the print statement follows the
sampdt assignment.

I need to understand how to modify a tibble in a sourced file (here, by
combining the date and time columns into a datetime column) and save the
tibble with the new appended column.
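
One likely missing piece is the assignment: a pipeline that is not assigned leaves the original object unchanged, so a later print() shows the old columns. A base-R sketch of the pattern, using transform() and ISOdatetime() as stand-ins for mutate() and make_datetime(), with made-up values:

```r
# Toy data with the same column names as in the example (values invented)
cor_disc <- data.frame(year = 2021L, mon = 11L, day = 3L,
                       hr = c(9, 10), min = c(15, 45))

# Assign the result back, otherwise the new column is discarded
cor_disc <- transform(
  cor_disc,
  sampdt = ISOdatetime(year, mon, day, hr, min, 0, tz = "UTC")
)

# Explicit print(), because source() does not auto-print
print(cor_disc)
```

The same shape applies to a dplyr pipeline: put `cor_disc <-` in front of it, then print the result.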

Rich



Re: [R] What to do when problems() returns nothing

2021-11-03 Thread Rich Shepard

On Wed, 3 Nov 2021, Bert Gunter wrote:


More to the point, the tidyverse galaxy tries to largely replace R's
standard functionality and has its own help forum. So I think you should
post there, rather than here, for questions about it:
https://www.tidyverse.org/help/


Bert,

Thank you very much. I am trying to learn tidyverse and had no idea it had
its own help.

I will post tidyverse questions there.

Regards,

Rich



Re: [R] What to do when problems() returns nothing

2021-11-03 Thread Bert Gunter
None of this is for R's *standard* packages. The posting guide, linked
below, says:

"For questions about functions in standard packages distributed with R (see
the FAQ Add-on packages in R), ask questions on R-help.
If the question relates to a *contributed package*, e.g., one downloaded
from CRAN, try contacting the package maintainer first. You can also use
find("functionname") and packageDescription("packagename") to find this
information. *Only* send such questions to R-help or R-devel if you get no
reply or need further assistance. This applies to both requests for help
and to bug reports."

More to the point, the tidyverse galaxy tries to largely replace R's
standard functionality and has its own help forum. So I think you should
post there, rather than here, for questions about it:
https://www.tidyverse.org/help/



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)


On Wed, Nov 3, 2021 at 11:50 AM Rich Shepard 
wrote:

> When I source the import_data.R script now I get errors that tell me to look
> at problems(). I enter that function name but there's no return.
>
> Reading ?problems I learned that stop_for_problems(x) should stop the
> process when a problem occurs, so I added that function to each data file;
> for example,
>
> library(tidyverse)
> library(lubridate)
>
> cor_disc <- read_csv("../data/cor-disc.csv", col_names = TRUE,
>   col_types = list (
>   site_nbr = col_character(),
>   year = col_integer(),
>   mon = col_integer(),
>   day = col_integer(),
>   hr = col_double(),
>   min = col_double(),
>   cfs = col_integer())
>   )
> stop_for_problems(cor_disc)
>
> running the command, source('import_data.R') produces this result:
> > source('import_data.R')
> > Error: 415903 parsing failures
> > In addition: Warning message:
> > One or more parsing issues, see `problems()` for details
>
> When I run the problems() function nothing is returned:
> > problems()
> >
>
> What do I read to learn how to identify the problems so I can fix them?
>
> TIA,
>
> Rich
>



Re: [R] names.data.frame?

2021-11-03 Thread Ivan Krylov
On Wed, 3 Nov 2021 20:35:58 +0200
Leonard Mada via R-help  wrote:

> class(p) = c("pm", class(p));

Does NextMethod() work for you?

-- 
Best regards,
Ivan



Re: [R] names.data.frame?

2021-11-03 Thread Andrew Simmons
First, your signature for names.pm is wrong. It should look something more
like:


names.pm <- function (x)
{
}


As for the body of the function, you might do something like:


names.pm <- function (x)
{
NextMethod()
}


but you don't need to define a names method if you're just going to call
the next method. I would suggest not defining a names method at all.


As a side note, I would suggest making your class through the methods
package, with methods::setClass("pm", ...)
See the documentation for setClass for more details, it's the recommended
way to define classes in R.

On Wed, Nov 3, 2021 at 2:36 PM Leonard Mada via R-help 
wrote:

> Dear List members,
>
>
> Is there a way to access the default names() function?
>
>
> I tried the following:
>
> # Multi-variable polynomial
>
> p = data.frame(x=1:3, coeff=1)
>
> class(p) = c("pm", class(p));
>
>
> names.pm = function(p) {
> # .Primitive("names")(p) # does NOT function
> # .Internal("names")(p) # does NOT function
> # nms = names.default(p) # does NOT exist
> # nms = names.data.frame(p) # does NOT exist
> # nms = names(p); # obvious infinite recursion;
> nms = names(unclass(p));
> }
>
>
> Alternatively:
>
> Would it be better to use dimnames.pm instead of names.pm?
>
> I am not fully aware of the advantages and disadvantages of dimnames vs
> names.
>
>
> Sincerely,
>
>
> Leonard
>



[R] What to do when problems() returns nothing

2021-11-03 Thread Rich Shepard

When I source the import_data.R script now I get errors that tell me to look
at problems(). I enter that function name but there's no return.

Reading ?problems I learned that stop_for_problems(x) should stop the
process when a problem occurs, so I added that function to each data file;
for example,

library(tidyverse)
library(lubridate)

cor_disc <- read_csv("../data/cor-disc.csv", col_names = TRUE,
 col_types = list (
 site_nbr = col_character(),
 year = col_integer(),
 mon = col_integer(),
 day = col_integer(),
 hr = col_double(),
 min = col_double(),
 cfs = col_integer())
 )
stop_for_problems(cor_disc)

running the command, source('import_data.R') produces this result:

source('import_data.R')
Error: 415903 parsing failures
In addition: Warning message:
One or more parsing issues, see `problems()` for details


When I run the problems() function nothing is returned:

problems()



What do I read to learn how to identify the problems so I can fix them?

TIA,

Rich
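
A possible starting point: problems() called with no argument looks at the last value at the console, so passing the data frame explicitly, as in problems(cor_disc), may show the failures. Independently of readr, the offending values can be located in base R by reading the column as text and testing it. The sketch below uses invented inline data; the column `cfs` is taken from the example, everything else is an assumption:

```r
# Invented sample with two values that are not valid integers
csv_text <- c("site_nbr,year,cfs",
              "14171600,2021,350",
              "14171600,2021,Ice",
              "14171600,2021,12.5")

# Read everything as character so nothing is coerced or dropped
raw <- read.csv(text = csv_text, colClasses = "character")

# Rows whose cfs value is not a plain (optionally negative) integer
bad_rows <- which(!grepl("^-?[0-9]+$", raw$cfs))
print(raw[bad_rows, ])
```

Looking at a handful of such rows is usually enough to decide whether the column type or the data needs to change.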



[R] names.data.frame?

2021-11-03 Thread Leonard Mada via R-help

Dear List members,


Is there a way to access the default names() function?


I tried the following:

# Multi-variable polynomial

p = data.frame(x=1:3, coeff=1)

class(p) = c("pm", class(p));


names.pm = function(p) {
# .Primitive("names")(p) # does NOT function
# .Internal("names")(p) # does NOT function
# nms = names.default(p) # does NOT exist
# nms = names.data.frame(p) # does NOT exist
# nms = names(p); # obvious infinite recursion;
nms = names(unclass(p));
}


Alternatively:

Would it be better to use dimnames.pm instead of names.pm?

I am not fully aware of the advantages and disadvantages of dimnames vs 
names.



Sincerely,


Leonard



Re: [R] Fwd: Merging multiple csv files to new file

2021-11-03 Thread Avi Gross via R-help
I am not clear why Python came up on this forum. Yes, you can do all sorts of 
stuff in Python (or other languages) in ways similar or not to doing them in R.

The topic here was reading in data from multiple CSV files and I saw no mention 
about whether some columns are supposed to be of type character or other types.

As noted, if the (CSV) file is properly formatted and whatever function you use 
to read them in does not guess right, you can use versions of functions that 
let you specify what type to expect OR change it after you read it in.

One poster seems to be confused and perhaps thinks that what is being read is 
some other kind of Excel data file. There are other functions that can read 
from those, but I believe a CSV has no real issues unless it is formatted wrong 
or in ways that require you to ask for comments to be ignored or to skip a few 
lines and so on.

Nonetheless, Gabrielle needs to spell out a bit more about her project, as all 
we can suggest for now is that she read in each file sequentially (or in a 
loop) into multiple R variables. Beyond that, it is not clear what she wants to 
do in combining them, and I am not so sure an rbind() makes much sense.

So what she needs perhaps is to look at functions like read.csv() and 
write.csv() and consider what transformations to make in the data read in and 
then how to recombine them.

Of course, if I have completely misunderstood what she wants, never mind!

-----Original Message-----
From: R-help  On Behalf Of Jeff Newmiller
Sent: Wednesday, November 3, 2021 1:22 PM
To: r-help@r-project.org; Robert Knight ; gabrielle 
aban steinberg 
Cc: r-help 
Subject: Re: [R] Fwd: Merging multiple csv files to new file

Data type in a CSV is always character until inferred otherwise... it is 
neither necessary nor even easier to manipulate files with Python if you are 
planning to manipulate the data further with R. Just use the 
colClasses="character" argument for read.csv.

On November 3, 2021 9:47:03 AM PDT, Robert Knight  
wrote:
>It might be easier to settle on the desired final csv layout and use 
>Python to copy the rows via line reads.  Python doesn't care about the 
>data type in a given "cell", numeric or char, whereas the type errors R 
>would encounter would make the task very difficult.
>
>On Wed, Nov 3, 2021, 10:36 AM gabrielle aban steinberg < 
>gabrielleabansteinb...@gmail.com> wrote:
>
>> Hello, I would like to merge 18 csv files into a master data csv 
>> file, but each file has a different number of columns (mostly found 
>> in one or more of the other csv files) and different number of rows.
>>
>> I have tried something like the following in R Studio (cloud):
>>
>> all_data_fit_files <- rbind("dailyActivity_merged.csv", 
>> "dailyCalories_merged.csv", "dailyIntensities_merged.csv", 
>> "dailySteps_merged.csv", "heartrate_seconds_merged.csv", 
>> "hourlyCalories_merged.csv", "hourlyIntensities_merged.csv", 
>> "hourlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv",
>> "minuteCaloriesWide_merged.csv", 
>> "minuteIntensitiesNarrow_merged.csv",
>> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv", 
>> "minuteSleep_merged.csv", "minuteStepsNarrow_merged.csv", 
>> “minuteStepsWide_merged.csv", "sleepDay_merged.csv", 
>> "minuteStepsWide_merged.csv", "sleepDay_merged.csv",
>> "weightLogInfo_merged.csv")
>>
>>
>>
>> But I am getting the following error:
>>
>> Error: unexpected input in "rlySteps_merged.csv", 
>> "minuteCaloriesNarrow_merged.csv", "minuteCaloriesWide_merged.csv", 
>> "minuteIntensitiesNarrow_merged.csv",
>> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv"
>>
>>
>> (Maybe the R Studio free trial/usage is underpowered for my project?)
>>
>

--
Sent from my phone. Please excuse my brevity.



Re: [R] Fwd: Merging multiple csv files to new file

2021-11-03 Thread Jeff Newmiller
>(Maybe the R Studio free trial/usage is underpowered for my project?)

- R is a computer language, as well as a program for interpreting R source code.
- RStudio Desktop is an editor with "features" intended to make using R easy. 
It cannot "do" anything without R being installed.
- R is completely free. There is no "trial" period for using R. There are no 
"crippled" versions of R.
- RStudio Desktop has both free and paid versions, but they have very nearly 
identical capabilities. The most significant difference is that you get tech 
support with the paid version. [1]

So no, your difficulty lies not with what you downloaded but with how you are 
expressing your desires with the R language (with or without RStudio), and 
others have suggested ways you could correct that.

[1] https://www.rstudio.com/products/rstudio/

On November 2, 2021 3:30:46 PM PDT, gabrielle aban steinberg 
 wrote:
>Hello, I would like to merge 18 csv files into a master data csv file, but
>each file has a different number of columns (mostly found in one or more of
>the other csv files) and different number of rows.
>
>I have tried something like the following in R Studio (cloud):
>
>all_data_fit_files <- rbind("dailyActivity_merged.csv",
>"dailyCalories_merged.csv", "dailyIntensities_merged.csv",
>"dailySteps_merged.csv", "heartrate_seconds_merged.csv",
>"hourlyCalories_merged.csv", "hourlyIntensities_merged.csv",
>"hourlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv",
>"minuteCaloriesWide_merged.csv", "minuteIntensitiesNarrow_merged.csv",
>"minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv",
>"minuteSleep_merged.csv", "minuteStepsNarrow_merged.csv",
>“minuteStepsWide_merged.csv", "sleepDay_merged.csv",
>"minuteStepsWide_merged.csv", "sleepDay_merged.csv",
>"weightLogInfo_merged.csv")
>
>
>
>But I am getting the following error:
>
>Error: unexpected input in "rlySteps_merged.csv",
>"minuteCaloriesNarrow_merged.csv", "minuteCaloriesWide_merged.csv",
>"minuteIntensitiesNarrow_merged.csv",
>"minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv"
>
>
>(Maybe the R Studio free trial/usage is underpowered for my project?)
>

-- 
Sent from my phone. Please excuse my brevity.



Re: [R] Fwd: Merging multiple csv files to new file

2021-11-03 Thread Jeff Newmiller
Data type in a CSV is always character until inferred otherwise... it is 
neither necessary nor even easier to manipulate files with Python if you are 
planning to manipulate the data further with R. Just use the 
colClasses="character" argument for read.csv.
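
As a small illustration of that argument (the inline data and column names are invented), every column arrives as character and can be converted afterwards, one column at a time:

```r
# Invented sample; note the leading zeros in Id
csv_text <- c("Id,Value,Note",
              "001,3.5,ok",
              "002,NA,late")

# Read every column as character: no type guessing, no type errors
df <- read.csv(text = csv_text, colClasses = "character")
col_classes <- sapply(df, class)

# Convert only the columns that should be numeric
df$Value <- as.numeric(df$Value)
```

Reading as character also preserves details such as leading zeros in identifiers, which type guessing would silently drop.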

On November 3, 2021 9:47:03 AM PDT, Robert Knight  
wrote:
>It might be easier to settle on the desired final csv layout and use Python
>to copy the rows via line reads.  Python doesn't care about the data type
>in a given "cell", numeric or char, whereas the type errors R would
>encounter would make the task very difficult.
>
>On Wed, Nov 3, 2021, 10:36 AM gabrielle aban steinberg <
>gabrielleabansteinb...@gmail.com> wrote:
>
>> Hello, I would like to merge 18 csv files into a master data csv file, but
>> each file has a different number of columns (mostly found in one or more of
>> the other csv files) and different number of rows.
>>
>> I have tried something like the following in R Studio (cloud):
>>
>> all_data_fit_files <- rbind("dailyActivity_merged.csv",
>> "dailyCalories_merged.csv", "dailyIntensities_merged.csv",
>> "dailySteps_merged.csv", "heartrate_seconds_merged.csv",
>> "hourlyCalories_merged.csv", "hourlyIntensities_merged.csv",
>> "hourlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv",
>> "minuteCaloriesWide_merged.csv", "minuteIntensitiesNarrow_merged.csv",
>> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv",
>> "minuteSleep_merged.csv", "minuteStepsNarrow_merged.csv",
>> “minuteStepsWide_merged.csv", "sleepDay_merged.csv",
>> "minuteStepsWide_merged.csv", "sleepDay_merged.csv",
>> "weightLogInfo_merged.csv")
>>
>>
>>
>> But I am getting the following error:
>>
>> Error: unexpected input in "rlySteps_merged.csv",
>> "minuteCaloriesNarrow_merged.csv", "minuteCaloriesWide_merged.csv",
>> "minuteIntensitiesNarrow_merged.csv",
>> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv"
>>
>>
>> (Maybe the R Studio free trial/usage is underpowered for my project?)
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.



Re: [R] Fwd: Merging multiple csv files to new file

2021-11-03 Thread Bill Dunlap
The error message arises because you are sometimes delimiting character
strings using non-ASCII open and close double quotes, '“' and '”', instead
of the old-fashioned ones, '"', which have no open or close variants.  This
is a language syntax error, so R didn't try to compute anything.

The others' comments are still valid - you need to read the files named by
these strings to produce R datasets and combine the datasets.
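
A small sketch of the point about quotes, assuming a UTF-8 session. The strings good and bad and the helper try_parse are made up for this illustration; the curly quote is written as the escape \u201c so the example does not depend on how this message is encoded.

```r
# Two versions of the same statement: one with plain ASCII quotes, one where
# the first quote is a typographic opening quote (U+201C).
good <- 'files <- c("a.csv", "b.csv")'
bad  <- 'files <- c(\u201ca.csv", "b.csv")'

# Hypothetical helper: report whether R can parse the text at all.
try_parse <- function(code) {
  tryCatch({
    parse(text = code)
    "parsed OK"
  }, error = function(e) conditionMessage(e))
}

good_result <- try_parse(good)   # parses
bad_result  <- try_parse(bad)    # a syntax error; nothing is evaluated

# Replacing the typographic quotes with plain ones repairs the statement.
fixed <- gsub("\u201c|\u201d", "\"", bad)
fixed_result <- try_parse(fixed)
```

Because the failure happens while parsing, it says nothing about whether rbind() is the right function; that is a separate question.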

-Bill


On Wed, Nov 3, 2021 at 8:36 AM gabrielle aban steinberg <
gabrielleabansteinb...@gmail.com> wrote:

> Hello, I would like to merge 18 csv files into a master data csv file, but
> each file has a different number of columns (mostly found in one or more of
> the other csv files) and different number of rows.
>
> I have tried something like the following in R Studio (cloud):
>
> all_data_fit_files <- rbind("dailyActivity_merged.csv",
> "dailyCalories_merged.csv", "dailyIntensities_merged.csv",
> "dailySteps_merged.csv", "heartrate_seconds_merged.csv",
> "hourlyCalories_merged.csv", "hourlyIntensities_merged.csv",
> "hourlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv",
> "minuteCaloriesWide_merged.csv", "minuteIntensitiesNarrow_merged.csv",
> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv",
> "minuteSleep_merged.csv", "minuteStepsNarrow_merged.csv",
> “minuteStepsWide_merged.csv", "sleepDay_merged.csv",
> "minuteStepsWide_merged.csv", "sleepDay_merged.csv",
> "weightLogInfo_merged.csv")
>
>
>
> But I am getting the following error:
>
> Error: unexpected input in "rlySteps_merged.csv",
> "minuteCaloriesNarrow_merged.csv", "minuteCaloriesWide_merged.csv",
> "minuteIntensitiesNarrow_merged.csv",
> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv"
>
>
> (Maybe the R Studio free trial/usage is underpowered for my project?)
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Fwd: Merging multiple csv files to new file

2021-11-03 Thread Robert Knight
It might be easier to settle on the desired final csv layout and use Python
to copy the rows via line reads.  Python doesn't care about the data type
in a given "cell", numeric or char, whereas the type errors R would
encounter would make the task very difficult.

On Wed, Nov 3, 2021, 10:36 AM gabrielle aban steinberg <
gabrielleabansteinb...@gmail.com> wrote:

> Hello, I would like to merge 18 csv files into a master data csv file, but
> each file has a different number of columns (mostly found in one or more of
> the other csv files) and different number of rows.
>
> I have tried something like the following in R Studio (cloud):
>
> all_data_fit_files <- rbind("dailyActivity_merged.csv",
> "dailyCalories_merged.csv", "dailyIntensities_merged.csv",
> "dailySteps_merged.csv", "heartrate_seconds_merged.csv",
> "hourlyCalories_merged.csv", "hourlyIntensities_merged.csv",
> "hourlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv",
> "minuteCaloriesWide_merged.csv", "minuteIntensitiesNarrow_merged.csv",
> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv",
> "minuteSleep_merged.csv", "minuteStepsNarrow_merged.csv",
> “minuteStepsWide_merged.csv", "sleepDay_merged.csv",
> "minuteStepsWide_merged.csv", "sleepDay_merged.csv",
> "weightLogInfo_merged.csv")
>
>
>
> But I am getting the following error:
>
> Error: unexpected input in "rlySteps_merged.csv",
> "minuteCaloriesNarrow_merged.csv", "minuteCaloriesWide_merged.csv",
> "minuteIntensitiesNarrow_merged.csv",
> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv"
>
>
> (Maybe the R Studio free trial/usage is underpowered for my project?)
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Fwd: Merging multiple csv files to new file

2021-11-03 Thread Bert Gunter
I should have added that once read into R, the collection of data frames
(presumably) can also be saved in one .Rdata file via save() **without**
first combining them into a list. I still prefer keeping them together as
one list in R, but that's up to you.
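
A minimal sketch of saving several data frames in one file without first putting them in a list. The data frames daily and sleep and the temp file path are invented for illustration.

```r
# Two hypothetical data frames standing in for files already read into R.
daily <- data.frame(Id = 1:2, Steps = c(100L, 200L))
sleep <- data.frame(Id = 1:2, MinutesAsleep = c(400L, 380L))

# save() accepts any number of objects and writes them to one file.
rdata_path <- tempfile(fileext = ".RData")
save(daily, sleep, file = rdata_path)

# load() restores them under their original names; loading into a fresh
# environment avoids overwriting objects in the current workspace.
restored <- new.env()
loaded_names <- load(rdata_path, envir = restored)
```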

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Nov 3, 2021 at 9:12 AM Bert Gunter  wrote:

> 1. Think more carefully about the appropriate data structure for what you
> wish to do. It's unlikely to be .csv files, however.
>
> In the absence of the above, a simple (but perhaps inappropriate) default
> is:
>
> 2. Read the files into R and combine into a list.(You will need to read
> about lists in R if you don't know what these are, of course).
>
> 3. Save your list as an .Rdata file. See ?save and ?load for details. But
> do note that such files are special binary files only (easily anyway)
> readable by R.
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Wed, Nov 3, 2021 at 8:36 AM gabrielle aban steinberg <
> gabrielleabansteinb...@gmail.com> wrote:
>
>> Hello, I would like to merge 18 csv files into a master data csv file, but
>> each file has a different number of columns (mostly found in one or more
>> of
>> the other csv files) and different number of rows.
>>
>> I have tried something like the following in R Studio (cloud):
>>
>> all_data_fit_files <- rbind("dailyActivity_merged.csv",
>> "dailyCalories_merged.csv", "dailyIntensities_merged.csv",
>> "dailySteps_merged.csv", "heartrate_seconds_merged.csv",
>> "hourlyCalories_merged.csv", "hourlyIntensities_merged.csv",
>> "hourlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv",
>> "minuteCaloriesWide_merged.csv", "minuteIntensitiesNarrow_merged.csv",
>> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv",
>> "minuteSleep_merged.csv", "minuteStepsNarrow_merged.csv",
>> “minuteStepsWide_merged.csv", "sleepDay_merged.csv",
>> "minuteStepsWide_merged.csv", "sleepDay_merged.csv",
>> "weightLogInfo_merged.csv")
>>
>>
>>
>> But I am getting the following error:
>>
>> Error: unexpected input in "rlySteps_merged.csv",
>> "minuteCaloriesNarrow_merged.csv", "minuteCaloriesWide_merged.csv",
>> "minuteIntensitiesNarrow_merged.csv",
>> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv"
>>
>>
>> (Maybe the R Studio free trial/usage is underpowered for my project?)
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>



Re: [R] Fwd: Merging multiple csv files to new file

2021-11-03 Thread Bert Gunter
1. Think more carefully about the appropriate data structure for what you
wish to do. It's unlikely to be .csv files, however.

In the absence of the above, a simple (but perhaps inappropriate) default
is:

2. Read the files into R and combine into a list. (You will need to read
about lists in R if you don't know what these are, of course).

3. Save your list as an .Rdata file. See ?save and ?load for details. But
do note that such files are special binary files that are (easily, anyway)
readable only by R.
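
Steps 2 and 3 can be sketched roughly as follows. The two small CSV files are written to a temporary directory only so the example runs on its own; their contents, and the names dir_path and fit_data, are assumptions made for illustration.

```r
# Create a temp directory with two small hypothetical CSV files.
dir_path <- tempfile("csvdemo")
dir.create(dir_path)
write.csv(data.frame(Id = 1:2, Steps = c(100L, 200L)),
          file.path(dir_path, "dailySteps_merged.csv"), row.names = FALSE)
write.csv(data.frame(Id = 1:3, MinutesAsleep = c(400L, 380L, 420L)),
          file.path(dir_path, "sleepDay_merged.csv"), row.names = FALSE)

# Step 2: read every CSV in the directory into one named list of data frames.
files <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)
fit_data <- lapply(files, read.csv)
names(fit_data) <- tools::file_path_sans_ext(basename(files))

# Step 3: save the whole list as a single .RData file and load it back.
rdata_path <- file.path(dir_path, "fit_data.RData")
save(fit_data, file = rdata_path)
restored <- new.env()
load(rdata_path, envir = restored)
```

Each file is then available as, for example, fit_data$sleepDay_merged, with its own columns and row count left intact.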


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Nov 3, 2021 at 8:36 AM gabrielle aban steinberg <
gabrielleabansteinb...@gmail.com> wrote:

> Hello, I would like to merge 18 csv files into a master data csv file, but
> each file has a different number of columns (mostly found in one or more of
> the other csv files) and different number of rows.
>
> I have tried something like the following in R Studio (cloud):
>
> all_data_fit_files <- rbind("dailyActivity_merged.csv",
> "dailyCalories_merged.csv", "dailyIntensities_merged.csv",
> "dailySteps_merged.csv", "heartrate_seconds_merged.csv",
> "hourlyCalories_merged.csv", "hourlyIntensities_merged.csv",
> "hourlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv",
> "minuteCaloriesWide_merged.csv", "minuteIntensitiesNarrow_merged.csv",
> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv",
> "minuteSleep_merged.csv", "minuteStepsNarrow_merged.csv",
> “minuteStepsWide_merged.csv", "sleepDay_merged.csv",
> "minuteStepsWide_merged.csv", "sleepDay_merged.csv",
> "weightLogInfo_merged.csv")
>
>
>
> But I am getting the following error:
>
> Error: unexpected input in "rlySteps_merged.csv",
> "minuteCaloriesNarrow_merged.csv", "minuteCaloriesWide_merged.csv",
> "minuteIntensitiesNarrow_merged.csv",
> "minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv"
>
>
> (Maybe the R Studio free trial/usage is underpowered for my project?)
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Fwd: Merging multiple csv files to new file

2021-11-03 Thread Avi Gross via R-help
Gabrielle,

Why would you expect that to work?

rbind() binds rows of internal R data structures, typically some variety of
data.frame with the same set of columns (for data frames the columns are
matched by name), into a larger object of that type.

You are not providing rbind() with the names of variables holding the info but 
file names of Comma Separated Values.

If you have many files with different numbers of columns of data with some 
overlap, you need to decide on quite a few things first. If a file has say 4 
columns out of a possible 20 unique columns across the files, do you want to 
add 16 columns to the contents of the file, after reading it in, and re-arrange 
it into a specific order by column? What will you fill in the new columns with? 
NA is a popular choice but you need to decide.

You then need to repeat the same thing with all the other files and read in 6 
columns then add 14 filled as you wish and rearrange the columns to the same 
order.

When done, you have an assortment of variables of class data.frame (or other 
similar ones) and you can use rbind() on those variables to get a result.
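
A rough sketch of the pad-then-rbind approach described above, using NA as the fill value. The data frames a and b and the helper pad_columns are hypothetical.

```r
# Two hypothetical data frames with partly overlapping columns.
a <- data.frame(Id = 1:2, Steps = c(100L, 200L))
b <- data.frame(Id = 3L, Calories = 50)

# The full set of columns across all inputs, in a fixed order.
all_cols <- union(names(a), names(b))

# Hypothetical helper: add any missing columns filled with NA,
# then put the columns in the agreed order.
pad_columns <- function(df, cols) {
  for (missing_col in setdiff(cols, names(df))) {
    df[[missing_col]] <- NA
  }
  df[cols]
}

# Now every piece has the same columns, so rbind() can stack them.
combined <- rbind(pad_columns(a, all_cols), pad_columns(b, all_cols))
```

With 18 files the same helper could be applied with lapply() and the pieces stacked with do.call(rbind, ...), but whether stacked rows are meaningful depends on the data.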

But it may not be what you want. You may actually want more of a database merge 
type of operation combining columns from each into the same userID field or 
whatever. rbind() is not the function to do that with and I won't go on to give 
a long tutorial. 
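
For comparison, a minimal sketch of the merge-style alternative using base R's merge(). The data frames and the key column Id are assumptions for illustration, not taken from the actual files.

```r
# Two hypothetical tables that share a key column, Id.
steps <- data.frame(Id = c(1, 2, 3), Steps = c(100, 200, 300))
sleep <- data.frame(Id = c(2, 3, 4), MinutesAsleep = c(400, 380, 420))

# all = TRUE keeps ids found in either table (a full outer join);
# values missing on one side are filled with NA.
merged <- merge(steps, sleep, by = "Id", all = TRUE)
```

Here each id gets one row with columns from both tables, rather than the rows of one table being appended under the other.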

My main point is that what you are doing is at the wrong level. You need to
read all the files into variables before doing additional calculations in R.

-Original Message-
From: R-help  On Behalf Of gabrielle aban 
steinberg
Sent: Tuesday, November 2, 2021 6:31 PM
To: r-help@r-project.org
Subject: [R] Fwd: Merging multiple csv files to new file

Hello, I would like to merge 18 csv files into a master data csv file, but each 
file has a different number of columns (mostly found in one or more of the 
other csv files) and different number of rows.

I have tried something like the following in R Studio (cloud):

all_data_fit_files <- rbind("dailyActivity_merged.csv", 
"dailyCalories_merged.csv", "dailyIntensities_merged.csv", 
"dailySteps_merged.csv", "heartrate_seconds_merged.csv", 
"hourlyCalories_merged.csv", "hourlyIntensities_merged.csv", 
"hourlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv",
"minuteCaloriesWide_merged.csv", "minuteIntensitiesNarrow_merged.csv",
"minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv", 
"minuteSleep_merged.csv", "minuteStepsNarrow_merged.csv", 
“minuteStepsWide_merged.csv", "sleepDay_merged.csv", 
"minuteStepsWide_merged.csv", "sleepDay_merged.csv",
"weightLogInfo_merged.csv")



But I am getting the following error:

Error: unexpected input in "rlySteps_merged.csv", 
"minuteCaloriesNarrow_merged.csv", "minuteCaloriesWide_merged.csv", 
"minuteIntensitiesNarrow_merged.csv",
"minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv"


(Maybe the R Studio free trial/usage is underpowered for my project?)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Is there a hash data structure for R

2021-11-03 Thread Avi Gross via R-help
Jack, I was agreeing with you and pointing out that although changing names
of columns to be unique has a positive side, it makes it hard to use for
anything that needs to look like either a set or a bag and of course a
dictionary/hash. All the above want to put things in using some identifier
and expect to get back the same.

R actually has other places names can be changed or dynamically created
often with defaults. This can be convenient or annoying as you need to look
at the resulting names to be able to use them. A feature that is nice in
some programs and parts of the tidyverse packages is the ability to specify
things like suffixes to be used.

I use lots of computer languages and keep running into people who expect
another language to just support what the first does. If that were true, we
might not create so many. If your other language supports indefinite sized
integers, good for you. Many others do not and perhaps may not easily do so
even if you try to create your own emulation if some other code gets your
version and does not work on it.

So assume you use an available package or roll your own and can now make
some kind of hash data structure. As you pointed out, hashing may not be
required if your implementation is already fast, and hashing can use lots of
memory. What are the allowed keys in your implementation? Will an integer of
1 be distinct from a floating point of 1.0? Can you hash objects of the
half-dozen or so kinds R seems to have? Whatever your answers are to many
questions like these, you may not get quite the same answers in Python or
Perl or ...
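
As one concrete illustration of the key question, here is a sketch assuming the hash is a plain environment, where keys must be character strings. The variable names are invented for this example.

```r
h <- new.env(hash = TRUE, parent = emptyenv())

# Environment keys are strings, so numeric keys have to be converted first.
# The integer 1 and the double 1.0 both convert to the string "1".
h[[as.character(1L)]]  <- "stored under integer 1"
h[[as.character(1.0)]] <- "stored under double 1"

keys <- ls(h)                          # a single key: the second write won
same_type <- identical(1L, 1.0)        # FALSE: R itself tells them apart
```

A Python dict also treats 1 and 1.0 as the same key, but for a different reason (equal hash and equality), which is the kind of subtle difference the paragraph above warns about.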

R is not really meant to be a general-purpose language although it can be.
It is not really fully object-oriented, but in many ways it can be
using newer grafts. So if you need or want the things R is designed for and
in particular there are good packages available for your needs, then use R.
If your needs include things not easily part of R, and you have a language
that works for you, why switch?

I will note that there is an intermediate path. I often run programs in
RStudio that include code for both R and Python. The data structures used
sort of convert back and forth as needed so you can begin in R and read in
data and make some changes and then hand it over to Python for more, perhaps
multiple times, then generate a graph within R or whatever.

So if you want a dictionary, you can sort of keep it on the Python side and
use Python commands to create it and add to it or access contents. The
results may be handed over to the R side as needed, not as a dict but
instead as a pairlist or named list/vector or whatever, and when needed, you
can have Python take your results. The same applies to lots of other things
Python does that R may not do quite the same or at all. I mean generators
and all kinds of object-oriented programs including multiple inheritance and
so much more. You can use each language for what it is good at or for what
works with the way you think and, with some overhead, get the best of both worlds.

Of course, this comes with costs and any programs you send out to be used by
others would require both languages to be installed properly and ...

-Original Message-
From: R-help  On Behalf Of Jan van der Laan
Sent: Wednesday, November 3, 2021 5:47 AM
To: r-help@r-project.org
Subject: Re: [R] Is there a hash data structure for R



On 03-11-2021 00:42, Avi Gross via R-help wrote:

> 
> Finally, someone mentioned how creating a data.frame with duplicate 
> names for columns is not a problem as it can automagically CHANGE them 
> to be unique. That is a HUGE problem for using that as a dictionary as 
> the new name will not be known to the system so all kinds of things will
fail.

I think you are referring to my remark which was:

 > However, the data.frame construction method will detect this and
 > generate unique names (which also might not be what you want):

I didn't say this means that duplicate names aren't a problem; I just
mentioned that the behaviour is different. Personally, I would actually
prefer the behaviour of list (keep the duplicated name) with a warning.

Most of the responses seem to assume that the OP actually wants a hash
table. Yes, he did ask for that and for a hash table an environment (with
some work) would be a good option. But in many cases, where other languages
would use a hash-table-like object (such as a dict) in R you would use other
types of objects. Furthermore, for many operations where you might use hash
tables to implement the operation, R has already built in options, for
example %in%, match, duplicated. These are also vectorised; so two vectors:
one with keys and one with values might actually be faster than an
environment in some use cases.

Best,
Jan


> 
> And there are also packages for many features like sets as well as 
> functions to manipulate these things.
> 
> -Original Message-
> From: R-help  On Behalf Of Bill Dunlap
> Sent: Tuesday, November 2, 2021 1:26 

[R] Fwd: Merging multiple csv files to new file

2021-11-03 Thread gabrielle aban steinberg
Hello, I would like to merge 18 csv files into a master data csv file, but
each file has a different number of columns (mostly found in one or more of
the other csv files) and different number of rows.

I have tried something like the following in R Studio (cloud):

all_data_fit_files <- rbind("dailyActivity_merged.csv",
"dailyCalories_merged.csv", "dailyIntensities_merged.csv",
"dailySteps_merged.csv", "heartrate_seconds_merged.csv",
"hourlyCalories_merged.csv", "hourlyIntensities_merged.csv",
"hourlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv",
"minuteCaloriesWide_merged.csv", "minuteIntensitiesNarrow_merged.csv",
"minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv",
"minuteSleep_merged.csv", "minuteStepsNarrow_merged.csv",
“minuteStepsWide_merged.csv", "sleepDay_merged.csv",
"minuteStepsWide_merged.csv", "sleepDay_merged.csv",
"weightLogInfo_merged.csv")



But I am getting the following error:

Error: unexpected input in "rlySteps_merged.csv",
"minuteCaloriesNarrow_merged.csv", "minuteCaloriesWide_merged.csv",
"minuteIntensitiesNarrow_merged.csv",
"minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv"


(Maybe the R Studio free trial/usage is underpowered for my project?)



Re: [R] tidyverse: read_csv() misses column [RESOLVED]

2021-11-03 Thread Rich Shepard

From krylov.r...@gmail.com Tue Nov  2 14:22:05 2021



instead. When you source() a script, auto-printing is not performed. This
is explained in the first paragraph of ?source, but not ?sink. If you want
to source() scripts and rely on their output (including sink()), you'll
need to print() results explicitly.
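
A small self-contained sketch of this behaviour. The script contents and temp file names are made up; source() is called with its default arguments.

```r
# A hypothetical two-line script: one bare expression, one explicit print().
script_path <- tempfile(fileext = ".R")
writeLines(c("1 + 1", "print(2 + 2)"), script_path)

# Divert output to a file, run the script, then restore normal output.
out_path <- tempfile(fileext = ".txt")
sink(out_path)
source(script_path)
sink()

# Only the explicitly printed value reaches the sink file.
captured <- readLines(out_path)
```

Calling source(script_path, echo = TRUE) or source(script_path, print.eval = TRUE) is another way to get the bare expression printed, if editing the script is not an option.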

--

From akwsi...@gmail.com Tue Nov  2 14:31:26 2021
Date: Tue, 2 Nov 2021 17:31:12 -0400



cat in R behaves similarly to cat in unix-alikes: it sends text to stdout.
That stdout could be a file, but in R it is usually the R Console. I think
it might also help to note the difference between cat and print:
x <- "test\n"
cat(x)
print(x)
produces
> cat(x)
test
> print(x)
[1] "test\n"


Ivan/Andrew,

Thank you both for increasing my knowledge of R. I've not before used sink()
and now I can use it properly. The issues are resolved.

Regards,

Rich



Re: [R] Is there a hash data structure for R

2021-11-03 Thread Jan van der Laan




On 03-11-2021 00:42, Avi Gross via R-help wrote:



Finally, someone mentioned how creating a data.frame with duplicate names
for columns is not a problem as it can automagically CHANGE them to be
unique. That is a HUGE problem for using that as a dictionary as the new
name will not be known to the system so all kinds of things will fail.


I think you are referring to my remark which was:

> However, the data.frame construction method will detect this and
> generate unique names (which also might not be what you want):

I didn't say this means that duplicate names aren't a problem; I just 
mentioned that the behaviour is different. Personally, I would actually 
prefer the behaviour of list (keep the duplicated name) with a warning.
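
The difference in behaviour can be seen in a short sketch; the variable names here are chosen only for illustration.

```r
# list() keeps duplicated names silently; $ then returns the first match.
l <- list(x = 1:5, y = month.name, x = 3:7)
list_names <- names(l)                 # "x" "y" "x"
first_x <- l$x                         # 1:5, the second x is shadowed
dup_index <- anyDuplicated(names(l))   # 3, position of the first duplicate

# data.frame() repairs duplicated names by default (check.names = TRUE).
df <- data.frame(x = 1:2, y = 3:4, x = 5:6)
df_names <- names(df)                  # "x" "y" "x.1"

# With check.names = FALSE the duplicates are kept, as with list().
df_raw <- data.frame(x = 1:2, y = 3:4, x = 5:6, check.names = FALSE)
raw_names <- names(df_raw)             # "x" "y" "x"
```

A check such as anyDuplicated(names(l)) > 0 is one way to add the warning mentioned above yourself.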


Most of the responses seem to assume that the OP actually wants a hash 
table. Yes, he did ask for that and for a hash table an environment 
(with some work) would be a good option. But in many cases, where other 
languages would use a hash-table-like object (such as a dict) in R you 
would use other types of objects. Furthermore, for many operations where 
you might use hash tables to implement the operation, R has already 
built in options, for example %in%, match, duplicated. These are also 
vectorised; so two vectors: one with keys and one with values might 
actually be faster than an environment in some use cases.
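
A minimal sketch of the two-vector approach with the built-in functions named above. The keys and values are invented for illustration.

```r
# Keys and values kept as two parallel vectors.
keys   <- c("jan", "feb", "mar")
values <- c(31, 28, 31)

# Vectorised lookup: match() finds the position of each requested key,
# and unknown keys come back as NA.
wanted <- c("feb", "mar", "xyz")
looked_up <- values[match(wanted, keys)]

# Membership test and duplicate detection, also vectorised.
is_known <- c("feb", "apr") %in% keys
is_dup   <- duplicated(c("a", "b", "a"))
```

All three calls handle whole vectors at once, which is where they can beat looping over single lookups in an environment.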


Best,
Jan




And there are also packages for many features like sets as well as functions
to manipulate these things.

-Original Message-
From: R-help  On Behalf Of Bill Dunlap
Sent: Tuesday, November 2, 2021 1:26 PM
To: Andrew Simmons 
Cc: R Help 
Subject: Re: [R] Is there a hash data structure for R

Note that an environment carries a hash table with it, while a named list
does not.  I think that looking up an entry in a list causes a hash table to
be created and thrown away.  Here are some timings involving setting and
getting various numbers of entries in environments and lists.  The times are
roughly linear in n for environments and quadratic for lists.


> vapply(1e3 * 2 ^ (0:6), f, L=new.env(parent=emptyenv()), FUN.VALUE=NA_real_)
[1] 0.00 0.00 0.00 0.02 0.03 0.06 0.15
> vapply(1e3 * 2 ^ (0:6), f, L=list(), FUN.VALUE=NA_real_)
[1]  0.01  0.03  0.15  0.53  2.66 13.66 56.05
> f
function(n, L, V = sprintf("V%07d", sample(n, replace=TRUE))) {
 system.time(for(v in V)L[[v]]<-c(L[[v]],v))["elapsed"] }

Note that environments do not allow an element named "" (the empty string).

Elements named NA_character_ are treated differently in environments and
lists, neither of which is great.  You may want your hash table functions to
deal with oddball names explicitly.
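
One way to deal with oddball names explicitly is to validate keys in a small wrapper before they reach the environment. This is only a sketch: hash_set is a hypothetical helper, not part of base R or any package.

```r
# Hypothetical setter that rejects keys an environment cannot store cleanly:
# non-character keys, vectors of keys, NA, and the empty string.
hash_set <- function(h, key, value) {
  stopifnot(is.character(key), length(key) == 1L, !is.na(key), nzchar(key))
  assign(key, value, envir = h)
  invisible(h)
}

h <- new.env(hash = TRUE, parent = emptyenv())
hash_set(h, "V0000001", 1)

# Both oddball keys are refused up front instead of failing later.
empty_key_result <- tryCatch(hash_set(h, "", 2),
                             error = function(e) "rejected")
na_key_result <- tryCatch(hash_set(h, NA_character_, 3),
                          error = function(e) "rejected")
```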

-Bill

On Tue, Nov 2, 2021 at 8:52 AM Andrew Simmons  wrote:


If you're thinking about using environments, I would suggest you
initialize them like


x <- new.env(parent = emptyenv())


Since environments have parent environments, it means that requesting
a value from that environment can actually return the value stored in
a parent environment (this isn't an issue for [[ or $, this is
exclusively an issue with assign, get, and exists). Or, if you've
already got your values stored in a list that you want to turn into an
environment:


x <- list2env(listOfValues, parent = emptyenv())
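
A short sketch of why the parent matters, using exists() and the base function name sum as the probe. The variable names and the small list passed to list2env() are invented for this example.

```r
# Default parent: lookups that miss fall through to enclosing environments,
# eventually reaching base, where sum() lives.
e_default <- new.env()
found_default <- exists("sum", envir = e_default)                   # TRUE

# Empty parent: a miss is a miss.
e_empty <- new.env(parent = emptyenv())
found_empty <- exists("sum", envir = e_empty)                       # FALSE

# inherits = FALSE gives the same strictness on an ordinary environment.
found_local <- exists("sum", envir = e_default, inherits = FALSE)   # FALSE

# [[ never looks in parents, so it returns NULL for a missing key either way.
via_brackets <- e_default[["sum"]]

# Converting an existing list, again with an empty parent.
x <- list2env(list(a = 1, b = "two"), parent = emptyenv())
```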


Hope this helps!


On Tue, Nov 2, 2021, 06:49 Yonghua Peng  wrote:


But for data.frame the colnames can be duplicated. Am I right?

Regards.

On Tue, Nov 2, 2021 at 6:29 PM Jan van der Laan 

wrote:




True, but in a lot of cases where a python user might use a dict
an R user will probably use a list; or when we are talking about
arrays of dicts in python, the R solution will probably be a
data.frame (with each dict field in a separate column).

Jan




On 02-11-2021 11:18, Eric Berger wrote:

One choice is
new.env(hash=TRUE)
in the base package
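
A minimal sketch of dict-like use of such an environment, reusing the values from the original question. The variable name d is arbitrary, and parent = emptyenv() is an addition suggested elsewhere in this thread.

```r
d <- new.env(hash = TRUE, parent = emptyenv())

# Insert or overwrite: assigning to an existing key replaces its value,
# so keys stay unique, unlike names in a list.
d[["x"]] <- 1:5
d[["y"]] <- month.name
d[["x"]] <- 3:7

all_keys <- sort(ls(d))                              # "x" "y"
x_value  <- d[["x"]]                                 # 3:7
has_z    <- exists("z", envir = d, inherits = FALSE) # FALSE

# Delete a key.
rm("y", envir = d)
keys_after_rm <- ls(d)
```

Note that environments are modified in place (reference semantics), unlike lists, so passing d to a function and changing it there changes the original.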



On Tue, Nov 2, 2021 at 11:48 AM Yonghua Peng  wrote:


I know this is a newbie question. But how do I implement the hash structure
which is available in other languages (in python it's dict)?

I know there is the list, but list's names can be duplicated here.


x <- list(x=1:5,y=month.name,x=3:7)



x


$x

[1] 1 2 3 4 5


$y

 [1] "January"   "February"  "March"     "April"     "May"       "June"
 [7] "July"      "August"    "September" "October"   "November"  "December"



$x

[1] 3 4 5 6 7



Thanks a lot.




   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more,
see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

http://www.R-project.org/posting-guide.html

and provide