Re: [R] Problems with closing R

2019-07-08 Thread jeremiah rounds
It can seem like it is hung when you default to saving your workspace on
close and you have a very large workspace in memory relative to your hard
drive write speed.  Are you sure it isn't that?
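If the delay really is the workspace save, a minimal workaround (standard base R, not specific to the OP's setup) is to quit without saving, or to check first what is taking so long to write:

```r
# Quit the R session without writing .RData (nothing is saved)
q(save = "no")

# Or inspect which objects in the workspace are large before quitting
sort(sapply(ls(), function(x) object.size(get(x))), decreasing = TRUE)
```

RStudio also has a Global Options setting ("Save workspace to .RData on exit") that avoids the prompt entirely.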

On Sun, Jul 7, 2019 at 1:18 PM Spencer Brackett <
spbracket...@saintjosephhs.com> wrote:

> Thank you. I will try upgrading and see if that solves the problem
>
> On Sun, Jul 7, 2019 at 4:08 PM Jeff Newmiller 
> wrote:
>
> > A) You ask whether uninstalling RStudio will delete files... I don't
> think
> > so but this is not the support area for RStudio.
> >
> > B) R will not delete your data files when uninstalled.
> >
> > C) I suspect that reinstalling software is unlikely to repair the
> symptoms
> > you describe (sounds like buggy software to me). Simply restarting
> RStudio
> > and plowing on would be about as effective but less work. However there
> > have been mentions on this list of bugs in RStudio related to
> > incompatibility with R 3.6, [1] which might be related to your problems
> so
> > upgrading RStudio to beta or downgrading R to 3.5.3 may make a
> difference.
> >
> > [1] https://stat.ethz.ch/pipermail/r-help/2019-July/463226.html
> >
> > On July 7, 2019 12:28:15 PM PDT, Spencer Brackett <
> > spbracket...@saintjosephhs.com> wrote:
> > >Hello,
> > >
> > >  I am trying to quit a current session on RStudio and the “quitting
> > >session” prompt from R has just continued to load. I assume that R is
> > >not
> > >responding for some reason. If the problem persists, and I were to
> > >uninstall and then reinstall R, would my saved .RData and other R files
> > >and
> > >environments saved on my desktop be deleted?
> > >
> > >I am not sure whether there are any other solutions to this issue.

> > >
> > >Best,
> > >
> > >Spencer
> > >
> > >
> > >__
> > >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >https://stat.ethz.ch/mailman/listinfo/r-help
> > >PLEASE do read the posting guide
> > >http://www.R-project.org/posting-guide.html
> > >and provide commented, minimal, self-contained, reproducible code.
> >
> > --
> > Sent from my phone. Please excuse my brevity.
> >
>


Re: [R] difference between ifelse and if...else?

2017-12-13 Thread jeremiah rounds
ifelse is vectorized.
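A short illustration of the difference (outputs shown as comments):

```r
# ifelse() is vectorized: the result has the length/shape of the test,
# so a length-1 test yields only the first element of `yes`.
ifelse(3 > 2, 1:3, length(1:3))            # [1] 1
ifelse(c(TRUE, FALSE, TRUE), 1:3, -(1:3))  # [1]  1 -2  3

# if () ... else ... evaluates a single condition and returns the
# chosen branch whole.
if (3 > 2) 1:3 else length(1:3)            # [1] 1 2 3
```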

On Wed, Dec 13, 2017 at 7:31 AM, Jinsong Zhao  wrote:

> Hi there,
>
> I don't know why the following code returns different results.
>
> > ifelse(3 > 2, 1:3, length(1:3))
> [1] 1
> > if (3 > 2) 1:3 else length(1:3)
> [1] 1 2 3
>
> Any hints?
>
> Best,
> Jinsong
>


Re: [R] JSON data in data frame

2017-01-13 Thread jeremiah rounds
I TA'd a course in R computing, and the first thing I told students was
"inspect, inspect, inspect."
d1 <- fromJSON('
http://api.openweathermap.org/data/2.5/group?id=524901,703448,2643743&units=metric&appid=ec0313a918fa729d4372555ada5fb1f8
')
names(d1)
str(d1)
d1
d1$list
your_data = d1$list
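As a sketch of the same inspection workflow on a self-contained string (no API key needed; the JSON literal below is a hypothetical trimmed-down version of the OP's response), jsonlite's `flatten = TRUE` unnests the nested data.frames into plain columns:

```r
library(jsonlite)

txt <- '{"cnt":2,"list":[
  {"coord":{"lon":37.62,"lat":55.75},"name":"Moscow"},
  {"coord":{"lon":-0.13,"lat":51.51},"name":"London"}]}'

d1 <- fromJSON(txt, flatten = TRUE)
str(d1)   # inspect: a list with $cnt and $list
d1$list   # a flat data.frame with columns coord.lon, coord.lat, name
```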

On Fri, Jan 13, 2017 at 1:12 AM, Archit Soni 
wrote:

> Hi All,
>
> Warm greetings. I am stuck on converting an incoming JSON response to a
> data frame.
>
> I am using below code to get the data
>
> library(jsonlite)
> d1 <- fromJSON('
> http://api.openweathermap.org/data/2.5/group?id=524901,703448,2643743&units=metric&appid=ec0313a918fa729d4372555ada5fb1f8
> ')
>
> d2 <- as.data.frame(d1)
> ​
> typeof(d2)
> list
>
> Can you please guide me on how I can get this data into pure data.frame
> format? The list in d1 has nested data.frame objects.
>
> Note: If you are unable to get data from api then can use below json string
> to test it out:
>
> JSON: {"cnt":3,"list":[{"coord":{"lon":37.62,"lat":55.75},"sys":
> {"type":1,"id":7323,"message":0.193,"country":"RU","sunrise"
> :1484286631,"sunset":1484313983},"weather":[{"id":600,"main":"Snow","
> description":"light
> snow","icon":"13d"}],"main":{"temp":-3.75,"pressure":1005,"
> humidity":86,"temp_min":-4,"temp_max":-3},"visibility":
> 8000,"wind":{"speed":4,"deg":170},"clouds":{"all":90},"dt":
> 1484290800,"id":524901,"name":"Moscow"},{"coord":{"lon":30.
> 52,"lat":50.43},"sys":{"type":1,"id":7358,"message":0.1885,"
> country":"UA","sunrise":1484286787,"sunset":1484317236},"weather":[{"id":
> 804,"main":"Clouds","description":"overcast
> clouds","icon":"04d"}],"main":{"temp":-2,"pressure":1009,"
> humidity":92,"temp_min":-2,"temp_max":-2},"visibility":
> 9000,"wind":{"speed":4,"deg":250,"var_beg":210,"var_end":
> 270},"clouds":{"all":90},"dt":1484290800,"id":703448,"name":
> "Kiev"},{"coord":{"lon":-0.13,"lat":51.51},"sys":{"type":1,"
> id":5187,"message":0.1973,"country":"GB","sunrise":1484294413,"sunset":
> 1484324321},"weather":[{"id":802,"main":"Clouds","description":"scattered
> clouds","icon":"03n"}],"main":{"temp":0.7,"pressure":1002,"
> temp_min":0,"temp_max":2,"humidity":98},"visibility":
> 1,"wind":{"speed":6.2,"deg":270},"clouds":{"all":40},
> "dt":1484290200,"id":2643743,"name":"London"}]}
>
> Any help is appreciated.
>
> --
> Regards
> Archit
>


Re: [R] About populating a dataframe in a loop

2017-01-06 Thread jeremiah rounds
As a rule, never rbind in a loop. The loop is O(n^2) overall because each
rbind copies everything accumulated so far, i.e. each call is itself O(n)
(where n is the number of data.frames). Instead, put the pieces into a
list, with lapply or vector("list", n), and then combine them once with
data.table::rbindlist(), do.call(rbind, the_list), or the equivalent from
dplyr. All of these are much more efficient.
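A minimal sketch of the list-then-combine pattern (illustrative values only):

```r
# Build each piece in a list, then combine once at the end: O(n) copying
pieces <- lapply(1:10, function(i) {
  data.frame(x = rnorm(5), group = i)
})

result <- do.call(rbind, pieces)           # base R
# result <- data.table::rbindlist(pieces)  # data.table alternative
# result <- dplyr::bind_rows(pieces)       # dplyr alternative
nrow(result)  # 50
```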



On Fri, Jan 6, 2017 at 8:46 PM, lily li  wrote:

> Hi Rui,
>
> Thanks for your reply. Yes, when I try to rbind two dataframes, it works.
> However, with more than 50 it gets stuck for hours. When I tried to
> terminate the process and open the csv file separately, it had only one
> data frame. What is the problem? Thanks.
>
>
> On Fri, Jan 6, 2017 at 11:12 AM, Rui Barradas 
> wrote:
>
> > Hello,
> >
> > Works with me:
> >
> > set.seed(6574)
> >
> > pre.mat = data.frame()
> > for(i in 1:10){
> > mat.temp = data.frame(x = rnorm(5), A = sample(LETTERS, 5, TRUE))
> > pre.mat = rbind(pre.mat, mat.temp)
> > }
> >
> > nrow(pre.mat)  # should be 50
> >
> >
> > Can you give us an example that doesn't work?
> >
> > Rui Barradas
> >
> >
> > Em 06-01-2017 18:00, lily li escreveu:
> >
> >> Hi R users,
> >>
> >> I have a question about filling a dataframe in R using a for loop.
> >>
> >> I created an empty dataframe first and then filled it, using the code:
> >> pre.mat = data.frame()
> >> for(i in 1:10){
> >>  mat.temp = data.frame(some values filled in)
> >>  pre.mat = rbind(pre.mat, mat.temp)
> >> }
> >> However, the resulting dataframe does not have all the rows I expected.
> >> What is the problem, and how do I solve it? Thanks.
> >>



Re: [R] unique dates per ID

2016-11-15 Thread jeremiah rounds
library(data.table)
setDT(df)
setkeyv(df, c("Subject", "dates"))
unique(df, by = key(df))  # one row per Subject/dates pair -- gets what you want
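For comparison, a base-R version of the same idea (a sketch, assuming df is still a plain data.frame as in the original post):

```r
# duplicated() on just the Subject/dates columns marks repeat rows;
# keep the first occurrence of each pair (the department kept is arbitrary)
newdf <- df[!duplicated(df[, c("Subject", "dates")]), ]
```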

On Mon, Nov 14, 2016 at 11:38 PM, Jim Lemon  wrote:

> Hi Farnoosh,
> Try this:
>
> for(id in unique(df$Subject)) {
>  whichsub<-df$Subject==id
>  if(exists("newdf"))
>   newdf<-rbind(newdf,df[whichsub,][which(!duplicated(
> df$dates[whichsub])),])
>  else newdf<-df[whichsub,][which(!duplicated(df$dates[whichsub])),]
> }
>
> Jim
>
>
> On Tue, Nov 15, 2016 at 9:38 AM, Farnoosh Sheikhi via R-help
>  wrote:
> > Hi,
> > I have a data set like below:
> > Subject <- c("2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "5", "5")
> > dates <- c("2011-01-01", "2011-01-01", "2011-01-03", "2011-01-04",
> >            "2011-01-05", "2011-01-06", "2011-01-07", "2011-01-07",
> >            "2011-01-09", "2011-01-10", "2011-01-11", "2011-01-11")
> > deps <- c("A", "B", "CC", "C", "CC", "A", "F", "DD", "A", "F", "FF", "D")
> > df <- data.frame(Subject, dates, deps); df
> >
> > I want to choose unique dates per ID, so that there are no duplicate
> > dates per ID. I don't mind which department gets picked. I really
> > appreciate any help.
> >
> > Best,
> > Farnoosh
> >
> >



Re: [R] Function argument and scope

2016-11-14 Thread jeremiah rounds
Hi,

I didn't run the code, because someone else said it might do what you
intended, and your problem description was complete on its own.

The issue is that R has copy-on-modify semantics. You are thinking as if
you hold a reference, which you do not. Reference semantics are not very
R-like in style, but they can be had if you want by changing the input
class (see ?new.env). The typical R style is to modify the input argument
inside the function, return it, and then assign the result back to the
input object, e.g.:

test = myFunction(test)

If you really need to change a data.frame inside a function without
re-assigning it, check out data.table, which has that behavior as a side
effect of how it operates.
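A minimal sketch of the environment-as-reference approach (myFun and e are illustrative names, not from the original post):

```r
# Environments are NOT copied on modify, so a function can change
# the caller's data through one.
e <- new.env()
e$df <- data.frame(var1 = c("a b c", "d"), stringsAsFactors = FALSE)

myFun <- function(env) {
  env$df[1, 1] <- "changed"  # persists after the call returns
  invisible(NULL)
}

myFun(e)
e$df[1, 1]  # "changed"
```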

Thanks,



On Sun, Nov 13, 2016 at 2:09 PM, Bernardo Doré  wrote:

> Hello list,
>
> my first post but I've been using this list as a help source for a while.
> Couldn't live without it.
>
> I am writing a function that takes a dataframe as an argument and in the
> end I intend to assign the result of some computation back to the
> dataframe. This is what I have so far:
>
> myFunction <- function(x){
>   y <- x[1,1]
>   z <- strsplit(as.character(y), split = " ")
>   if(length(z[[1]]) > 1){
> predictedWord <- z[[1]][length(z[[1]])]
> z <- z[[1]][-c(length(z[[1]]))]
> z <- paste(z, collapse = " ")
>   }
>   x[1,1] <- z
> }
>
> And lets say I create my dataframe like this:
> test <- data.frame(var1=c("a","b","c"),var2=c("d","e","f"))
>
> and then call
> myFunction(test)
>
> The problem is when I assign x[1,1] to y in the first operation inside the
> function, x becomes a dataframe inside the function scope and loses the
> reference to the dataframe "test" passed as argument. In the end when I
> assign z to what should be row 1 and column 1 of the "test" dataframe, it
> assigns to x inside the function scope and no modification is made on
> "test".
>
> I hope the problem statement is clear.
>
> Thank you,
>
> Bernardo Doré
>


Re: [R] Putting a bunch of Excel files as data.frames into a list fails

2016-09-28 Thread jeremiah rounds
Try changing:
v_list_of_files[v_file]
to:
v_list_of_files[[v_file]]

Also, are you sure you are not generating warnings? For example, this
stores only the first column of iris and raises a warning:
 l = list()
 l["iris"] = iris   # warning; use l[["iris"]] = iris instead

Also, you can change it to lapply(v_files, function(v_file){...})
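Putting those pieces together, a hedged sketch of a loop-free version (it assumes openxlsx is installed and that v_files / v_file_path are defined as in the original code):

```r
# lapply returns a list with one data.frame per file; the [[ ]]
# semantics are handled for you, so no single-bracket surprises
v_list_of_files <- lapply(v_files, function(v_file) {
  openxlsx::read.xlsx(file.path(v_file_path, v_file))
})
names(v_list_of_files) <- v_files  # name each entry by its file name
```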


Have a good one,
Jeremiah

On Wed, Sep 28, 2016 at 8:02 AM,  wrote:

> Hi All,
>
> I need to read a bunch of Excel files and store them in R.
>
> I decided to store the different Excel files in data.frames in a named
> list where the names are the file names of each file (and that is
> different from the sources as far as I can see):
>
> -- cut --
> # Sources:
> # - http://stackoverflow.com/questions/11433432/importing-multiple-csv-files-into-r
> # - http://stackoverflow.com/questions/9564489/opening-all-files-in-a-folder-and-applying-a-function
> # - http://stackoverflow.com/questions/12945687/how-to-read-all-worksheets-in-an-excel-workbook-into-an-r-list-with-data-frame-e
>
> v_file_path <- "H:/2016/Analysen/Neukunden/Input"
> v_file_pattern <- "*.xlsx"
>
> v_files <- list.files(path = v_file_path,
>   pattern = v_file_pattern,
>   ignore.case = TRUE)
> print(v_files)
>
> v_list_of_files <- list()
>
> for (v_file in v_files) {
>   v_list_of_files[v_file] <- openxlsx::read.xlsx(
> file.path(v_file_path,
>   v_file))
> }
>
> This code does not work because it stores only the first variable of each
> Excel file in the named list.
>
> What do I need to change to get it running?
>
> Kind regards
>
> Georg
>



Re: [R] why data.frame, mutate package and not lists

2016-09-14 Thread jeremiah rounds
There is also this syntax for adding variables
df[, "var5"] = 1:10

and the syntax sugar for row-oriented storage:
df[1:5,]
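A small self-contained illustration of both forms:

```r
df <- data.frame(var1 = 1:10)
df[, "var5"] <- 1:10       # add a column with [ , ] assignment
df$var6 <- letters[1:10]   # or the list-style $ assignment
df[1:5, ]                  # row-oriented subsetting: first five rows
```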

On Wed, Sep 14, 2016 at 11:40 AM, jeremiah rounds <roundsjerem...@gmail.com>
wrote:

> "If you want to add variable to data.frame you have to use attach, detach.
> Right?"
>
> Not quite.  Use it like a list to add a variable to a data.frame
>
> e.g.
> df = list()
> df$var1 = 1:10
> df = as.data.frame(df)
> df$var2 = 1:10
> df[["var3"]] = 1:10
> df
> df = as.list(df)
> df$var4 = 1:10
> as.data.frame(df)
>
> Ironically the primary reason to use a data.frame in my head is to signal
> that you are thinking of your data as a row-oriented tabular storage.
>  "Ironic" because in technical detail that is not a requirement to be a
> data.frame, but when I reflect on the typical way a seasoned R programmer
> approaches list and data.frames that is basically what they are
> communicating.
>
> I was going to post that a reason to use data.frames is to take advantage
> of optimizations and syntax sugar for data.frames, but in reality if code
> does not assume a row-oriented data structure in a data.frame there is not
> much I can think of that exists in the way of optimization.  For example,
> we could point to "subset" and say that is a reason to use data.frames and
> not list, but that only works if you use data.frame in a conventional way.
>
> In the end, my advice to you is if it is a table make it a data.frame and
> if it is not easily thought of as a table or row-oriented data structure
> keep it as a list.
>
> Thanks,
> Jeremiah
>
>
>
>
>
> On Wed, Sep 14, 2016 at 11:15 AM, Alaios via R-help <r-help@r-project.org>
> wrote:
>
>> Thanks for all the answers. I think ggplot2 also requires data.frames.
>> If you want to add a variable to a data.frame, you have to use attach,
>> detach. Right? Any more links that discuss those two different
>> approaches?
>>
>> Alex
>>
>> On Wednesday, September 14, 2016 5:34 PM, Bert Gunter <
>> bgunter.4...@gmail.com> wrote:
>>
>>
>>  This is partially a matter of subjective opinion, and so pointless; but
>> I would point out that data frames are the canonical structure for a
>> great many of R's modeling and graphics functions, e.g. lm, xyplot,
>> etc.
>>
>> As for mutate() etc., that's about UI's and user friendliness, and
>> imho my ho is meaningless.
>>
>> Best,
>> Bert
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Wed, Sep 14, 2016 at 6:01 AM, Alaios via R-help <r-help@r-project.org>
>> wrote:
>> > Hi all, I have seen data.frames and operations from the mutate package
>> getting really popular. In the last years I have been using extensively
>> lists, is there any reason to not use lists and use other data types for
>> data manipulation and storage?
>> > Any article that describes their differences? I would like to thank
>> > you for your reply.
>> >
>> > Regards,
>> > Alex
>
>



Re: [R] Help with strftime error "character string is not in a standard unambiguous format"

2016-09-12 Thread jeremiah rounds
Not sure what the issue is with the provided code, but note:

library(lubridate)
lubridate::dmy_hm("Thu, 25 Aug 2016 6:34 PM")
[1] "2016-08-25 18:34:00 UTC"

Though if you go that route, set the TZ explicitly, because the timestamp
itself is ambiguous.
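For reference, the underlying cause of the OP's error is that strftime() *formats* date-times for output, so it first tries to parse the character input with as.POSIXlt() using a default format, which fails. Parsing free-form text is strptime()'s job. A sketch (day/month names are locale-dependent, so this assumes an English locale):

```r
# strptime() parses text into a POSIXlt using the supplied format;
# %I is the 12-hour clock and %p the AM/PM marker
x <- strptime("Thu, 25 Aug 2016 6:34 PM",
              format = "%a, %d %b %Y %I:%M %p", tz = "UTC")
format(x, "%Y-%m-%d %H:%M")  # "2016-08-25 18:34"
```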

On Sun, Sep 11, 2016 at 10:57 PM, Chris Evans  wrote:

> I am trying to read activity data created by Garmin. It outputs dates like
> this:
>
> "Thu, 25 Aug 2016 6:34 PM"
>
> The problem that has stumped me is this:
>
> > strftime("Thu, 25 Aug 2016 6:34 PM",format="%a, %d %b %Y %I:%M %p")
> Error in as.POSIXlt.character(x, tz = tz) :
>   character string is not in a standard unambiguous format
>
> I _thought_ I had this running OK but that error is catching me now.  I
> think I've read ?strftime and written that format string correctly to match
> the input but I'm stumped now.
>
> Can someone advise me?  Many thanks in advance,
>
> Chris
>
>
> > sessionInfo()
> R version 3.3.1 (2016-06-21)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 10586)
>
> locale:
> [1] LC_COLLATE=English_United Kingdom.1252
> [2] LC_CTYPE=English_United Kingdom.1252
> [3] LC_MONETARY=English_United Kingdom.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United Kingdom.1252
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_3.3.1 tools_3.3.1
> >
>



Re: [R] Time format lagging issue

2016-08-31 Thread jeremiah rounds
Building on Don's example, here is something that looks a lot like what I
do every day:
Sys.setenv(TZ="UTC")
mydf <- data.frame(t1=c('2011-12-31-22-30', '2011-12-31-23-30'))
library(lubridate)
mydf$timestamp = lubridate::ymd_hm(mydf$t1)
mydf$t2 = mydf$timestamp - period(minute=30)



On Wed, Aug 31, 2016 at 2:44 PM, MacQueen, Don  wrote:

> Try following this example:
>
> mydf <- data.frame(t1=c('201112312230', '201112312330'))
> tmp1 <- as.POSIXct(mydf$t1, format='%Y%m%d%H%M')
> tmp2 <- tmp1 - 30*60
> mydf$t2 <- format(tmp2, '%Y%m%d%H%M')
>
> It can be made into a single line, but I used intermediate variables tmp1
> and tmp2 so that it would be easier to follow.
>
> Base R is more than adequate for this task.
>
> Please get rid of the asterisks in your next email. They just get in the
> way. Learn how to send plain text email, not HTML email. Please.
>
>
>
>
> --
> Don MacQueen
>
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
>
>
>
>
>
> On 8/31/16, 9:07 AM, "R-help on behalf of Bhaskar Mitra"
> 
> wrote:
>
> >Hello Everyone,
> >
> >I am trying a shift the time series in a dataframe (df) by 30 minutes . My
> >current format looks something like this :
> >
> >
> >
> >*df$$Time 1*
> >
> >
> >*201112312230*
> >
> >*201112312300*
> >
> >*201112312330*
> >
> >
> >
> >*I am trying to add an additional column of time (df$Time 2) next to  Time
> >1 by lagging it by ­ 30minutes. Something like this :*
> >
> >
> >*df$Time1   **df$$Time2*
> >
> >
> >*201112312230  **201112312200*
> >
> >*201112312300  **201112312230*
> >
> >*201112312330  **201112312300*
> >
> >*201112312330  *
> >
> >
> >
> >
> >
> >*Based on some of the suggestions available, I have tried this option *
> >
> >
> >
> >*require(zoo)*
> >
> >*df1$Time2  <- lag(df1$Time1, -1, na.pad = TRUE)*
> >
> >*View(df1)*
> >
> >
> >
> >*This does not however give me the desired result. I would appreciate any
> >suggestions/advice in this regard.*
> >
> >
> >*Thanks,*
> >
> >*Bhaskar*
> >


Re: [R] remove rows based on row mean

2016-08-18 Thread jeremiah rounds
Oh, I forgot I renamed sm.

dt = sm
library(data.table)
setDT(dt)
op = function(s){
  mean0 = apply(s, 1, mean)   # row means within this Gene group
  ret = s[which.max(mean0)]   # the row with the largest mean
  ret$mean = max(mean0)       # keep that row's mean (a single value)
  ret
}
max_row = dt[, op(.SD), by = "Gene"]
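The helper can also be collapsed to a one-liner (same assumption as above: all non-Gene columns are numeric):

```r
# For each Gene, keep the single row whose row mean is largest;
# .SD holds the group's columns other than the by= column
max_row <- dt[, .SD[which.max(rowMeans(.SD))], by = Gene]
```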


Thanks,
Jeremiah

On Thu, Aug 18, 2016 at 3:21 PM, jeremiah rounds <roundsjerem...@gmail.com>
wrote:

> library(data.table)
> setDT(dt)
> op = function(s){
> mean0 = apply(s, 1, mean)
> ret = s[which.max(mean0)]
> ret$mean = mean0
> ret
> }
> max_row = dt[, op(.SD), by = "Gene"]
>
> Thanks,
> Jeremiah
>
> On Thu, Aug 18, 2016 at 2:33 PM, Adrian Johnson <oriolebaltim...@gmail.com
> > wrote:
>
>> Hi Group,
>> I have a data matrix sm (dput code given below).
>>
>> I want to create a data matrix with rows with same variable that have
>> higher mean.
>>
>> > sm
>>  Gene GSM529305 GSM529306 GSM529307 GSM529308
>> 1A1BG  6.57  6.72  6.83  6.69
>> 2A1CF  2.91  2.80  3.08  3.00
>> 3   A2LD1  5.82  7.01  6.62  6.87
>> 4 A2M  9.21  9.35  9.32  9.19
>> 5 A2M  2.94  2.50  3.16  2.76
>> 6  A4GALT  6.86  5.75  6.06  7.04
>> 7   A4GNT  3.97  3.56  4.22  3.88
>> 8AAA1  3.39  2.90  3.16  3.23
>> 9AAAS  8.26  8.63  8.40  8.70
>> 10   AAAS  6.82  7.15  7.33  6.51
>>
>> For example, rows 4 and 5 have the same Gene, A2M. I want to select
>> only the row that has the higher mean. I wrote the following code; it
>> finds the duplicate row with the higher mean, but I cannot properly
>> write out the result. Could someone help? Thanks
>>
>> ugns <- unique(sm$Gene)
>>
>> exwidh = c()
>>
>> for(i in 1:length(ugns)){
>> k = ugns[i]
>> exwidh[i] <- sm[names(sort(rowMeans(sm[which(sm[,1]==k),2:ncol(sm)]),decreasing=TRUE)[1]),]
>> }
>>
>>
>>
>>
>>
>> structure(list(Gene = c("A1BG", "A1CF", "A2LD1", "A2M", "A2M",
>> "A4GALT", "A4GNT", "AAA1", "AAAS", "AAAS"), GSM529305 = c(6.57,
>> 2.91, 5.82, 9.21, 2.94, 6.86, 3.97, 3.39, 8.26, 6.82), GSM529306 = c(6.72,
>> 2.8, 7.01, 9.35, 2.5, 5.75, 3.56, 2.9, 8.63, 7.15), GSM529307 = c(6.83,
>> 3.08, 6.62, 9.32, 3.16, 6.06, 4.22, 3.16, 8.4, 7.33), GSM529308 = c(6.69,
>> 3, 6.87, 9.19, 2.76, 7.04, 3.88, 3.23, 8.7, 6.51)), .Names = c("Gene",
>> "GSM529305", "GSM529306", "GSM529307", "GSM529308"), row.names = c(NA,
>> 10L), class = "data.frame")
>>
>
>



Re: [R] remove rows based on row mean

2016-08-18 Thread jeremiah rounds
library(data.table)
setDT(dt)
op = function(s){
    mean0 = rowMeans(s)        # per-row mean over the sample columns
    ret = s[which.max(mean0)]  # keep the row with the higher mean
    ret$mean = max(mean0)      # store that mean (a scalar, not the whole vector)
    ret
}
max_row = dt[, op(.SD), by = "Gene"]

Thanks,
Jeremiah
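For readers without data.table, a hedged base-R sketch of the same selection (the tiny `sm` here is a shortened stand-in for the dput data quoted below; the column names `GSM1`/`GSM2` and the variable `keep` are illustrative, not from the thread):

```r
# Keep, per Gene, the row whose mean across the sample columns is largest.
sm <- data.frame(Gene = c("A2M", "A2M", "AAAS"),
                 GSM1 = c(9.21, 2.94, 8.26),
                 GSM2 = c(9.35, 2.50, 8.63))
m <- rowMeans(sm[-1])  # per-row mean over the numeric columns
keep <- unlist(lapply(split(seq_len(nrow(sm)), sm$Gene),
                      function(i) i[which.max(m[i])]))
result <- sm[sort(keep), ]  # one row per Gene: the higher-mean one
```

The split-by-Gene step mirrors the `by = "Gene"` grouping in the data.table version.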

On Thu, Aug 18, 2016 at 2:33 PM, Adrian Johnson 
wrote:

> Hi Group,
> I have a data matrix sm (dput code given below).
>
> I want to create a data matrix with rows with same variable that have
> higher mean.
>
> > sm
>  Gene GSM529305 GSM529306 GSM529307 GSM529308
> 1A1BG  6.57  6.72  6.83  6.69
> 2A1CF  2.91  2.80  3.08  3.00
> 3   A2LD1  5.82  7.01  6.62  6.87
> 4 A2M  9.21  9.35  9.32  9.19
> 5 A2M  2.94  2.50  3.16  2.76
> 6  A4GALT  6.86  5.75  6.06  7.04
> 7   A4GNT  3.97  3.56  4.22  3.88
> 8AAA1  3.39  2.90  3.16  3.23
> 9AAAS  8.26  8.63  8.40  8.70
> 10   AAAS  6.82  7.15  7.33  6.51
>
> For example, rows 4 and 5 have the same Gene value, A2M. I want to
> select only the row with the higher mean. I wrote the following code,
> which finds the duplicate rows with the higher mean, but I cannot
> properly write out the result. Could someone help?  Thanks
>
> ugns <- unique(sm$Gene)
>
> exwidh = c()
>
> for(i in 1:length(ugns)){
> k = ugns[i]
> exwidh[i] <- sm[names(sort(rowMeans(sm[which(sm[,1]==k),2:ncol(sm)]),decreasing=TRUE)[1]),]
> }
>
>
>
>
>
> structure(list(Gene = c("A1BG", "A1CF", "A2LD1", "A2M", "A2M",
> "A4GALT", "A4GNT", "AAA1", "AAAS", "AAAS"), GSM529305 = c(6.57,
> 2.91, 5.82, 9.21, 2.94, 6.86, 3.97, 3.39, 8.26, 6.82), GSM529306 = c(6.72,
> 2.8, 7.01, 9.35, 2.5, 5.75, 3.56, 2.9, 8.63, 7.15), GSM529307 = c(6.83,
> 3.08, 6.62, 9.32, 3.16, 6.06, 4.22, 3.16, 8.4, 7.33), GSM529308 = c(6.69,
> 3, 6.87, 9.19, 2.76, 7.04, 3.88, 3.23, 8.7, 6.51)), .Names = c("Gene",
> "GSM529305", "GSM529306", "GSM529307", "GSM529308"), row.names = c(NA,
> 10L), class = "data.frame")
>



Re: [R] Creating Dummy Var in R for regression?

2016-08-05 Thread jeremiah rounds
Something like:

d  =  data.frame(score = sample(1:10, 100, replace=TRUE))
d$score_t = "low"
d$score_t[d$score > 3] = "medium"
d$score_t[d$score >7 ] = "high"
d$score_t = factor(d$score_t, levels = c("low", "medium", "high"),
ordered=TRUE)  #set ordered = FALSE for dummy variables
X = model.matrix(~score_t, data=d)
X
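A hedged variant of the same binning, using cut() to collapse a 10-point score into the 3-level ordered factor discussed in this thread (the breaks are chosen to match the >3 / >7 thresholds above; not code from the original posts):

```r
# Collapse a 10-point score into an ordered low/medium/high factor with cut().
d <- data.frame(score = sample(1:10, 100, replace = TRUE))
d$score_t <- cut(d$score, breaks = c(0, 3, 7, 10),
                 labels = c("low", "medium", "high"), ordered_result = TRUE)
# Ordered factors get polynomial contrasts by default; use an unordered
# factor if you want plain treatment dummies instead.
X <- model.matrix(~ score_t, data = d)
```

As Bert notes, the factor can then go straight into the model formula; no hand-built dummy columns are needed.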



On Fri, Aug 5, 2016 at 3:23 PM, Shivi Bhatia  wrote:

> Thank you all for the assistance. This really helps.
>
> Hi Bert: While searching Nabble I learned that with factor variables in R
> there is no need to create dummy variables. However, please consider this
> situation:
> I am in the process of building a logistic regression model on NPS data.
> The outcome variable is CE, i.e. customer experience, which has 3 ratings,
> so ordinal logistic regression will be used. However, most of my variables
> are categorical. For instance, one of them is agent knowledge, which is
> measured on a 10-point scale.
>
> Agent knowledge must again be reduced to a 3-level scale (high, medium,
> low), hence I need to group these 10 values into 3 groups; then, as you
> suggested, I can enter the variable directly into the model without
> creating n-1 categories.
>
> I have worked on SAS extensively, hence found this a bit confusing.
>
> Thanks for the help.
>
> On Sat, Aug 6, 2016 at 2:30 AM, Bert Gunter 
> wrote:
>
> > Just commenting on the email subject, not the content (which you have
> > already been helped with): there is no need to *ever* create a dummy
> > variable for regression in R if what you mean by this is what is
> > conventionally meant. R will create the model matrix with appropriate
> > "dummy variables" for factors as needed. See ?contrasts and ?C for
> > relevant details and/or consult an appropriate R tutorial.
> >
> > Of course, if this is not what you meant, then ignore.
> >
> > Cheers,
> > Bert
> >
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Fri, Aug 5, 2016 at 1:49 PM,   wrote:
> > > Hello,
> > >
> > > Your ifelse will never work because
> > > reasons$salutation== "Mr" & reasons$salutation=="Father" is always
> FALSE
> > > and so is reasons$salutation=="Mrs" & reasons$salutation=="Miss".
> > > Try instead | (or), not & (and).
> > >
> > > Hope this helps,
> > >
> > > Rui Barradas
> > >
> > >
> > >
> > > Citando Shivi Bhatia :
> > >
> > >> Dear Team,
> > >>
> > >> I need help with the below code in R:
> > >>
> > >> gender_rec<- c('Dr','Father','Mr'=1, 'Miss','MS','Mrs'=2, 3)
> > >>
> > >> reasons$salutation<- gender_rec[reasons$salutation].
> > >>
> > >> This code gives me the correct output but it overwrites the
> > >> reason$salutation variable. I need to create a new variable gender to
> > >> capture gender details and leave salutation as it is.
> > >>
> > >> I tried the syntax below, but it converts everything to 1.
> > >>
> > >> reasons$gender<- ifelse(reasons$salutation== "Mr" &
> reasons$salutation==
> > >> "Father","Male", ifelse(reasons$salutation=="Mrs" &
> > reasons$salutation==
> > >> "Miss","Female",1))
> > >>
> > >> Please suggest.
> > >>
> > >
> > >
> > >
> > >
> >
>
>
>
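For the salutation-to-gender recoding asked about above, a hedged base-R sketch using a named lookup vector; the salutations come from the thread, but the labels "Male"/"Female"/"Unknown" and the standalone `salutation` vector (standing in for `reasons$salutation`) are illustrative assumptions:

```r
# Recode salutations to gender via a named character vector; the original
# column is left untouched, which was the poster's requirement.
salutation <- c("Mr", "Father", "Mrs", "Miss", "Dr", "MS")
gender_map <- c(Mr = "Male", Father = "Male", Dr = "Male",
                Mrs = "Female", Miss = "Female", MS = "Female")
gender <- unname(gender_map[salutation])  # new vector; salutation unchanged
gender[is.na(gender)] <- "Unknown"        # default for unmapped salutations
```

In the poster's data this would be `reasons$gender <- unname(gender_map[reasons$salutation])`; a named-vector lookup avoids the chained ifelse() entirely.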




Re: [R] Reduce woes

2016-07-28 Thread jeremiah rounds
Basically that example uses Reduce as an lapply, but I think that was
caused by how people started talking about things in the first place =) The
point is that the accumulator can be anything, as far as I can tell.

On Thu, Jul 28, 2016 at 12:14 PM, jeremiah rounds <roundsjerem...@gmail.com>
wrote:

> Re:
> "What I'm trying to
> work out is how to have the accumulator in Reduce not be the same type as
> the elements of the vector/list being reduced - ideally it could be an S3
> instance, list, vector, or data frame."
>
> Pretty sure that is not true.  See code that follows.  I would never solve
> this task in this way though so no comment on the use of Reduce for what
> you described.  (Note the accumulation of "functions" in a list is just a
> demo of possibilities).  You could accumulate in an environment too and
> potentially gain a lot of copy efficiency.
>
>
> lookup = list()
> lookup[[as.character(1)]] = function() print("1")
> lookup[[as.character(2)]] = function() print("2")
> lookup[[as.character(3)]] = function() print("3")
>
> data = list(c(1,2), c(1,4), c(3,3), c(2,30))
>
>
> r = Reduce(function(acc, item) {
> append(acc, list(lookup[[as.character(min(item))]]))
> }, data,list())
> r
> for(f in r) f()
>
>
> On Thu, Jul 28, 2016 at 5:09 AM, Stefan Kruger <stefan.kru...@gmail.com>
> wrote:
>
>> Ulrik - many thanks for your reply.
>>
>> I'm aware of many simple solutions as the one you suggest, both iterative
>> and functional style - but I'm trying to learn how to bend Reduce() for
>> the
>> purpose of using it in more complex processing tasks. What I'm trying to
>> work out is how to have the accumulator in Reduce not be the same type as
>> the elements of the vector/list being reduced - ideally it could be an S3
>> instance, list, vector, or data frame.
>>
>> Here's a more realistic example (in Elixir, sorry)
>>
>> Given two lists:
>>
>> 1. data: maps an id string to a vector of revision strings
>> 2. dict: maps known id/revision pairs as a string to true (or 1)
>>
>> find the items in data not already in dict, returned as a named list.
>>
>> ```elixir
>> data = %{
>> "id1" => ["rev1.1", "rev1.2"],
>> "id2" => ["rev2.1"],
>> "id3" => ["rev3.1", "rev3.2", "rev3.3"]
>> }
>>
>> dict = %{
>> "id1/rev1.1" => 1,
>> "id1/rev1.2" => 1,
>> "id3/rev3.1" => 1
>> }
>>
>> # Find the items in data not already in dict. Return as a grouped map
>>
>> Map.keys(data)
>> |> Enum.flat_map(fn id -> Enum.map(data[id], fn rev -> {id, rev} end)
>> end)
>> |> Enum.filter(fn {id, rev} -> !Dict.has_key?(dict, "#{id}/#{rev}")
>> end)
>> |> Enum.reduce(%{}, fn ({k, v}, d) -> Map.update(d, k, [v], &[v|&1])
>> end)
>> ```
>>
>>
>>
>>
>> On 28 July 2016 at 12:03, Ulrik Stervbo <ulrik.ster...@gmail.com> wrote:
>>
>> > Hi Stefan,
>> >
>> > in that case,lapply(data, length) should do the trick.
>> >
>> > Best wishes,
>> > Ulrik
>> >
>> > On Thu, 28 Jul 2016 at 12:57 Stefan Kruger <stefan.kru...@gmail.com>
>> > wrote:
>> >
>> >> David - many thanks for your response.
>> >>
>> >> What I tried to do was to turn
>> >>
>> >> data <- list(one = c(1, 1), three = c(3), two = c(2, 2))
>> >>
>> >> into
>> >>
>> >> result <- list(one = 2, three = 1, two = 2)
>> >>
>> >> that is creating a new list which has the same names as the first, but
>> >> where the values are the vector lengths.
>> >>
>> >> I know there are many other (and better) trivial ways of achieving
>> this -
>> >> my aim is less the task itself, and more figuring out if this can be
>> done
>> >> using Reduce() in the fashion I showed in the other examples I gave.
>> It's
>> >> a
>> >> building block of doing map-filter-reduce type pipelines that I'd like
>> to
>> >> understand how to do in R.
>> >>
>> >> Fumbling in the dark, I tried:
>> >>
>> >> Reduce(function(acc, item) { setNames(c(acc, length(data[item])), item
>> },
>> >> names(data), accumulate=TRUE)
>> >>
>> >> but setNames sets all the names, not adding one - and acc is still a
>> >> vector, not a list.

Re: [R] Reduce woes

2016-07-28 Thread jeremiah rounds
Re:
"What I'm trying to
work out is how to have the accumulator in Reduce not be the same type as
the elements of the vector/list being reduced - ideally it could be an S3
instance, list, vector, or data frame."

Pretty sure that is not true.  See code that follows.  I would never solve
this task in this way though so no comment on the use of Reduce for what
you described.  (Note the accumulation of "functions" in a list is just a
demo of possibilities).  You could accumulate in an environment too and
potentially gain a lot of copy efficiency.


lookup = list()
lookup[[as.character(1)]] = function() print("1")
lookup[[as.character(2)]] = function() print("2")
lookup[[as.character(3)]] = function() print("3")

data = list(c(1,2), c(1,4), c(3,3), c(2,30))


r = Reduce(function(acc, item) {
append(acc, list(lookup[[as.character(min(item))]]))
}, data,list())
r
for(f in r) f()
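Applied to Stefan's concrete goal from later in the thread (a named list of vector lengths, with the accumulator being a list while the reduced elements are names), a hedged sketch along the same lines:

```r
# Reduce over the names, accumulating into a (differently typed) named list.
data <- list(one = c(1, 1), three = c(3), two = c(2, 2))
result <- Reduce(function(acc, nm) {
  acc[[nm]] <- length(data[[nm]])  # accumulator: list; elements: name strings
  acc
}, names(data), init = list())
str(result)  # a list: one = 2, three = 1, two = 2
```

This is the same pattern as the lookup-table demo: the init value fixes the accumulator's type, independent of the type of the things being reduced.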


On Thu, Jul 28, 2016 at 5:09 AM, Stefan Kruger 
wrote:

> Ulrik - many thanks for your reply.
>
> I'm aware of many simple solutions as the one you suggest, both iterative
> and functional style - but I'm trying to learn how to bend Reduce() for the
> purpose of using it in more complex processing tasks. What I'm trying to
> work out is how to have the accumulator in Reduce not be the same type as
> the elements of the vector/list being reduced - ideally it could be an S3
> instance, list, vector, or data frame.
>
> Here's a more realistic example (in Elixir, sorry)
>
> Given two lists:
>
> 1. data: maps an id string to a vector of revision strings
> 2. dict: maps known id/revision pairs as a string to true (or 1)
>
> find the items in data not already in dict, returned as a named list.
>
> ```elixir
> data = %{
> "id1" => ["rev1.1", "rev1.2"],
> "id2" => ["rev2.1"],
> "id3" => ["rev3.1", "rev3.2", "rev3.3"]
> }
>
> dict = %{
> "id1/rev1.1" => 1,
> "id1/rev1.2" => 1,
> "id3/rev3.1" => 1
> }
>
> # Find the items in data not already in dict. Return as a grouped map
>
> Map.keys(data)
> |> Enum.flat_map(fn id -> Enum.map(data[id], fn rev -> {id, rev} end)
> end)
> |> Enum.filter(fn {id, rev} -> !Dict.has_key?(dict, "#{id}/#{rev}")
> end)
> |> Enum.reduce(%{}, fn ({k, v}, d) -> Map.update(d, k, [v], &[v|&1])
> end)
> ```
>
>
>
>
> On 28 July 2016 at 12:03, Ulrik Stervbo  wrote:
>
> > Hi Stefan,
> >
> > in that case,lapply(data, length) should do the trick.
> >
> > Best wishes,
> > Ulrik
> >
> > On Thu, 28 Jul 2016 at 12:57 Stefan Kruger 
> > wrote:
> >
> >> David - many thanks for your response.
> >>
> >> What I tried to do was to turn
> >>
> >> data <- list(one = c(1, 1), three = c(3), two = c(2, 2))
> >>
> >> into
> >>
> >> result <- list(one = 2, three = 1, two = 2)
> >>
> >> that is creating a new list which has the same names as the first, but
> >> where the values are the vector lengths.
> >>
> >> I know there are many other (and better) trivial ways of achieving this
> -
> >> my aim is less the task itself, and more figuring out if this can be
> done
> >> using Reduce() in the fashion I showed in the other examples I gave.
> It's
> >> a
> >> building block of doing map-filter-reduce type pipelines that I'd like
> to
> >> understand how to do in R.
> >>
> >> Fumbling in the dark, I tried:
> >>
> >> Reduce(function(acc, item) { setNames(c(acc, length(data[item])), item
> },
> >> names(data), accumulate=TRUE)
> >>
> >> but setNames sets all the names, not adding one - and acc is still a
> >> vector, not a list.
> >>
> >> It looks like 'lambda.tools.fold()' and possibly 'purrr.reduce()' aim at
> >> doing what I'd like to do - but I've not been able to figure out quite
> >> how.
> >>
> >> Thanks
> >>
> >> Stefan
> >>
> >>
> >>
> >> On 27 July 2016 at 20:35, David Winsemius 
> wrote:
> >>
> >> >
> >> > > On Jul 27, 2016, at 8:20 AM, Stefan Kruger  >
> >> > wrote:
> >> > >
> >> > > Hi -
> >> > >
> >> > > I'm new to R.
> >> > >
> >> > > In other functional languages I'm familiar with you can often seed a
> >> call
> >> > > to reduce() with a custom accumulator. Here's an example in Elixir:
> >> > >
> >> > > map = %{"one" => [1, 1], "three" => [3], "two" => [2, 2]}
> >> > > map |> Enum.reduce(%{}, fn ({k,v}, acc) -> Map.update(acc, k,
> >> > > Enum.count(v), nil) end)
> >> > > # %{"one" => 2, "three" => 1, "two" => 2}
> >> > >
> >> > > In R-terms that's reducing a list of vectors to become a new list
> >> mapping
> >> > > the names to the vector lengths.
> >> > >
> >> > > Even in JavaScript, you can do similar things:
> >> > >
> >> > > list = { one: [1, 1], three: [3], two: [2, 2] };
> >> > > var result = Object.keys(list).reduceRight(function (acc, item) {
> >> > >  acc[item] = list[item].length;
> >> > >  return acc;
> >> > > }, {});
> >> > > // result == { two: 2, three: 1, one: 2 }
> >> > >
> >> > > In R, from what I can gather, Reduce() is restricted such that any
> >> init
> 

Re: [R] Reducing execution time

2016-07-27 Thread jeremiah rounds
Correction to my code: I created a "doc" variable because I was planning
something faster, but I never made the change.  grep needs to work on the
original source "dat" for the counting to be correct.

Fixed:

combs = structure(list(V1 = c(65L, 77L, 55L, 23L, 34L), V2 = c(23L, 34L,
34L, 77L, 65L), V3 = c(77L, 65L, 23L, 34L, 55L)), .Names = c("V1",
"V2", "V3"), class = "data.frame", row.names = c(NA, -5L))

dat = list(
c(77,65,34,23,55, 65,23,77, 44),
c(65,23,77,65,55,34, 77, 34,65, 10),
c(77,34,65),
c(55,78,56),
c(98,23,77,65,34, 65, 23, 77, 34))


words = unlist(apply(combs, 1 , function(d) paste(as.character(d),
collapse=" ")))
dat = lapply(dat, function(d) paste( as.character(d), collapse= " "))
#doc = paste(dat, collapse = " ## ") # just some arbitrary separator character that isn't in your words
counts = sapply(words, function(w) length(grep(w, dat)))
names(counts) = words
counts
cbind(combs, data.frame(N = counts))
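Two caveats with the string-matching approach, worth hedging: grep("65 23 77", ...) only hits the codes when they occur adjacently and in that order, and it could in principle also match inside a longer code (e.g. within "265 23 77"). A set-membership sketch that sidesteps both, on the same style of toy data (not the poster's full data):

```r
# Count, for each triplet, how many list elements contain all three codes,
# regardless of order or adjacency.
combs <- data.frame(V1 = c(65, 77), V2 = c(23, 34), V3 = c(77, 65))
dat <- list(c(77, 65, 34, 23, 55), c(77, 34, 65), c(55, 78, 56))
counts <- apply(combs, 1, function(trip)
  sum(vapply(dat, function(v) all(trip %in% v), logical(1))))
cbind(combs, N = counts)
```

Whether order/adjacency should matter is exactly the ambiguity Sarah flags ("It appears that order doesn't matter?"), so pick the variant that matches the actual question.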


On Wed, Jul 27, 2016 at 11:27 AM, sri vathsan  wrote:

> Hi,
>
> It is not just 79 triplets. As I said, there are 79 codes; I am making
> triplets out of those 79 codes and matching the triplets against the list.
>
> Please find the dput of the data below.
>
> > dput(head(newd,10))
> structure(list(uniq_id = c("1", "2", "3", "4", "5", "6", "7",
> "8", "9", "10"), hi = c("11,  22,  84,  85,  108,  111", "18,  84,  85,
> 87,  122,  134",
> "2,  18,  22", "18,  108,  122,  134,  176", "19,  85,  87,  100,  107",
> "79,  85,  111", "11,  88,  108", "19,  88,  96", "19,  85,  96",
> "19,  100,  103")), .Names = c("uniq_id", "hi"), row.names = c(NA,
> -10L), class = c("tbl_df", "tbl", "data.frame"))
> >
>
> I am trying to count the frequency of the triplets in the above data using
> the below code.
>
> # split column into a list
> myList <- strsplit(newd$hi, split=",")
> # get all three-way (triplet) combinations
> myCombos <- t(combn(unique(unlist(myList)), 3))
> # count the instances where the triplet is present
> myCounts <- sapply(1:nrow(myCombos), FUN=function(i) {
>   sum(sapply(myList, function(j) {
> sum(!is.na(match(c(myCombos[i,]), j)))})==3)})
> #final matrix
> final <- cbind(matrix(as.integer(myCombos), nrow(myCombos)), myCounts)
>
> I hope I made my point clear. Please let me know if I miss anything.
>
> Regards,
> Sri
>
>
>
>
> On Wed, Jul 27, 2016 at 11:19 PM, Sarah Goslee 
> wrote:
>
> > You said you had 79 triplets and 8000 records.
> >
> > When I compared 100000 triplets to 10000 records it took 86 seconds.
> >
> > So obviously there is something you're not telling us about the format
> > of your data.
> >
> > If you use dput() to provide actual examples, you will get better
> > results than if we on Rhelp have to guess. Because we tend to guess in
> > ways that make the most sense after extensive R experience, and that's
> > probably not what you have.
> >
> > Sarah
> >
> > On Wed, Jul 27, 2016 at 1:29 PM, sri vathsan 
> wrote:
> > > Hi,
> > >
> > > Thanks for the solution. But I am afraid that after running this code
> > still
> > > it takes more time. It has been an hour and still it is executing. I
> > > understand the delay because each triplet has to compare almost 9000
> > > elements.
> > >
> > > Regards,
> > > Sri
> > >
> > > On Wed, Jul 27, 2016 at 9:02 PM, Sarah Goslee 
> > > wrote:
> > >>
> > >> Hi,
> > >>
> > >> It's really a good idea to use dput() or some other reproducible way
> > >> to provide data. I had to guess as to what your data looked like.
> > >>
> > >> It appears that order doesn't matter?
> > >>
> > >> Given than, here's one approach:
> > >>
> > >> combs <- structure(list(V1 = c(65L, 77L, 55L, 23L, 34L), V2 = c(23L,
> > 34L,
> > >> 34L, 77L, 65L), V3 = c(77L, 65L, 23L, 34L, 55L)), .Names = c("V1",
> > >> "V2", "V3"), class = "data.frame", row.names = c(NA, -5L))
> > >>
> > >> dat <- list(
> > >> c(77,65,34,23,55),
> > >> c(65,23,77,65,55,34),
> > >> c(77,34,65),
> > >> c(55,78,56),
> > >> c(98,23,77,65,34))
> > >>
> > >>
> > >> sapply(seq_len(nrow(combs)), function(i)sum(sapply(dat,
> > >> function(j)all(combs[i,] %in% j))))
> > >>
> > >> On a dataset of comparable size to yours, it takes me under a minute
> > >> and a half.
> > >>
> > >> > combs <- combs[rep(1:nrow(combs), length=100000), ]
> > >> > dat <- dat[rep(1:length(dat), length=10000)]
> > >> >
> > >> > dim(combs)
> > >> [1] 100000      3
> > >> > length(dat)
> > >> [1] 10000
> > >> >
> > >> > system.time(test <- sapply(seq_len(nrow(combs)),
> > >> > function(i)sum(sapply(dat, function(j)all(combs[i,] %in% j)))))
> > >>user  system elapsed
> > >>  86.380   0.006  86.391
> > >>
> > >>
> > >>
> > >>
> > >> On Wed, Jul 27, 2016 at 10:47 AM, sri vathsan 
> > wrote:
> > >> > Hi,
> > >> >
> > >> > Apologizes for the less information.
> > >> >
> > >> > Basically, myCombos is a matrix with 3 variables; each row is a
> > >> > triplet drawn from 79 codes. There are around 3 lakh (~300,000)
> > >> > combinations as such.

Re: [R] Reducing execution time

2016-07-27 Thread jeremiah rounds
If I understood the request, this is the same programming task as counting
words in a document, counting character sequences in a string, or matching
bytes in byte arrays (though you don't want to go down that far).  You can
do something like what follows.  There are also vectorized greps in
stringr.

combs = structure(list(V1 = c(65L, 77L, 55L, 23L, 34L), V2 = c(23L, 34L,
34L, 77L, 65L), V3 = c(77L, 65L, 23L, 34L, 55L)), .Names = c("V1",
"V2", "V3"), class = "data.frame", row.names = c(NA, -5L))

dat = list(
c(77,65,34,23,55, 65,23,77, 44),
c(65,23,77,65,55,34, 77, 34,65, 10),
c(77,34,65),
c(55,78,56),
c(98,23,77,65,34, 65, 23, 77, 34))


words = unlist(apply(combs, 1 , function(d) paste(as.character(d),
collapse=" ")))
dat = lapply(dat, function(d) paste( as.character(d), collapse= " "))
doc = paste(dat, collapse = " ## ") # just some arbitrary separator character that isn't in your words
counts = sapply(words, function(w) length(grep(w, doc)))
names(counts) = words
counts
cbind(combs, data.frame(N = counts))



On Wed, Jul 27, 2016 at 8:32 AM, Sarah Goslee 
wrote:

> Hi,
>
> It's really a good idea to use dput() or some other reproducible way
> to provide data. I had to guess as to what your data looked like.
>
> It appears that order doesn't matter?
>
> Given than, here's one approach:
>
> combs <- structure(list(V1 = c(65L, 77L, 55L, 23L, 34L), V2 = c(23L, 34L,
> 34L, 77L, 65L), V3 = c(77L, 65L, 23L, 34L, 55L)), .Names = c("V1",
> "V2", "V3"), class = "data.frame", row.names = c(NA, -5L))
>
> dat <- list(
> c(77,65,34,23,55),
> c(65,23,77,65,55,34),
> c(77,34,65),
> c(55,78,56),
> c(98,23,77,65,34))
>
>
> sapply(seq_len(nrow(combs)), function(i)sum(sapply(dat,
> function(j)all(combs[i,] %in% j))))
>
> On a dataset of comparable size to yours, it takes me under a minute and a
> half.
>
> > combs <- combs[rep(1:nrow(combs), length=100000), ]
> > dat <- dat[rep(1:length(dat), length=10000)]
> >
> > dim(combs)
> [1] 100000      3
> > length(dat)
> [1] 10000
> >
> > system.time(test <- sapply(seq_len(nrow(combs)),
> > function(i)sum(sapply(dat, function(j)all(combs[i,] %in% j)))))
>user  system elapsed
>  86.380   0.006  86.391
>
>
>
>
> On Wed, Jul 27, 2016 at 10:47 AM, sri vathsan  wrote:
> > Hi,
> >
> > Apologizes for the less information.
> >
> > Basically, myCombos is a matrix with 3 variables; each row is a triplet
> > drawn from 79 codes. There are around 3 lakh (~300,000) combinations as
> > such, and it looks like below.
> >
> > V1 V2 V3
> > 65 23 77
> > 77 34 65
> > 55 34 23
> > 23 77 34
> > 34 65 55
> >
> > Each triplet will compare in a list (mylist) having 8177 elements which
> > will looks like below.
> >
> > 77,65,34,23,55
> > 65,23,77,65,55,34
> > 77,34,65
> > 55,78,56
> > 98,23,77,65,34
> >
> > Now I want to count the number of occurrences of each triplet in the
> > above list; e.g., the triplet 65 23 77 is seen 3 times in the list. So
> > my output looks like below
> >
> > V1 V2 V3 Freq
> > 65 23 77  3
> > 77 34 65  4
> > 55 34 23  2
> >
> > I hope, I made it clear this time.
> >
> >
> > On Wed, Jul 27, 2016 at 7:00 PM, Bert Gunter 
> wrote:
> >
> >> Not entirely sure I understand, but match() is already vectorized, so
> you
> >> should be able to lose the supply(). This would speed things up a lot.
> >> Please re-read ?match *carefully* .
> >>
> >> Bert
> >>
> >> On Jul 27, 2016 6:15 AM, "sri vathsan"  wrote:
> >>
> >> Hi,
> >>
> >> I created a list of 3-number combinations (myCombos, around 3 lakh
> >> combinations) and am counting the occurrences of those combinations in
> >> another list. This comparison list (myList) has around 8000 records. I
> >> am using the following code.
> >>
> >> myCounts <- sapply(1:nrow(myCombos), FUN=function(i) {
> >>   sum(sapply(myList, function(j) {
> >> sum(!is.na(match(c(myCombos[i,]), j)))})==3)})
> >>
> >> The above code takes very long time to execute and is there any other
> >> effecting method which will reduce the time.
> >> --
> >>
> >> Regards,
> >> Srivathsan.K
> >>
>




Re: [R] C/C++/Fortran Rolling Window Regressions

2016-07-21 Thread jeremiah rounds
I agree that, when appropriate, Kalman filtering/smoothing is the
higher-quality way to estimate a time-varying coefficient (given that is
what it does), and I have noted that both the R package "dlm" and the
function "StructTS" handle these problems quickly.  I am working on that in
parallel.

One thing I am unsure about with Kalman filters is how to estimate variance
parameters when the process is unusual in some way that isn't in the model
and it is not feasible to adjust the model by hand.  dlm's dlmMLE seems to
produce nonsense (not because of the author's work but because of the
assumptions).  At least with moving-window regressions, once the unusual
event is past your window, its influence is gone.  That isn't really a
question for this group; it is more about me reading more.  When I get that
"how to handle all the strange things big data throws at you" issue worked
out for Kalman filters, I will go back to them, because I certainly like
what I see when everything is right.  There is a plethora of related
topics, right?  Bayesian model averaging, GARCH models for
heteroscedasticity, etc.

Anyway... roll::roll_lm, cheers!

Thanks,
Jeremiah



On Thu, Jul 21, 2016 at 2:08 PM, Mark Leeds <marklee...@gmail.com> wrote:

> Hi Jeremiah: another possibly faster way would be to use a Kalman
> filtering framework. I forget the details, but Duncan and Horne have a
> paper which shows how a regression can be re-computed each time a new data
> point is added. I forget if they also handle taking one off of the back,
> which is what you need.
>
> The paper at the link below isn't the paper I'm talking about, but it is
> reference [1] in that paper. Note that this suggestion might not be a
> better approach than the various approaches already suggested, so I
> wouldn't go this route unless you're very interested.
>
>
> Mark
>
> https://www.le.ac.uk/users/dsgp1/COURSES/MESOMET/ECMETXT/recurse.pdf
>
>
>
>
>
>
> On Thu, Jul 21, 2016 at 4:28 PM, Gabor Grothendieck <
> ggrothendi...@gmail.com> wrote:
>
>> I would be careful about making assumptions regarding what is faster.
>> Performance tends to be nonintuitive.
>>
>> When I ran rollapply/lm, rollapply/fastLm and roll_lm on the example
>> you provided rollapply/fastLm was three times faster than roll_lm.  Of
>> course this could change with data of different dimensions but it
>> would be worthwhile to do actual benchmarks before making assumptions.
>>
>> I also noticed that roll_lm did not give the same coefficients as the
>> other two.
>>
>> set.seed(1)
>> library(zoo)
>> library(RcppArmadillo)
>> library(roll)
>> z <- zoo(matrix(rnorm(10), ncol = 2))
>> colnames(z) <- c("y", "x")
>>
>> ## rolling regression of width 4
>> library(rbenchmark)
>> benchmark(fastLm = rollapplyr(z, width = 4,
>>  function(x) coef(fastLm(cbind(1, x[, 2]), x[, 1])),
>>  by.column = FALSE),
>>lm = rollapplyr(z, width = 4,
>>  function(x) coef(lm(y ~ x, data = as.data.frame(x))),
>>  by.column = FALSE),
>>roll_lm =  roll_lm(coredata(z[, 1, drop = F]), coredata(z[, 2, drop =
>> F]), 4,
>>  center = FALSE))[1:4]
>>
>>
>>      test replications elapsed relative
>> 1  fastLm          100    0.22    1.000
>> 2      lm          100    0.72    3.273
>> 3 roll_lm          100    0.64    2.909
>>
>> On Thu, Jul 21, 2016 at 3:45 PM, jeremiah rounds
>> <roundsjerem...@gmail.com> wrote:
>> >  Thanks all.  roll::roll_lm was essentially what I wanted.   I think
>> maybe
>> > I would prefer it to have options to return a few more things, but it is
>> > the coefficients, and the remaining statistics you might want can be
>> > calculated fast enough from there.
>> >
>> >
>> > On Thu, Jul 21, 2016 at 12:36 PM, Achim Zeileis <
>> achim.zeil...@uibk.ac.at>
>> > wrote:
>> >
>> >> Jeremiah,
>> >>
>> >> for this purpose there are the "roll" and "RcppRoll" packages. Both use
>> >> Rcpp and the former also provides rolling lm models. The latter has a
>> >> generic interface that lets you define your own function.
>> >>
>> >> One thing to pay attention to, though, is the numerical reliability.
>> >> Especially on large time series with relatively short windows there is a
>> >> good chance of encountering numerically challenging situations. The QR
>> >> decomposition used by lm is fairly robust while other more straightforward
>> >> matrix multiplications may not be. This should be kept in mind when
>> >> writing your own Rcpp code for plugging it into RcppRoll.

Re: [R] C/C++/Fortran Rolling Window Regressions

2016-07-21 Thread jeremiah rounds
I appreciate the timing, so much so that I changed the code to show the
issue.  It is a problem of scale.

roll_lm probably has a heavy start-up cost but otherwise completely
out-performs those other versions at scale.  I suspect you are timing the
nearly constant-time start-up cost on small data.  I did give code to
paint a picture, but it was just cartoon code lifted from Stack Exchange.
The real problem is closer to this: 30-day rolling windows on 24 daily (by
hour) measurements for 5 years with 24+7-1 dummy predictor variables, and
finally you need to do this for 300 sets of data.

Pseudo-code is closer to what follows, and roll_lm can handle that input in
a timely manner.  You can do it with lm.fit, but you need to spend a lot of
time waiting.  The issue of accuracy needs a follow-up check; not sure why
the coefficients would differ.  Worth a check on that.

Thanks,
Jeremiah


library(rbenchmark)
N = 30*24*12*5
window = 30*24
npred = 15  #15 chosen arbitrarily...
set.seed(1)
library(zoo)
library(RcppArmadillo)
library(roll)
x = matrix(rnorm(N*(npred+1)), ncol = npred+1)
colnames(x) <- c("y",  paste0("x", 1:npred))
z <- zoo(x)


benchmark(
    roll_lm = roll_lm(coredata(z[, 1, drop = F]), coredata(z[, -1, drop = F]),
                      window, center = FALSE),
    replications = 3)

Which comes out as:
     test replications elapsed relative user.self sys.self user.child sys.child
1 roll_lm            3   6.273        1    38.312    0.654          0         0





## You aren't going to get times like that from the versions below...

benchmark(fastLm = rollapplyr(z, width = window,
 function(x) coef(fastLm(cbind(1, x[, -1]), x[, 1])),
 by.column = FALSE),
   lm = rollapplyr(z, width = window,
 function(x) coef(lm(y ~ ., data = as.data.frame(x))),
 by.column = FALSE), replications=3)
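As a hedged illustration of the recursive idea Mark Leeds raises (update the normal equations as the window slides instead of refitting from scratch), a sketch under stated assumptions: `roll_ols` is an illustrative name, `X` must already contain an intercept column, and solving X'X directly is less numerically robust than lm's QR route, exactly the caveat Achim Zeileis makes:

```r
# Rolling OLS coefficients by sliding updates of X'X and X'y.
roll_ols <- function(y, X, width) {
  n <- nrow(X); p <- ncol(X)
  out <- matrix(NA_real_, n, p)
  XtX <- crossprod(X[1:width, , drop = FALSE])
  Xty <- crossprod(X[1:width, , drop = FALSE], y[1:width])
  out[width, ] <- solve(XtX, Xty)
  if (width < n) for (i in (width + 1):n) {
    xo <- X[i - width, ]; xn <- X[i, ]        # row leaving / row entering
    XtX <- XtX + tcrossprod(xn) - tcrossprod(xo)
    Xty <- Xty + xn * y[i] - xo * y[i - width]
    out[i, ] <- solve(XtX, Xty)               # O(p^3) per step, independent of width
  }
  out
}

set.seed(1)
X <- cbind(1, rnorm(200))
y <- X %*% c(2, 3) + rnorm(200)
b <- roll_ols(y, X, width = 30)  # row i holds coefficients for rows (i-29):i
```

For well-conditioned data this tracks the per-window lm fits closely; for ill-conditioned windows, prefer the QR-based fits or roll_lm.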



On Thu, Jul 21, 2016 at 1:28 PM, Gabor Grothendieck <ggrothendi...@gmail.com
> wrote:

> I would be careful about making assumptions regarding what is faster.
> Performance tends to be nonintuitive.
>
> When I ran rollapply/lm, rollapply/fastLm and roll_lm on the example
> you provided rollapply/fastLm was three times faster than roll_lm.  Of
> course this could change with data of different dimensions but it
> would be worthwhile to do actual benchmarks before making assumptions.
>
> I also noticed that roll_lm did not give the same coefficients as the
> other two.
>
> set.seed(1)
> library(zoo)
> library(RcppArmadillo)
> library(roll)
> z <- zoo(matrix(rnorm(10), ncol = 2))
> colnames(z) <- c("y", "x")
>
> ## rolling regression of width 4
> library(rbenchmark)
> benchmark(fastLm = rollapplyr(z, width = 4,
>  function(x) coef(fastLm(cbind(1, x[, 2]), x[, 1])),
>  by.column = FALSE),
>lm = rollapplyr(z, width = 4,
>  function(x) coef(lm(y ~ x, data = as.data.frame(x))),
>  by.column = FALSE),
>roll_lm =  roll_lm(coredata(z[, 1, drop = F]), coredata(z[, 2, drop =
> F]), 4,
>  center = FALSE))[1:4]
>
>
>      test replications elapsed relative
> 1  fastLm          100    0.22    1.000
> 2      lm          100    0.72    3.273
> 3 roll_lm          100    0.64    2.909
>
> On Thu, Jul 21, 2016 at 3:45 PM, jeremiah rounds
> <roundsjerem...@gmail.com> wrote:
> >  Thanks all.  roll::roll_lm was essentially what I wanted.   I think
> maybe
> > I would prefer it to have options to return a few more things, but it is
> > the coefficients, and the remaining statistics you might want can be
> > calculated fast enough from there.
> >
> >
> > On Thu, Jul 21, 2016 at 12:36 PM, Achim Zeileis <
> achim.zeil...@uibk.ac.at>
> > wrote:
> >
> >> Jeremiah,
> >>
> >> for this purpose there are the "roll" and "RcppRoll" packages. Both use
> >> Rcpp and the former also provides rolling lm models. The latter has a
> >> generic interface that lets you define your own function.
> >>
> >> One thing to pay attention to, though, is the numerical reliability.
> >> Especially on large time series with relatively short windows there is a
> >> good chance of encountering numerically challenging situations. The QR
> >> decomposition used by lm is fairly robust while other more
> straightforward
> >> matrix multiplications may not be. This should be kept in mind when
> writing
> >> your own Rcpp code for plugging it into RcppRoll.
> >>
> >> But I haven't checked what the roll package does and how reliable that
> is...
> >>
> >> hth,
> >> Z
> >>
> >>
> >> On Thu, 21 Jul 2016, jeremiah rounds wrote:
> >>
> >> Hi,
> >>>
> >>> A not unusual task is performing a multiple

Re: [R] C/C++/Fortran Rolling Window Regressions

2016-07-21 Thread jeremiah rounds
Thanks all.  roll::roll_lm was essentially what I wanted.  I think maybe I
would prefer it to have options to return a few more things, but it gives
the coefficients, and the remaining statistics you might want can be
calculated quickly enough from there.
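For what it is worth, the "calculated from there" part can look like the
sketch below: roll_lm returns $coefficients (and $r.squared), and, for
example, per-window residuals can be rebuilt from the coefficients alone.
Made-up data, not a definitive recipe:

```r
library(roll)

set.seed(1)
n <- 100; w <- 30
x <- matrix(rnorm(2 * n), ncol = 2)
y <- x %*% c(1, -1) + rnorm(n)      # n x 1 response matrix
fit <- roll_lm(x, y, width = w)     # $coefficients, $r.squared

## residuals for the window ending at row i, rebuilt from the coefficients
resid_at <- function(i) {
  rows <- (i - w + 1):i
  drop(y[rows] - cbind(1, x[rows, ]) %*% fit$coefficients[i, ])
}

sd(resid_at(n))   # a rolling residual SD for the final window
```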


On Thu, Jul 21, 2016 at 12:36 PM, Achim Zeileis <achim.zeil...@uibk.ac.at>
wrote:

> Jeremiah,
>
> for this purpose there are the "roll" and "RcppRoll" packages. Both use
> Rcpp and the former also provides rolling lm models. The latter has a
> generic interface that lets you define your own function.
>
> One thing to pay attention to, though, is the numerical reliability.
> Especially on large time series with relatively short windows there is a
> good chance of encountering numerically challenging situations. The QR
> decomposition used by lm is fairly robust while other more straightforward
> matrix multiplications may not be. This should be kept in mind when writing
> your own Rcpp code for plugging it into RcppRoll.
>
> But I haven't checked what the roll package does and how reliable that is...
>
> hth,
> Z
>
>
> On Thu, 21 Jul 2016, jeremiah rounds wrote:
>
> Hi,
>>
> >> A not unusual task is performing a multiple regression in a rolling
> >> window on a time-series.  A standard piece of advice for doing this in R
> >> is something like the code that follows at the end of the email.  I am
> >> currently using an "embed" variant of that code, and that piece of
> >> advice is out there too.
> >>
> >> But, it occurs to me that for such an easily specified matrix operation,
> >> standard R code is really slow: rollapply returns to the R interpreter
> >> at each window step for a new lm.  At its heart, all lm computes is
> >> (X^t X)^(-1) X^t y, and if you think about doing that with Rcpp in a
> >> rolling window, you are just incrementing a counter, peeling off rows
> >> (or columns) of X and y of a particular window size, and following that
> >> up with some matrix multiplication in a loop.  The pseudo-code for that
> >> Rcpp practically writes itself, and you might want a wrapper something
> >> like: rolling_lm(y = y, x = x, width = 4).
> >>
> >> My question is this: have any of the thousands of R packages out there
> >> published anything like that, rolling-window multiple regressions that
> >> stay in C/C++ until the rolling window completes?  No sense in writing
> >> it if it exists.
>>
>>
>> Thanks,
>> Jeremiah
>>
>> Standard (slow) advice for "rolling window regression" follows:
>>
>>
>> set.seed(1)
>> z <- zoo(matrix(rnorm(10), ncol = 2))
>> colnames(z) <- c("y", "x")
>>
>> ## rolling regression of width 4
>> rollapply(z, width = 4,
>>   function(x) coef(lm(y ~ x, data = as.data.frame(x))),
>>   by.column = FALSE, align = "right")
>>
>> ## result is identical to
>> coef(lm(y ~ x, data = z[1:4,]))
>> coef(lm(y ~ x, data = z[2:5,]))
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>



[R] C/C++/Fortran Rolling Window Regressions

2016-07-21 Thread jeremiah rounds
Hi,

A not unusual task is performing a multiple regression in a rolling window
on a time-series.  A standard piece of advice for doing this in R is
something like the code that follows at the end of the email.  I am
currently using an "embed" variant of that code, and that piece of advice
is out there too.

But, it occurs to me that for such an easily specified matrix operation,
standard R code is really slow: rollapply returns to the R interpreter at
each window step for a new lm.  At its heart, all lm computes is
(X^t X)^(-1) X^t y, and if you think about doing that with Rcpp in a
rolling window, you are just incrementing a counter, peeling off rows (or
columns) of X and y of a particular window size, and following that up
with some matrix multiplication in a loop.  The pseudo-code for that Rcpp
practically writes itself, and you might want a wrapper something like:
rolling_lm(y = y, x = x, width = 4).

My question is this: have any of the thousands of R packages out there
published anything like that, rolling-window multiple regressions that
stay in C/C++ until the rolling window completes?  No sense in writing it
if it exists.
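The incremental scheme described above can even be prototyped in plain R
before dropping to Rcpp: keep running X'X and X'y cross-products and apply
rank-one updates as the window slides.  rolling_lm below is the hypothetical
wrapper named in this post, not a function from any package; note also the
caveat elsewhere in this thread that the normal equations are less
numerically robust than lm's QR decomposition.

```r
## Hypothetical rolling_lm(): rolling OLS via rank-one updates of X'X, X'y.
rolling_lm <- function(y, x, width) {
  x <- cbind(1, x)                      # prepend intercept column
  n <- nrow(x); p <- ncol(x)
  out <- matrix(NA_real_, n, p)         # row i = coefs for window ending at i
  xtx <- crossprod(x[1:width, ])        # X'X of the first window
  xty <- crossprod(x[1:width, ], y[1:width])
  out[width, ] <- solve(xtx, xty)
  for (i in (width + 1):n) {
    new <- x[i, ]; old <- x[i - width, ]
    xtx <- xtx + tcrossprod(new) - tcrossprod(old)   # slide the window
    xty <- xty + new * y[i] - old * y[i - width]
    out[i, ] <- solve(xtx, xty)
  }
  out
}

set.seed(1)
xm <- matrix(rnorm(60), ncol = 2)
yv <- rnorm(30)
b <- rolling_lm(yv, xm, width = 10)
## b[30, ] matches coef(lm(yv[21:30] ~ xm[21:30, ]))
```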


Thanks,
Jeremiah

Standard (slow) advice for "rolling window regression" follows:


set.seed(1)
z <- zoo(matrix(rnorm(10), ncol = 2))
colnames(z) <- c("y", "x")

## rolling regression of width 4
rollapply(z, width = 4,
   function(x) coef(lm(y ~ x, data = as.data.frame(x))),
   by.column = FALSE, align = "right")

## result is identical to
coef(lm(y ~ x, data = z[1:4,]))
coef(lm(y ~ x, data = z[2:5,]))



Re: [R] if + is.na

2009-06-14 Thread Jeremiah Rounds

Your error message appears because "if" wants a single value and you are
giving it a vector.

Typically you use the functions all or any to correct this error (look them
up: ?all, ?any), e.g. if(any(is.na(...))).  But in this case, to accomplish
the task you're after, I don't think you want an "if" at all.  I am not
going to give you precise code, because I wasn't able to decipher exactly
what you were trying to do, but something like:

 

b[is.na(a)] = 43

 

might be helpful.  This line puts a 43 into b at each position where the
corresponding entry of a is NA.
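Putting both suggestions together with the vectors from the question:

```r
a <- c(1, NA, 3, 3, 3)
b <- c(0, 0, 0, 0, 0)

if (any(is.na(a))) message("a contains at least one NA")  # any() collapses the vector to one TRUE/FALSE

b[is.na(a)] <- 43   # vectorized replacement, no if needed
b   # 0 43 0 0 0
```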

 

Good luck!

 


 
 Date: Sun, 14 Jun 2009 12:48:58 -0700
 From: gregori...@gmail.com
 To: r-help@r-project.org
 Subject: [R] if + is.na
 
 
 Hello!
 I want to use the function is.na()
 
 I have two vectors:
  a=c(1,NA,3,3,3)
  b=c(0,0,0,0,0)
 and when I use is.na function it's ok:
  is.na(a)
 [1] FALSE TRUE FALSE FALSE FALSE
 
 but I would create sth like this:
 
 for i in 1:length(a){
 if (wsp[i] == is.na(a)) {b=43}
 }
 or like this 
 
 if(is.na(a)) b=3 else a
 [1] 1 NA 3 3 3
 
 but I always get an error:
 the condition has length  1 and only the first element will be used
 
 Could you help me how I may avoid this problem and use function is.na inside
 function if - else
 Please
 
 
 -- 
 View this message in context: 
 http://www.nabble.com/if-%2B-is.na-tp24025136p24025136.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] help to speed up loops in r

2009-06-09 Thread Jeremiah Rounds

Other than reengineering the approach, one thing that helps is: don't index
rows of data frames in loops... ever.  It is actually faster to convert to a
matrix, do the operations, and then convert back to a data frame if you
have to.

 

As an example I have your code in a function:

 

foo = function(averagedreplicates, zz){
  iindex = 1:(dim(averagedreplicates)[2])
  for (i in iindex) {
    cat(i,'\n')  # calculates means
    #Sample A
    averagedreplicates[i,2] <- (zz[i,2] + zz[i,3])/2
    averagedreplicates[i,3] <- (zz[i,4] + zz[i,5])/2
    averagedreplicates[i,4] <- (zz[i,6] + zz[i,7])/2
    averagedreplicates[i,5] <- (zz[i,8] + zz[i,9])/2
    averagedreplicates[i,6] <- (zz[i,10] + zz[i,11])/2
    #Sample B
    averagedreplicates[i,7] <- (zz[i,12] + zz[i,13])/2
    averagedreplicates[i,8] <- (zz[i,14] + zz[i,15])/2
    averagedreplicates[i,9] <- (zz[i,16] + zz[i,17])/2
    averagedreplicates[i,10] <- (zz[i,18] + zz[i,19])/2
    averagedreplicates[i,11] <- (zz[i,20] + zz[i,21])/2
    #Sample C
    averagedreplicates[i,12] <- (zz[i,22] + zz[i,23])/2
    averagedreplicates[i,13] <- (zz[i,24] + zz[i,25])/2
    averagedreplicates[i,14] <- (zz[i,26] + zz[i,27])/2
    averagedreplicates[i,15] <- (zz[i,28] + zz[i,29])/2
    averagedreplicates[i,16] <- (zz[i,30] + zz[i,31])/2
    #Sample D
    averagedreplicates[i,17] <- (zz[i,32] + zz[i,33])/2
    averagedreplicates[i,18] <- (zz[i,34] + zz[i,35])/2
    averagedreplicates[i,19] <- (zz[i,36] + zz[i,37])/2
    averagedreplicates[i,20] <- (zz[i,38] + zz[i,39])/2
    averagedreplicates[i,21] <- (zz[i,40] + zz[i,41])/2
  }
  return(averagedreplicates)
}

 

I then make matrix and data.frame versions of things similar in size to what 
you are working with:

 

zz.as.m = matrix(runif(95000*41),95000,41)
zz.as.df = as.data.frame(zz.as.m)
ar.as.m = matrix(0,95000,21)
ar.as.df = as.data.frame(ar.as.m)


 

And we can time the matrix versions:

 

start = Sys.time()
x = foo(ar.as.m,zz.as.m)
stop = Sys.time()
stop-start  # .06 seconds for me

 

 

And on the data frame versions?

 

#using the data frame versions
start = Sys.time()
x = foo(ar.as.df,zz.as.df)
stop = Sys.time()
stop-start  # 31 seconds for me

 

 

For me it took 516 times as long to do the same work with data frames as it
would have taken with matrices.

 

People say "never use loops in R," and I wish they wouldn't say it like
that, because it distracts from the fact of the matter, which is that
sometimes looping in R is quite reasonably fast, and sometimes, like when
you are indexing rows of a data frame, it is horrible.  These are the little
things I learned combing through my Masters project for speed.

 

The only caveat to this advice of always doing this sort of work in matrices
is that repairing factors afterwards can be a little time-consuming
(developer time).  But in terms of code run time, it is absolutely essential
to use the right data structure for the job.
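Beyond switching to matrices, the loop here can be removed entirely:
averaging adjacent replicate columns is a single vectorized expression.  A
sketch on made-up data shaped like the poster's (I am assuming column 1 is
an identifier carried through unchanged):

```r
set.seed(1)
zz <- matrix(runif(1000 * 41), nrow = 1000, ncol = 41)

first <- seq(2, 40, by = 2)   # first column of each replicate pair: 2, 4, ..., 40
averaged <- cbind(zz[, 1], (zz[, first] + zz[, first + 1]) / 2)
dim(averaged)   # 1000 21
```

One expression replaces the whole i-loop, and it stays in compiled code.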

 

Hope this is of assistance,

Jeremiah Rounds

  
 
 Date: Mon, 8 Jun 2009 15:45:40 +
 From: amitrh...@yahoo.co.uk
 To: r-help@r-project.org
 Subject: [R] help to speed up loops in r
 
 
 Hi
 i am using a script which involves the following loop. It attempts to reduce 
 a data frame(zz) of 95000 * 41 down to a data frame (averagedreplicates) of 
 95000 * 21 by averaging the replicate values as you can see in the script 
 below. This script however is very slow (2days). Any suggestions to speed it 
 up. 
 
 NB I have also tried using rowMeans rather than adding the 2 values and 
 dividing by 2. (same problem)
 
 
 
 
 #SCRIPT STARTS
 for (i in 1:length(averagedreplicates[,1]))
 #for (i in 1:dim(averagedreplicates)[1])
 {
 cat(i,'\n')
 
 
 #calculates Meanss
 #Sample A
 averagedreplicates[i,2] <- (zz[i,2] + zz[i,3])/2
 averagedreplicates[i,3] <- (zz[i,4] + zz[i,5])/2
 averagedreplicates[i,4] <- (zz[i,6] + zz[i,7])/2
 averagedreplicates[i,5] <- (zz[i,8] + zz[i,9])/2
 averagedreplicates[i,6] <- (zz[i,10] + zz[i,11])/2
 
 #Sample B
 averagedreplicates[i,7] <- (zz[i,12] + zz[i,13])/2
 averagedreplicates[i,8] <- (zz[i,14] + zz[i,15])/2
 averagedreplicates[i,9] <- (zz[i,16] + zz[i,17])/2
 averagedreplicates[i,10] <- (zz[i,18] + zz[i,19])/2
 averagedreplicates[i,11] <- (zz[i,20] + zz[i,21])/2
 
 #Sample C
 averagedreplicates[i,12] <- (zz[i,22] + zz[i,23])/2
 averagedreplicates[i,13] <- (zz[i,24] + zz[i,25])/2
 averagedreplicates[i,14] <- (zz[i,26] + zz[i,27])/2
 averagedreplicates[i,15] <- (zz[i,28] + zz[i,29])/2
 averagedreplicates[i,16] <- (zz[i,30] + zz[i,31])/2
 
 #Sample D
 averagedreplicates[i,17] <- (zz[i,32] + zz[i,33])/2
 averagedreplicates[i,18] <- (zz[i,34] + zz[i,35])/2
 averagedreplicates[i,19] <- (zz[i,36] + zz[i,37])/2
 averagedreplicates[i,20] <- (zz[i,38] + zz[i,39])/2
 averagedreplicates[i,21] <- (zz[i,40] + zz[i,41])/2
 }
 
 
 

Re: [R] how to randomly eliminate half the entries in a vector?

2009-02-17 Thread Jeremiah Rounds

Here is what I got for script through your third question:

 

set.seed(1)

x1 = rbinom(200,1,.5)

x2 = rbinom(200,1,.5)

differ = x1 != x2

differ.indexes = (1:length(x1))[differ == TRUE]

#you were unclear if you want to round up or round down on odd index of 
differ.indexes

n = floor( length(differ.indexes)/2)

#sampling without replacement

random.indexes = sample(differ.indexes,n )

swapping = x1[random.indexes] #with 1s and 0s you can do this without this 
variable.  

x1[random.indexes] = x2[random.indexes] 

x2[random.indexes] = swapping

 

 

Good luck,

Jeremiah
 
 Date: Tue, 17 Feb 2009 20:17:51 -0500
 From: esmail...@gmail.com
 To: r-help@r-project.org
 Subject: [R] how to randomly eliminate half the entries in a vector?
 
 Hello all,
 
 I need some help with a nice R-idiomatic and efficient solution to a
 small problem.
 
 Essentially, I am trying to eliminate randomly half of the entries in
 a vector that contains index values into some other vectors.
 
 
 More details:
 
 I am working with two strings/vectors of 0s and 1s. These will contain
 about 200 elements (always the same number for both)
 
 I want to:
 
 1. determines the locations of where the two strings differ
 
 -- easy using xor(s1, s2)
 
 2. *randomly* selects *half* of those positions
 
 -- not sure how to do this. I suppose the result would be
 a list of index positions of size sum(xor(s1, s2))/2
 
 
 3. exchange (flip) the bits in those random positions for both strings
 
 -- I have something that seems to do that, but it doesn't look
 slick and I wonder how efficient it is.
 
 Mostly I need help for #2, but will happily accept suggestions for #3,
 or for that matter anything that looks odd.
 
 Below my partial solution .. the HUX function is what I am trying
 to finish if someone can point me in the right direction.
 
 Thanks
 Esmail
 --
 
 
 
 rm(list=ls())
 
 
 # create a binary vector of size len
 #
 create_bin_Chromosome <- function(len)
 {
 sample(0:1, len, replace=T)
 }
 
 
 
 
 # HUX - half uniform crossover
 #
 # 1. determines the locations of where the two strings
 # differ (easy xor)
 #
 # 2. randomly selects half of those positions
 #
 # 3. exchanges (flips) the bits in those positions for
 # both
 #
 HUX <- function(b1, b2)
 {
 # 1. find differing bits
 r=xor(b1, b2)
 
 # positions where bits differ
 different = which(r==TRUE)
 
 cat("\nhrp: ", different, "\n")
 # 2. ??? how to do this best so that each time
 # a different half subset is selected? I.e.,
 # sum(r)/2 positions.
 
 
 # 3. this flips *all* positions, should really only flip
 # half of them (randomly selected half)
 new_b1 = b1
 new_b2 = b2
 
 for(i in different) # should contain half the entries (randomly)
 {
 new_b1[i] = b2[i]
 new_b2[i] = b1[i]
 }
 
 result <- matrix(c(new_b1, new_b2), 2, LEN, byrow=T)
 result
 }
 
 
 
 LEN = 5
 b1=create_bin_Chromosome(LEN)
 b2=create_bin_Chromosome(LEN)
 
 cat(b1, "\n")
 cat(b2, "\n")
 
 idx=HUX(b1, b2)
 cat("\n\n")
 cat(idx[1,], "\n")
 cat(idx[2,], "\n")
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] How to paste graph from R in Latex?

2008-05-17 Thread Jeremiah Rounds


For school work I use png.  PNG files are more efficient size/quality-wise
than ps, and they also lend themselves to more generic application/viewing
than ps.





In R this typically takes the form of:


setwd(...) #set working directory before starting any work typically at the top 
of scripts


... # stuff





png(filename,height=800, width=800)


   #graphical commands


dev.off()





One of the great things about the png command is the size formatting.  One
great trick is to increase the size of the plotting area, plot, and then
shrink the graphic down in LaTeX.  There are a lot of graphics where this
makes everything look better with very little work, because everything is
drawn at a finer resolution (in some lossy sense).





In your LaTeX you will want to use the epsfig package, because under Windows
the png bounding-box info isn't what the default LaTeX packages expect, and
epsfig can fix that easily.





Typically this has the form


\usepackage{epsfig}


\begin{document}


  \begin{figure}[!htbp]


   \center


   \caption{Jittered pairs plot of severity predictors colored by red is 
severity 1.}


   \label{bcpairs}




   \epsfig{file=bcpairs.png, bb= 0 0 800 800,width=5.25in, clip=}



   \end{figure}


\end{document}


The key line is \epsfig.  bb= is the bounding box, which corresponds to
whatever you had in the png command in R.  width= is where you resize it:
you supply the width, and the package rescales the graphic to it.








There are two tricks I picked up in my travels using this for homework.
Well, there are three, but I don't have an example of the third handy
(side-by-side subfigures).





One is clipping a figure to get rid of a piece of it.  That is as simple as
changing the bb values to bound only the parts you want.





The other is shifting the graphic into the left margin a little bit.  This
is handy for using the entire page on graphics that just aren't easy to make
any smaller.





That is done like so:


\begin{figure}[tbp]






  \caption{Wine data pairs plots colored by cultivar.}


  \label{winepairs}



  \begin{minipage}{9in}


 \hspace{-.75in}


  \epsfig{file=ex2pairs.png, bb= 0 0 1200 1200,width=7in, clip=}


   \end{minipage}


\end{figure}





The key there is that you start a minipage and then shift it to the left.
Note that the corresponding command in R was:


png("ex2pairs.png", height=1200, width=1200)   # for a large scatterplot





A large scatterplot is an example of something that often looks better painted 
at a higher resolution, saved, and then shrunk down.





-





Someone mentioned Sweave.  Sweave's value really depends on who you are and
what you're doing.  Its work cycle is not appropriate for students or anyone
who needs rapid-cycle prototyping, in my opinion.  Its great flaw is that it
does not work well with "change a little something, look at the results in
R; change a little something in LaTeX, look at the results in the DVI"
repeated over and over again.  The reason is that it has to repeat far too
much work in each cycle, often repeating long calculations.





With this system, you open a script in Tinn-R and run it.  You have your
TeXmaker open, and you compile your document.  You don't like the graphic,
so you make your change to the plotting code in your script, highlight it,
and send it to R.  You open the result in a graphics viewer via
double-click, or you simply compile your LaTeX document again and check it.





Sweave is not at all friendly to that check-your-work-as-you-go mentality.
It really needs a graphical interface that lets you indicate what not to
redo, and that just redoes things incrementally.















 Date: Fri, 16 May 2008 18:24:00 -0700
 From: [EMAIL PROTECTED]
 To: R-help@r-project.org
 Subject: [R] How to paste graph from R in Latex?

 Dear R-expert,
 Is it possible to save graph from R into Latex document? I can see save as 
 metafile , PNG, pdf etc, but I'm not sure which one to use.
 Thank  you so much for your help.




 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] heatmap on pre-established hclust output?

2008-05-16 Thread Jeremiah Rounds

> To: [EMAIL PROTECTED]
> From: [EMAIL PROTECTED]
> Date: Fri, 16 May 2008 17:55:26 +0200
> Subject: [R] heatmap on pre-established hclust output?
>
> Hi,
>
> Can someone please guide me towards how to produce heatmap output from the
> output of an hclust run prior to the actual heatmap call? I have some
> rather lengthy clustering going on, and tweaking the visual output with
> heatmap recalculating the clustering every time is not feasible.
>
> Thanks,
> Joh
 
I can't say that I have actually tackled this, but I have some experience
with the functions you mentioned.  heatmap takes an hclustfun function
parameter, so you can supply a custom clustering function, and I don't
believe there is a rule that says you have to do actual work in that
function call.  Look at just returning the results of your more complicated
clustering from that call, without actually redoing the calculations.
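A minimal sketch of that suggestion on made-up data (note that heatmap
applies hclustfun to both the row and column distances, so in general you
may want two cached results rather than the single hc reused here):

```r
x  <- matrix(rnorm(100), nrow = 10, ncol = 10)
hc <- hclust(dist(x))                  # the expensive step, done once

## hand heatmap a function that ignores its argument and returns the cache
heatmap(x, hclustfun = function(d) hc)
```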
 
Jeremiah Rounds
Graduate Student
Utah State University
 


Re: [R] strip white in character strings

2008-05-14 Thread Jeremiah Rounds




 Date: Wed, 14 May 2008 12:06:39 -0400
 From: [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Subject: [R] strip white in character strings

 Dear all,

 I have several datasets and I want to generate pdf plots from them.
 I also want to generate automatically the names of the files. They are
 country-specific and the element mycurrentdata[1,1] contains this
 information.

 So what I do is something like this:
 pdf(file=paste(mycurrentdata[1,1], .pdf, sep=), width=...etc)

 The only problem I have is that some of the country names contain white
 space (e.g., United Kingdom). This is no problem for generating the
 pdf plots but it may become problematic during further processing (e.g.
 incl. the plots in LaTeX documents).

 Is there an easy function to strip white space out of character strings
 (similar to the strip.white=TRUE option in read.table/scan)?


How about:


> a <- " United Kingdom "
> paste(unlist(strsplit(a, split = " ")), collapse = "")
[1] "UnitedKingdom"

Note: even better might be to use a generic trimming function after the
split, to catch any leftover non-space whitespace in each piece.
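The generic trim can also be collapsed into a single whitespace
character-class regex, which catches tabs and the like in one pass:

```r
a <- " United Kingdom\t"
gsub("[[:space:]]+", "", a)   # "UnitedKingdom"
```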











 I'd appreciate any kind of help and I hope I did not miss anything
 completely obvious.

 Thanks,
 Roland

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] Newbie question about vector matrix multiplication

2008-05-14 Thread Jeremiah Rounds




 Date: Wed, 14 May 2008 15:18:32 -0400
 From: [EMAIL PROTECTED]
 To: r-help@r-project.org
 Subject: [R] Newbie question about vector matrix multiplication

 Hello All,

 I have a covariance matrix, generated by read.table, and cov:

 co <- cov(read.table("c:/r.x"))

 X Y Z

 X 0.0012517684 0.0002765438 0.0007887114

 Y 0.0002765438 0.0002570286 0.0002117336

 Z 0.0007887114 0.0002117336 0.0009168750



 And a weight vector generated by

 w <- read.table("c:/r.weights")

 X Y Z

 1 0.5818416 0.2158531 0.2023053



 I want to compute the product of the matrix and vectors termwise to
 generate a 3x3 matrix, where m[i,j]=w[i]*co[i,j]*w[j].

 0.000423773 7.47216E-08 4.41255E-08

 7.47216E-08 1.96566E-11 4.29229E-11

 4.41255E-08 4.29229E-11 4.11045E-11


First off, your example matrix does not seem to match the equation you wrote
down.  For example, m[1,3] should be m[1,3] = 0.5818416 * 0.0007887114 *
0.2023053 = 0.0000928.  I apologize if that reflects something incorrect on
my part.  However, if I am correct, then I believe what you seek is the line
below:


m = w %*% t(w)*co

To get there, by the way, picture moving the two weights together, and then
picture multiplying two equal-sized matrices together element by element.
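A quick numeric check of the identity m[i,j] = w[i] * co[i,j] * w[j], on
small made-up numbers (not the poster's data):

```r
co <- matrix(c(4, 1, 1, 9), nrow = 2)
w  <- c(0.5, 2)

m <- (w %*% t(w)) * co   # outer(w, w) * co gives the same matrix
m[1, 2]                  # 0.5 * 1 * 2 = 1
```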







 Is this possible without writing explicit loops?

 Thank you,

 Dan Stanger

 Eaton Vance Management
 200 State Street
 Boston, MA 02109
 617 598 8261




 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] array dimension changes with assignment

2008-05-13 Thread Jeremiah Rounds

Why does the assignment of a 3178x93 object to another 3178x93 object remove
the dimension attribute?

> GT <- array(dim = c(6, nrow(InData), ncol(InSNPs)))
> dim(GT)
[1]    6 3178   93
> SNP1 <- InSNPs[InData[, "C1"], ]
> dim(SNP1)
[1] 3178   93
> SNP2 <- InSNPs[InData[, "C2"], ]
> dim(SNP2)
[1] 3178   93
> dim(pmin(SNP1, SNP2))
[1] 3178   93
> GT[1,,] <- pmin(SNP1, SNP2)
> dim(GT)
NULL    # why??
> GT[2,,] <- pmax(SNP1, SNP2)
Error in GT[2, , ] <- pmax(SNP1, SNP2) : incorrect number of subscripts
---

My understanding is that an array is just a vector with a dimension
attribute, so first note that losing the dim attribute is not a great loss;
it does not represent an inefficiency.

But consider this code:
> GT <- array(dim = c(6, 3178, 93))
> dim(GT)
[1]    6 3178   93
> SNP1 <- as.array(matrix(0, nrow = 3178, ncol = 93))
> dim(SNP1)
[1] 3178   93
> GT[1,,] <- SNP1
> dim(GT)
[1]    6 3178   93


Here, what you wanted to happen happened just fine.  So the question you
might ask yourself is: what is different?  That leads to asking what class
the SNP1 object is.  If you can coerce it into an array, you can probably
avoid the issue.

Jeremiah Rounds
Graduate Student
Utah State University
_
Get Free (PRODUCT) RED™  Emoticons, Winks and Display Pics.
