Re: [R] Data handling

2013-10-16 Thread Raoni Rodrigues
Thanks A.K. and Jim!

Thanks very much, both solutions work fine! But I can't figure out what the
problem with my code was. Is doing it step by step not recommended? Is that
the only difference?

Thanks again for help!

Raoni


-- 
Raoni Rosa Rodrigues
Research Associate of Fish Transposition Center CTPeixes
Universidade Federal de Minas Gerais - UFMG
Brasil
rodrigues.ra...@gmail.com


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Data handling

2013-10-15 Thread Raoni Rodrigues
Hello all,

I'm having a problem with data handling. My input data is (dput after
the signature):

    Date     Time Fraction
06/19/13 22:15:39   0.3205
06/19/13 22:15:44   0.3205
06/19/13 22:15:49   0.3205
06/19/13 22:15:54   0.3205
06/19/13 22:15:59   0.3205
06/19/13 22:16:09   0.3205

Date is in month/day/year format, Time is in HH:MM:SS, and Fraction holds the
fractional seconds. I need a vector in the format year-month-day
hh:mm:ss.. or, as a format string, format = "%F %H:%M:%OS4", of class POSIXct.

I did the conversion step by step to make sure that nothing was lost
along the way:

> options (digits.sec = 4)
> getOption ("digits.sec")
[1] 4
> teste$Date1 = as.Date (teste$Date, format = "%m/%d/%y")
> class (teste$Date1)
[1] "Date"
> teste$Fraction = sub ("0.", "", teste$Fraction)
> teste$TimeC = paste (teste$Time, teste$Fraction, sep = ".")
> teste$TimeCC = paste (teste$Date1, teste$TimeC)

> head (teste)
      Date     Time Fraction      Date1        TimeC                   TimeCC
1 06/19/13 22:15:39     .325 2013-06-19 22:15:39.325 2013-06-19 22:15:39.3205
2 06/19/13 22:15:44     .325 2013-06-19 22:15:44.325 2013-06-19 22:15:44.3205
3 06/19/13 22:15:49     .325 2013-06-19 22:15:49.325 2013-06-19 22:15:49.3205
4 06/19/13 22:15:54     .325 2013-06-19 22:15:54.325 2013-06-19 22:15:54.3205
5 06/19/13 22:15:59     .325 2013-06-19 22:15:59.325 2013-06-19 22:15:59.3205
6 06/19/13 22:16:09     .325 2013-06-19 22:16:09.325 2013-06-19 22:16:09.3205

So far so good. The problem comes when I try to convert to the POSIXct class.
If I use just:

teste$TimeCC = format (teste$TimeCC, format = "%F %H:%M:%OS4")
teste$TimeCC = as.POSIXct (teste$TimeCC)

I lose the fractional seconds. If I use:

teste$TimeCC = as.POSIXct(strptime (teste$TimeCC, format = "%F %H:%M:%OS4"))

I lose all the information and get just NA.

Thanks in advance,

-- 
Raoni Rosa Rodrigues
Research Associate of Fish Transposition Center CTPeixes
Universidade Federal de Minas Gerais - UFMG
Brasil
rodrigues.ra...@gmail.com

dput of input data

structure(list(Date = c("06/19/13", "06/19/13", "06/19/13", "06/19/13",
"06/19/13", "06/19/13"), Time = c("22:15:39", "22:15:44", "22:15:49",
"22:15:54", "22:15:59", "22:16:09"), Fraction = c(0.3205, 0.3205,
0.3205, 0.3205, 0.3205, 0.3205)), .Names = c("Date",
"Time", "Fraction"), row.names = c(NA, 6L), class = "data.frame")



Re: [R] Data handling

2013-10-15 Thread arun
Try:
op <- options(digits.secs = 4)

TimeCC <- as.POSIXct(paste0(paste(teste[, 1], teste[, 2]),
                            sub("^0", "", teste[, 3])),
                     format = "%m/%d/%y %H:%M:%OS")
options(op)  # reset

A.K.
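
As to why the step-by-step version in the question fails while a single parse works, here is a minimal sketch. It assumes a one-row stand-in for `teste` and a fixed `tz = "UTC"`, neither of which comes from the thread. Two things seem to be at play. First, the option R consults is spelled `digits.secs`; the question set `digits.sec`, which R stores but never uses, so the fraction was probably only hidden at print time. Second, per `?strptime`, the `%OSn` form is for output, while plain `%OS` reads fractional seconds on input.

```r
# One row of the thread's data, rebuilt by hand (illustrative stand-in).
teste <- data.frame(Date = "06/19/13", Time = "22:15:39", Fraction = 0.3205,
                    stringsAsFactors = FALSE)

# Build "06/19/13 22:15:39.3205": drop the leading zero of the fraction
# and append what is left to the time.
stamp <- paste0(teste$Date, " ", teste$Time, sub("^0", "", teste$Fraction))

# Input with plain %OS: fractional seconds are read.
ok <- as.POSIXct(stamp, format = "%m/%d/%y %H:%M:%OS", tz = "UTC")

# Input with %OS4: the digit is not part of the input syntax, so parsing fails.
bad <- as.POSIXct(strptime(stamp, format = "%m/%d/%y %H:%M:%OS4", tz = "UTC"))

# Output: request the digits explicitly (or set options(digits.secs = 4)).
shown <- format(ok, "%Y-%m-%d %H:%M:%OS4")

print(ok)
print(bad)
print(shown)
```

The parsed value keeps its fraction whether or not it is printed; `as.numeric(ok) %% 1` is a quick way to check.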




Re: [R] Data handling

2013-10-15 Thread jim holtman
Try this; your time is converted back to a character string if you
want to show the fractional part.

> x <- read.table(text = "Date Time Fraction
+  06/19/13 22:15:39   0.3205
+  06/19/13 22:15:44   0.3205
+  06/19/13 22:15:49   0.3205
+  06/19/13 22:15:54   0.3205
+  06/19/13 22:15:59   0.3205
+  06/19/13 22:16:09   0.3205", as.is = TRUE, header = TRUE)
> x$newTime <- as.POSIXct(
+ paste0(x$Date, ' ', x$Time , '.', substring(x$Fraction, 3))
+ , format = "%m/%d/%y %H:%M:%OS"
+ )
> x$formatted <- format(x$newTime, format = "%m/%d/%y %H:%M:%OS4")
> x
      Date     Time Fraction             newTime              formatted
1 06/19/13 22:15:39   0.3205 2013-06-19 22:15:39 06/19/13 22:15:39.3204
2 06/19/13 22:15:44   0.3205 2013-06-19 22:15:44 06/19/13 22:15:44.3204
3 06/19/13 22:15:49   0.3205 2013-06-19 22:15:49 06/19/13 22:15:49.3204
4 06/19/13 22:15:54   0.3205 2013-06-19 22:15:54 06/19/13 22:15:54.3204
5 06/19/13 22:15:59   0.3205 2013-06-19 22:15:59 06/19/13 22:15:59.3204
6 06/19/13 22:16:09   0.3205 2013-06-19 22:16:09 06/19/13 22:16:09.3204


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.




Re: [R] Data handling

2013-10-15 Thread jim holtman
FYI.

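The fractional part is printed as '3204' instead of '3205' because POSIXct
stores the time as a double. For times around now that gives accuracy to
roughly the microsecond, the stored value falls just below .3205, and the
%OS4 output format truncates rather than rounds.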
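
A small sketch of where the '3204' comes from, assuming `tz = "UTC"` and a single timestamp taken from the thread. The half-unit nudge at the end is one common workaround and was not proposed in this thread.

```r
x <- as.POSIXct("06/19/13 22:15:39.3205",
                format = "%m/%d/%y %H:%M:%OS", tz = "UTC")

# The stored value is seconds since 1970-01-01 as a double.
secs <- as.numeric(x)
frac <- secs %% 1
print(sprintf("%.10f", frac))   # nearest representable value to .3205

# Truncating to 4 decimals, which is what %OS4 is documented to do on output.
truncated <- floor(frac * 1e4)

# Adding half of the last displayed unit first turns truncation into rounding.
nudged <- floor(((secs + 5e-5) %% 1) * 1e4)

print(c(truncated = truncated, nudged = nudged))
print(format(x, "%H:%M:%OS4"))
print(format(x + 5e-5, "%H:%M:%OS4"))
```

If the exact decimal digits matter downstream, keeping the fraction in its own numeric or character column alongside the POSIXct value avoids the representation issue altogether.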

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.




[R] Data handling/optimum glm method.

2012-03-29 Thread abigailclifton
Hi there,

I am trying to fit a generalised linear model to some loan application and 
default data. The purpose of this is to eventually work out the probability an 
applicant will default.

However, R seems to crash or die when I run glm on anything greater than a 
5-way saturated model for my data.

My first question: is the best way to fit a generalised linear model in R to 
fit the saturated model and extract the significant terms only, or to start at 
the null model and to work up to the optimum one?
 
I am importing a csv file with 3500 rows and 27 columns (3500x27 matrix).

My second question: is there any way to increase the memory I have so R can cope 
with more analysis?

I can send my code if it would help to answer the question.

Kind regards,

AJC


Re: [R] Data handling/optimum glm method.

2012-03-29 Thread Ben Bolker
abigailclifton at me.com writes:

> I am trying to fit a generalised linear model to some loan
> application and default data. The purpose of this is to eventually
> work out the probability an applicant will default.
>
> However, R seems to crash or die when I run glm on anything
> greater than a 5-way saturated model for my data.

  What does "crash or die" mean?  Are you getting error messages?
What are they? Is the R application actually quitting?

> My first question: is the best way to fit a generalised linear model
> in R to fit the saturated model and extract the significant terms
> only, or to start at the null model and to work up to the optimum
> one?

  This is more of a statistical practice question than an R question.
Opinions differ, but in general I would say that, if it is computationally
feasible, you should start (and maybe finish) with the
full model.

> I am importing a csv file with 3500 rows and 27 columns (3500x27 matrix).
>
> My second question: is there anyway to increase the memory
> I have so R can cope with more analysis?

   help("Memory-limits")

> I can send my code if it would help to answer the question.

  Please read the posting guide (link at the bottom of every R-help
posting) and follow its advice.  We don't know enough about your
situation to help.  You could also try reading
http://tinyurl.com/reproducible-000 ...

  This works for me:

z <- matrix(rnorm(3500 * 27), ncol = 27)
y <- sample(0:1, replace = TRUE, size = 3500)
colnames(z) <- c(letters, "A")
d <- data.frame(y = y, z)
gg <- glm(y ~ ., data = d, family = binomial)
gg <- glm(y ~ a*b*c*d*e*f*g*h, data = d, family = binomial)
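
The size of the design matrix may be what makes the larger models in the question infeasible. A rough sketch of the counting, assuming numeric predictors as in the simulated data above and, for the second count, 26 predictors (27 columns minus one response, which is a guess about the real data). Factor predictors would make the counts larger still.

```r
set.seed(1)  # arbitrary seed, only so the simulation is repeatable
z <- matrix(rnorm(3500 * 27), ncol = 27)
colnames(z) <- c(letters, "A")
d <- data.frame(y = sample(0:1, replace = TRUE, size = 3500), z)

# Full crossing of 8 numeric predictors: one coefficient per subset of the
# 8 variables, intercept included.
mm <- model.matrix(~ a * b * c * d * e * f * g * h, data = d)
n_coef_8way <- ncol(mm)

# All terms up to 5-way among 26 numeric predictors, counted without
# building the matrix.
n_coef_5way <- sum(choose(26, 0:5))

cat("8 predictors, fully crossed:", n_coef_8way, "columns\n")
cat("26 predictors, up to 5-way: ", n_coef_5way, "columns\n")
cat("rows available:             ", nrow(d), "\n")
```

Once the number of columns exceeds the number of rows, the fit cannot identify all coefficients, and the memory needed for the model matrix grows with rows times columns.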



Re: [R] Data handling/optimum glm method.

2012-03-29 Thread Bert Gunter
Ben:

On Thu, Mar 29, 2012 at 5:41 AM, Ben Bolker bbol...@gmail.com wrote:
>  abigailclifton at me.com writes:
>
>> I am trying to fit a generalised linear model to some loan
>> application and default data. The purpose of this is to eventually
>> work out the probability an applicant will default.
>>
>> However, R seems to crash or die when I run glm on anything
>> greater than a 5-way saturated model for my data.
>
>   What does "crash or die" mean?  Are you getting error messages?
> What are they? Is the R application actually quitting?
>
>> My first question: is the best way to fit a generalised linear model
>> in R to fit the saturated model and extract the significant terms
>> only, or to start at the null model and to work up to the optimum
>> one?
>
>   This is more of a statistical practice question than an R question.
> Opinions differ

Well, to clarify: I do not think opinions differ on the first proposal
-- reduce the model to only the significant terms. This should **not** be
done.

I would also say (more tentatively) that modern practice rejects the
notion of an optimum model to begin with, preferring shrinkage or
other methodology.

Cheers,
Bert





-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Data Handling

2010-08-02 Thread Lily_stats

Hi,

I have managed to convert my data frames into xts, such as:

> str(z)
An ‘xts’ object from 1983-01-03 19:00:00 to 2006-01-01 22:00:00 containing:
  Data: num [1:182959, 1:2] 12.6 11.3 12.7 12.8 10.9 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:2] "v" "DD"
  Indexed by objects of class: [POSIXt,POSIXct] TZ:
  Original class: 'data.frame'
  xts Attributes:
 NULL

I have a second set of data and would like to pull out the values of v
where the timestamps of z and z1 match exactly.

I have tried to look at cbind etc, but I am stuck and very confused!

Any help is appreciated



Re: [R] Data Handling

2010-07-30 Thread Lily_stats

Hi,

I am trying to convert my dataset into xts. I have tried the following :

data1 <- read.table("data1.txt", header = FALSE)
data2 <- read.table("data2.txt", header = FALSE)

data1.xts <- as.xts(data1, descr = "my new xts object")

However, I get an error :

Error in as.POSIXlt.character(x, tz, ...) : 
  character string is not in a standard unambiguous format

I understand that my date and time format might not be accepted and have
tried to convert this but failed. 

Could you suggest something ?

My date is in the format : dd/mm/
My time is in the format : hh:00

Thank you in advance




Re: [R] Data Handling

2010-07-30 Thread Gabor Grothendieck


Can't say too much since there is no detail in your post but you can
do something like this:

library(xts) # this also loads zoo
library(chron) # if you wish to use chron
z <- read.zoo(...)
x <- as.xts(z)

where you may need to use FUN= and possibly the index.column= and
other arguments to read.zoo.  See ?read.zoo and the R News 4/1 article
on dates.



[R] Data Handling

2010-07-30 Thread Lily_stats

Hi, 

I am very new to R so these questions may seem simple!

I have two huge sets of data (matrix 5x2++) in the following format,
for example data.txt and data2.txt:

Date   Time X   Y
03/03/1983  20:00   0.1  990

I would like to recreate a new matrix which filters through data.txt and
data2.txt to get something as below :

Date       Time  X_data1 X_data2 Y_data1 Y_data2
31/12/2000 12:00 2.25    0       990

So I basically need :
1) When Date AND Time from data1.txt and data2.txt match, list the
corresponding X and Y values (X_data1,X_data2,Y_data1,Y_data2)
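
One way to read this requirement is as an inner join on the combined date and time. Below is a minimal base-R sketch using made-up rows in the layout shown above. The data frames `data1` and `data2`, the helper `to_stamp`, the column `stamp`, and `tz = "UTC"` are all assumptions for illustration, and the day/month/year order is inferred from the sample row 31/12/2000. The xts route suggested elsewhere in this thread is an alternative.

```r
# Made-up sample rows in the Date / Time / X / Y layout.
data1 <- read.table(text = "Date Time X Y
03/03/1983 20:00 0.1 990
03/03/1983 21:00 0.2 991
04/03/1983 20:00 0.3 992", header = TRUE, stringsAsFactors = FALSE)

data2 <- read.table(text = "Date Time X Y
03/03/1983 20:00 1.1 880
04/03/1983 20:00 1.3 882
05/03/1983 20:00 1.4 883", header = TRUE, stringsAsFactors = FALSE)

# Combine Date (dd/mm/yyyy) and Time (hh:mm) into a single POSIXct key.
to_stamp <- function(df) {
  as.POSIXct(paste(df$Date, df$Time), format = "%d/%m/%Y %H:%M", tz = "UTC")
}
data1$stamp <- to_stamp(data1)
data2$stamp <- to_stamp(data2)

# Inner join: only timestamps present in both sets are kept, and the
# suffixes tell the X and Y columns of the two sources apart.
both <- merge(data1[, c("stamp", "X", "Y")],
              data2[, c("stamp", "X", "Y")],
              by = "stamp", suffixes = c("_data1", "_data2"))
print(both)
```

Passing the format explicitly matters here: without it, R has to guess the layout of a string like 03/03/1983, and a day-first date is not among the layouts it tries by default.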

Thank you in advance, and I hope I have been clear enough in my message


Re: [R] Data Handling

2010-07-30 Thread raghu

Convert your datasets into xts objects and then do a cbind ordering by the
column you want. Do a ?cbind.

HTH
Raghu



Re: [R] Data Handling

2010-07-30 Thread raghu

Please try:
data <- xts(data[, 2:n], order.by = as.POSIXct(strptime(data[, 1],
"%d/%m/%Y")))

Use a similar strptime call for the hours as well. Here n is the number of columns.

Good Luck
Raghu

and provide commented, minimal, self-contained, reproducible code.