[R] Generating sequence of dates

2009-10-28 Thread Vadlamani, Satish {FLNA}
Hello All:
I have the following question

# instantiate a date
current = as.Date("2009/10/25")

#generate a sequence of dates in the future
future_dates = seq(current,by='1 week',length=53)

Question: how do I generate a sequence of past dates, starting one week 
before the current date? Obviously, what I wrote below is not correct. I 
could write a for loop and push each value into a vector, but is that the 
best way? Thanks.

Satish


past_dates = seq(current,by=-'1 week',length=156)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Generating sequence of dates

2009-10-28 Thread Vadlamani, Satish {FLNA}
Thanks. Please expect more newbie questions!!
Satish


-Original Message-
From: jim holtman [mailto:jholt...@gmail.com] 
Sent: Wednesday, October 28, 2009 7:05 AM
To: Vadlamani, Satish {FLNA}
Cc: R-help@r-project.org
Subject: Re: [R] Generating sequence of dates

try this:

> current <- as.Date("2009/10/25")
> start <- seq(current, by = '-1 week', length = 2)[2]
> seq(start, by = '1 week', length = 10)
 [1] "2009-10-18" "2009-10-25" "2009-11-01" "2009-11-08" "2009-11-15"
 [6] "2009-11-22" "2009-11-29" "2009-12-06" "2009-12-13" "2009-12-20"
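[Editorial sketch] The same idea can build the whole window in a single vector. This assumes, as in the original post, 156 past weekly buckets, the current week, and 52 future weeks (209 buckets in all):

```r
# Step back 156 weeks from the current date, then run forward over all
# 209 weekly buckets (156 past + 1 current + 52 future).
current <- as.Date("2009/10/25")
start <- seq(current, by = "-1 week", length.out = 157)[157]  # 156 weeks back
all_dates <- seq(start, by = "1 week", length.out = 209)
```

Element 157 of all_dates is the current date itself, with the past dates before it and the future dates after it.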



On Wed, Oct 28, 2009 at 7:57 AM, Vadlamani, Satish {FLNA}
satish.vadlam...@fritolay.com wrote:




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] Question on Bias calculations and question on read.fwf

2009-10-28 Thread Vadlamani, Satish {FLNA}
Hi All:
Bear with me on this longer e-mail.

Questions:
1) Can you share any example code you may have that calculates the bias of a 
statistical forecast in a time series?

2) Suppose I have a file in the fixed width format described below:
1-62: character key
63-76: sales data point 1
77-90: sales data point 2
91-104: sales data point 3
and so on (each data point is 14 characters wide).
What is the read.fwf command that will extract these columns?


Some more details below. If you have any thoughts, please share them.
Basically I want to do some analysis on how biased our forecasts are. I have 
several files as shown below; I have put one record each for the sales file 
and the forecast file. The files are in fixed width format. The first 62 
characters are the key for each record. The key should be further broken down 
into several column values. For example,
A006004004016004016011 can be broken down as follows:
Category = A006
BU = 004
Class = 004
Size = 016
BDC = 004016011

I then want to do a cbind on both of these data frames and compare the 
statistical forecast and the actual sales for a given time window.

EXAMPLE RECORD FROM THE Sales file (columns truncated)
A0050010240032314231003030050303A00600400401600401601123.200
23.70022.80023.300

Example record from the Stat Forecast file (columns truncated)
A0050010240032314231003030050303A00600400401600401605134.800
35.50034.20034.900
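[Editorial sketch] A hedged sketch of both pieces of the question: reading the fixed-width layout with read.fwf and splitting the 22-character key segment with substr. The one-line sample record, the number of sales columns, and the position (41) where the key segment starts inside the 62-character key are made up for illustration; adjust them to the real layout.

```r
# Hypothetical one-record sample: a 62-character key (40 filler characters,
# then the 22-character segment from the post) followed by three
# 14-character sales fields.
sample_line <- paste0(paste(rep("X", 40), collapse = ""),
                      "A006004004016004016011",
                      "      23.200  ", "      23.700  ", "      22.800  ")

df <- read.fwf(textConnection(sample_line),
               widths = c(62, 14, 14, 14),
               col.names = c("key", "sls1", "sls2", "sls3"),
               colClasses = c("character", rep("numeric", 3)))

# Split the key segment (here assumed to start at position 41) with substr
df$category <- substr(df$key, 41, 44)  # "A006"
df$bu       <- substr(df$key, 45, 47)  # "004"
df$class    <- substr(df$key, 48, 50)  # "004"
df$size     <- substr(df$key, 51, 53)  # "016"
df$bdc      <- substr(df$key, 54, 62)  # "004016011"
```

For the real files, replace textConnection(sample_line) with the file name and extend widths to c(62, rep(14, <number of sales columns>)).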



[R] Help with read.fwf

2009-10-28 Thread Vadlamani, Satish {FLNA}
Hi All:
I am trying to use read.fwf and am encountering the error below. Any ideas on 
what I can do? I tried read.table (whose default separator is whitespace) and 
it works; I am not sure why read.fwf is not working.

> test_data_frame = read.fwf(file="small.txt", widths=width_vec, header=FALSE)
Error in file(FILENAME, "a") : cannot open the connection
In addition: Warning message:
In file(FILENAME, "a") :
  cannot open file 'C:\temp\RtmpLN6W00\Rfwf.2ea6bb3': No such file or directory


Code below:

library(chron)  # for chron() and seq.dates()
setwd("d:/edump_data/11x4_2009")
current = as.Date("2009/10/25")
current = chron("10/25/2009", format="m/d/y")
next_date = current + 7
prev_date = current - 7
last_date = current - 7*156
# 156 buckets in the past, one current bucket and 52 future buckets: 209 buckets in total
future_dates = seq.dates(next_date, by='weeks', length=52)
past_dates = seq.dates(last_date, by='weeks', length=156)
num_buckets = rep(14, 209)
width_vec = c(62, num_buckets)
test_data_frame = read.fwf(file="small.txt", widths=width_vec, header=FALSE)



[R] Help with creating some loops

2009-10-30 Thread Vadlamani, Satish {FLNA}
Hi All:

I have a data frame called all_corn. It has 31 columns. The first column is a 
character key. The next 15 columns (stat1, stat2, ..., stat15) are the 
statistical forecast. The last 15 columns (sls1, sls2, ..., sls15) are actual 
sales. I want to calculate the textbook tracking signal and cumulative 
percent error.

1) I am showing some of the calculations below. How can I make a loop out of 
this instead of writing it out manually 15 times?
2) Once all these calculations are done, how do I put all these columns 
(err1, err2, etc.) into the same data frame?

Thanks.

attach(all_corn)

cum_sls1 <- sls1
err1 <- sls1 - stat1
cum_err1 <- sls1 - stat1
cum_abs_err1 <- abs(err1)
mad1 <- abs(cum_err1)/1
cum_pct_err1 <- (ifelse(cum_sls1 > 0, cum_err1/cum_sls1, 1))*100
ts1 <- ifelse(mad1 > 0, cum_err1/mad1, 0)

cum_sls2 <- cum_sls1 + sls2
err2 <- sls2 - stat2
cum_err2 <- cum_err1 + sls2 - stat2
cum_abs_err2 <- cum_abs_err1 + abs(err2)
mad2 <- cum_abs_err2/2
cum_pct_err2 <- (ifelse(cum_sls2 > 0, cum_err2/cum_sls2, 1))*100
ts2 <- ifelse(mad2 > 0, cum_err2/mad2, 0)
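[Editorial sketch] One way to turn the repeated steps above into a loop. The two-row all_corn frame below is made up, and n would be 15 for the real data; no attach() is needed:

```r
# Toy data: one key column, then stat1..stat2 and sls1..sls2
all_corn <- data.frame(key   = c("a", "b"),
                       stat1 = c(10, 20), stat2 = c(12, 18),
                       sls1  = c(11, 19), sls2  = c(13, 21))
n <- 2  # 15 in the real data

cum_sls <- cum_err <- cum_abs_err <- 0
for (i in seq_len(n)) {
  sls  <- all_corn[[paste0("sls",  i)]]
  stat <- all_corn[[paste0("stat", i)]]
  err  <- sls - stat
  cum_sls     <- cum_sls + sls
  cum_err     <- cum_err + err
  cum_abs_err <- cum_abs_err + abs(err)
  mad <- cum_abs_err / i
  # write each derived column straight back into the same data frame
  all_corn[[paste0("err",         i)]] <- err
  all_corn[[paste0("cum_pct_err", i)]] <- ifelse(cum_sls > 0, cum_err / cum_sls, 1) * 100
  all_corn[[paste0("ts",          i)]] <- ifelse(mad > 0, cum_err / mad, 0)
}
```

This also answers question 2: assigning with all_corn[[name]] <- value keeps every new column in the same data frame.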



Re: [R] Help with creating some loops

2009-10-30 Thread Vadlamani, Satish {FLNA}
Hi:
In general, how do I convert a column to the class I need, or derive a new 
column from it?

For example, if I have a data frame df1 with a column x, and I want a new 
column holding a substring of x (the first 2 characters), I want to do 
something like

df1$new <- substr(df1$x, 1, 2)

Example
Data frame df1
x
abcd
efgh

Now df1$new should be
ab
ef

Thanks.
Satish





[R] Merge records in the same dataframe

2009-11-05 Thread Vadlamani, Satish {FLNA}
Hi:

Suppose that I have a data frame as below

x1 x2 x3 ... x10 wk1 wk2 ... wk208 (these are the column names)

For each record, x1, x2, x3, ..., x10 are attributes, and wk1, wk2, ..., 
wk208 are the sales recorded for that attribute combination. Suppose now that 
I want to do the following:

1. Collapse the data frame so that I have a new data frame grouped by the 
values of x2 and x3 (for example). That is, if two records have the same 
values of x2 and x3, their sales columns should be summed.

I tried to look at merge, tapply, etc. but did not see a fit with what I want 
to do above.

Thanks in advance.

Satish
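[Editorial sketch] The grouped sum described above can be done with aggregate(); the toy data frame below is made up, with two attribute columns and two week columns standing in for x1..x10 and wk1..wk208:

```r
df <- data.frame(x1 = c("a", "b", "c"),
                 x2 = c("p", "p", "q"),
                 x3 = c("r", "r", "s"),
                 wk1 = c(1, 2, 3), wk2 = c(10, 20, 30))

# Sum every wk column within each (x2, x3) combination; the first two
# rows share (p, r) and are collapsed into one.
summed <- aggregate(cbind(wk1, wk2) ~ x2 + x3, data = df, FUN = sum)
```

For the real 208 week columns, the left-hand side can be built programmatically, e.g. aggregate(df[paste0("wk", 1:208)], by = df[c("x2", "x3")], FUN = sum).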



[R] How to read numeric as text

2009-11-05 Thread Vadlamani, Satish {FLNA}
Hi:
I want to read a file with read.table, with x1 and x2 read as character and 
x3 as numeric. How do I do this? Thanks.
Satish


x1 ,x2,x3
10,20,30
11 ,22,35
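[Editorial sketch] This is what the colClasses argument of read.table is for; here the data are pasted inline via text= instead of read from a file:

```r
txt <- "x1,x2,x3
10,20,30
11,22,35"

# Force x1 and x2 to character and x3 to numeric
df <- read.table(text = txt, header = TRUE, sep = ",",
                 colClasses = c("character", "character", "numeric"))
```

With the original file, which has stray spaces such as "11 ,", adding strip.white = TRUE keeps the character values clean.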



[R] Installing R and modules on Unix OS

2010-01-27 Thread Vadlamani, Satish {FLNA}
Hi:
I have a question about installing R (and modules) on a Unix system (AIX).
Can I just gunzip (or the equivalent) the installation files into my home 
directory, or will I need someone with root access to install R? I am hoping 
the answer is the former (that I can unzip all the files into an R directory 
that I create under my home directory and start using it).

Could you please help me with this, and with any other instructions for 
installing R and modules when you do not have root access? Thanks.
Satish
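[Editorial sketch] Assuming R itself can be made to run from a home directory, add-on packages can then go into a personal library without root access. The path and the package name here are examples only:

```r
# Create a personal package library under $HOME and put it first on the
# library search path.
lib_dir <- file.path(Sys.getenv("HOME"), "Rlibs")
dir.create(lib_dir, showWarnings = FALSE)
.libPaths(c(lib_dir, .libPaths()))

# Then, once internet access is configured:
# install.packages("chron", lib = lib_dir)
# library(chron, lib.loc = lib_dir)
```

Setting the R_LIBS_USER environment variable to the same path makes the personal library available in every session.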



[R] Reading large files

2010-02-04 Thread Vadlamani, Satish {FLNA}
Folks:
I am trying to read in a large file. Definition of large:
Number of lines: 333,250
Size: 850 MB

The machine is a dual-core Intel with 4 GB RAM and nothing else running on 
it. I read the previous threads on read.fwf and did not see any conclusive 
statements on how to read fast. An example record and R code are given below. 
I was hoping to purchase a better machine and do analysis with larger 
datasets, but these preliminary results do not look good.

Does anyone have any experience with large files (> 1 GB) and using them with 
Revolution R?


Thanks.

Satish

Example Code
key_vec <- c(1,3,3,4,2,8,8,2,2,3,2,2,1,3,3,3,3,9)
key_names <- c("allgeo","area1","zone","dist","ccust1","whse","bindc",
               "ccust2","account","area2","ccust3","customer","allprod",
               "cat","bu","class","size","bdc")
key_info <- data.frame(key_vec, key_names)
col_names <- c(key_names, sas_time$week)
num_buckets <- rep(12, 209)
width_vec <- c(key_vec, num_buckets)
col_classes <- c(rep("factor",18), rep("numeric",209))
#threewkoutstat <- read.fwf(file="3wkoutstatfcst_file02.dat", widths=width_vec,
#                           header=FALSE, colClasses=col_classes, n=100)
threewkoutstat <- read.fwf(file="3wkoutstatfcst_file02.dat", widths=width_vec,
                           header=FALSE, colClasses=col_classes)
names(threewkoutstat) <- col_names

Example record (only one record pasted below, line-wrapped by the mailer; the 
long tail of 14-character numeric fields, almost all 0.00 with a few non-zero 
values such as 0.60 and 0.70 further in, is truncated here)
A00400100379949254925004A0010020020150020150090.00 ...



Re: [R] Reading large files

2010-02-05 Thread Vadlamani, Satish {FLNA}
Hi Gabor:
Thanks. My files are all in fixed width format, and there are a lot of them. 
It would take me some effort to convert them to CSV. I guess this cannot be 
avoided? I can write some Perl scripts to convert the fixed width format to 
CSV and then start with your suggestion. Could you let me know your thoughts 
on this approach?
Satish
 

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Friday, February 05, 2010 5:16 PM
To: Vadlamani, Satish {FLNA}
Cc: r-help@r-project.org
Subject: Re: [R] Reading large files

If your problem is just how long it takes to load the file into R try
read.csv.sql in the sqldf package.  A single read.csv.sql call can
create an SQLite database and table layout for you, read the file into
the database (without going through R so R can't slow this down),
extract all or a portion into R based on the sql argument you give it
and then remove the database.  See the examples on the home page:
http://code.google.com/p/sqldf/#Example_13._read.csv.sql_and_read.csv2.sql

On Fri, Feb 5, 2010 at 2:11 PM, Satish Vadlamani
satish.vadlam...@fritolay.com wrote:

 Matthew:
 If it is going to help, here is the explanation. I have an end state in
 mind. It is given below under End State header. In order to get there, I
 need to start somewhere right? I started with a 850 MB file and could not
 load in what I think is reasonable time (I waited for an hour).

 There are references to 64 bit. How will that help? It is a 4GB RAM machine
 and there is no paging activity when loading the 850 MB file.

 I have seen other threads on the same types of questions. I did not see any
 clear cut answers or errors that I could have been making in the process. If
 I am missing something, please let me know. Thanks.
 Satish


 End State
 Satish wrote: "at one time I will need to load, say, 15 GB into R"


 -
 Satish Vadlamani
 --
 View this message in context: 
 http://n4.nabble.com/Reading-large-files-tp1469691p1470667.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] Reading large files

2010-02-06 Thread Vadlamani, Satish {FLNA}
Jim, Gabor:
Thanks so much for the suggestions that I can use read.csv.sql and embed Perl 
(or gawk). I just want to mention that I am running on Windows. I am going to 
read the documentation on the filter argument and see whether it can take a 
decent-sized Perl script and then use its output as input.

Suppose that I write a Perl script that parses this fwf file and creates a 
CSV file. Can I embed this within the read.csv.sql call? Or can it only be a 
single statement or something? If you know the answer, please let me know. 
Otherwise, I will try a few things and report back the results.

Thanks again.
Satish
 

-Original Message-
From: jim holtman [mailto:jholt...@gmail.com] 
Sent: Saturday, February 06, 2010 6:16 AM
To: Gabor Grothendieck
Cc: Vadlamani, Satish {FLNA}; r-help@r-project.org
Subject: Re: [R] Reading large files

In perl the 'unpack' command makes it very easy to parse fixed fielded data.

On Fri, Feb 5, 2010 at 9:09 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 Note that the filter= argument on read.csv.sql can be used to pass the
 input through a filter written in perl, [g]awk or another language.
 For example: read.csv.sql(..., filter = "gawk -f myfilter.awk")

 gawk has the FIELDWIDTHS variable for automatically parsing fixed
 width fields, e.g.
 http://www.delorie.com/gnu/docs/gawk/gawk_44.html
 making this very easy but perl or whatever you are most used to would
 be fine too.








Re: [R] Reading large files

2010-02-06 Thread Vadlamani, Satish {FLNA}
Gabor:
Can I pass colClasses as a vector to read.csv.sql? Thanks.
Satish
 

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Saturday, February 06, 2010 9:41 AM
To: Vadlamani, Satish {FLNA}
Cc: r-help@r-project.org
Subject: Re: [R] Reading large files

It's just any Windows batch command string that filters stdin to
stdout.  What the command consists of should not be important.  An
invocation of perl that runs a perl script that filters stdin to
stdout might look like this:
  read.csv.sql("myfile.dat", filter = "perl myprog.pl")

For an actual example see the source of read.csv2.sql, which defaults
to using a Windows vbscript program as a filter.

On Sat, Feb 6, 2010 at 10:16 AM, Vadlamani, Satish {FLNA}
satish.vadlam...@fritolay.com wrote:

Re: [R] Reading large files

2010-02-06 Thread Vadlamani, Satish {FLNA}
Gabor:

I had success with the following.
1. I created a CSV file, out.txt, with a Perl script, then ran the following 
successfully:
library(sqldf)
test_df <- read.csv.sql(file = "out.txt", sql = "select * from file",
    header = TRUE, sep = ",", dbname = tempfile())

2. I did not have success with the following. Could you tell me what I may be 
doing wrong? I can paste the Perl script if necessary. In the Perl script, I 
read the file, create the CSV records, print them one by one, and then exit.

Thanks.

No success with the below:
#test_df <- read.csv2.sql(file = "3wkoutstatfcst_small.dat",
#    sql = "select * from file", header = TRUE, sep = ",",
#    filter = "perl parse_3wkout.pl", dbname = tempfile())
test_df

Error message below:
> test_df <- read.csv2.sql(file = "3wkoutstatfcst_small.dat",
    sql = "select * from file", header = TRUE, sep = ",",
    filter = "perl parse_3wkout.pl", dbname = tempfile())
Error in readRegistry(key, maxdepth = 3) : 
  Registry key 'SOFTWARE\R-core' not found
In addition: Warning messages:
1: closing unused connection 14 (3wkoutstatfcst_small.dat) 
2: closing unused connection 13 (3wkoutstatfcst_small.dat) 
3: closing unused connection 11 (3wkoutstatfcst_small.dat) 
4: closing unused connection 9 (3wkoutstatfcst_small.dat) 
5: closing unused connection 3 (3wkoutstatfcst_small.dat) 

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Saturday, February 06, 2010 12:14 PM
To: Vadlamani, Satish {FLNA}
Cc: r-help@r-project.org
Subject: Re: [R] Reading large files

No.

On Sat, Feb 6, 2010 at 1:01 PM, Vadlamani, Satish {FLNA}
satish.vadlam...@fritolay.com wrote:
 Gabor:
 Can I pass colClasses as a vector to read.csv.sql? Thanks.
 Satish



Re: [R] Reading large files

2010-02-06 Thread Vadlamani, Satish {FLNA}

Gabor:
Here is the update. As you can see, I got the same error as before in 1.

1. Error
> test_df <- read.csv.sql(file = "out_small.txt", sql = "select * from file",
    header = TRUE, sep = ",", filter = "perl parse_3wkout.pl", eol = "\n")
Error in readRegistry(key, maxdepth = 3) : 
  Registry key 'SOFTWARE\R-core' not found 

2. But the loading of the bigger file was successful, as you can see below: 
857 MB, 333,250 rows, 227 columns. This is good.

I will just have to do an inline edit in Perl to change the file to CSV from 
within R and then call read.csv.sql.

If you have any suggestions to fix 1, I would like to try them.

> system.time(test_df <- read.csv.sql(file = "out.txt"))
   user  system elapsed 
 192.53   15.50  213.68 
Warning message:
closing unused connection 3 (out.txt) 

Thanks again.

Satish

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Saturday, February 06, 2010 3:02 PM
To: Vadlamani, Satish {FLNA}
Cc: r-help@r-project.org
Subject: Re: [R] Reading large files

Note that you can shorten #1 to read.csv.sql(out.txt) since your
other arguments are the default values.

For the second one, use read.csv.sql, eliminate the arguments that are
defaults anyway (they should not cause a problem, but they are error-prone) and
add an explicit eol= argument, since SQLite can have problems with end
of line in some cases.  Also test out your perl script separately from
R first to ensure that it works:

test_df <- read.csv.sql(file="3wkoutstatfcst_small.dat", filter="perl
parse_3wkout.pl", eol = "\n")

SQLite has some known problems with end of line so try it with and
without the eol= argument just in case.  When I just made up the
following gawk example I noticed that I did need to specify the eol=
argument.

Also I have added a complete example using gawk as Example 13c on the
home page just now:
http://code.google.com/p/sqldf/#Example_13._read.csv.sql_and_read.csv2.sql


On Sat, Feb 6, 2010 at 3:52 PM, Vadlamani, Satish {FLNA}
satish.vadlam...@fritolay.com wrote:
 Gabor:

 I had success with the following.
 1. I created a csv file with a perl script called out.txt. Then ran the 
 following successfully
 library(sqldf)
 test_df <- read.csv.sql(file="out.txt", sql = "select * from file", header = 
 TRUE, sep = ",", dbname = tempfile())

 2. I did not have success with the following. Could you tell me what I may be 
 doing wrong? I could paste the perl script if necessary. From the perl 
 script, I am reading the file, creating the csv record and printing each 
 record one by one and then exiting.

 Thanks.

 Not had success with below..
 #test_df <- read.csv2.sql(file="3wkoutstatfcst_small.dat", sql = "select * 
 from file", header = TRUE, sep = ",", filter="perl parse_3wkout.pl", dbname = 
 tempfile())
 test_df

 Error message below:
 test_df <- read.csv2.sql(file="3wkoutstatfcst_small.dat", sql = "select * 
 from file", header = TRUE, sep = ",", filter="perl parse_3wkout.pl", dbname = 
 tempfile())
 Error in readRegistry(key, maxdepth = 3) :
  Registry key 'SOFTWARE\R-core' not found
 In addition: Warning messages:
 1: closing unused connection 14 (3wkoutstatfcst_small.dat)
 2: closing unused connection 13 (3wkoutstatfcst_small.dat)
 3: closing unused connection 11 (3wkoutstatfcst_small.dat)
 4: closing unused connection 9 (3wkoutstatfcst_small.dat)
 5: closing unused connection 3 (3wkoutstatfcst_small.dat)
 test_df <- read.csv2.sql(file="3wkoutstatfcst_small.dat", sql = "select * 
 from file", header = TRUE, sep = ",", filter="perl parse_3wkout.pl", dbname 
 = tempfile())
 Error in readRegistry(key, maxdepth = 3) :
  Registry key 'SOFTWARE\R-core' not found

 -Original Message-
 From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
 Sent: Saturday, February 06, 2010 12:14 PM
 To: Vadlamani, Satish {FLNA}
 Cc: r-help@r-project.org
 Subject: Re: [R] Reading large files

 No.

 On Sat, Feb 6, 2010 at 1:01 PM, Vadlamani, Satish {FLNA}
 satish.vadlam...@fritolay.com wrote:
 Gabor:
 Can I pass colClasses as a vector to read.csv.sql? Thanks.
 Satish


 -Original Message-
 From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
 Sent: Saturday, February 06, 2010 9:41 AM
 To: Vadlamani, Satish {FLNA}
 Cc: r-help@r-project.org
 Subject: Re: [R] Reading large files

 It's just any Windows batch command string that filters stdin to
 stdout.  What the command consists of should not be important.  An
 invocation of perl that runs a perl script that filters stdin to
 stdout might look like this:
  read.csv.sql("myfile.dat", filter = "perl myprog.pl")

 For an actual example see the source of read.csv2.sql which defaults
 to using a Windows vbscript program as a filter.

 On Sat, Feb 6, 2010 at 10:16 AM, Vadlamani, Satish {FLNA}
 satish.vadlam...@fritolay.com wrote:
 Jim, Gabor:
 Thanks so much for the suggestions where I can use read.csv.sql and embed 
 Perl (or gawk). I just want to mention that I am running on Windows. I am 
 going to read the documentation for the filter argument and see if it can 
 take a decent sized

Re: [R] Reading large files

2010-02-06 Thread Vadlamani, Satish {FLNA}
Gabor:
Please see the results below. Sourcing your new R script worked (although with 
the same error message). If I put in the eol="\n" option, it adds a "\r" to the 
last column. I took out the eol option below. This is just some more feedback 
to you.

I am thinking that I will just do an inline edit in Perl (that is create the 
csv file through Perl by overwriting the current file) and then use 
read.csv.sql without the filter= option. This seems to be more tried and 
tested. If you have any suggestions, please let me know. Thanks.
Satish


BEFORE SOURCING YOUR NEW R SCRIPT
 test_df <- read.csv.sql(file="3wkoutstatfcst_small.dat", sql = "select * from 
 file", header = TRUE, sep = ",", filter="perl parse_3wkout.pl")
Error in readRegistry(key, maxdepth = 3) : 
  Registry key 'SOFTWARE\R-core' not found
 test_df
Error: object 'test_df' not found

AFTER SOURCING YOUR NEW R SCRIPT
 source("f:/dp_modeling_team/downloads/R/sqldf.R")
 test_df <- read.csv.sql(file="3wkoutstatfcst_small.dat", sql = "select * from 
 file", header = TRUE, sep = ",", filter="perl parse_3wkout.pl")
Error in readRegistry(key, maxdepth = 3) : 
  Registry key 'SOFTWARE\R-core' not found
In addition: Warning messages:
1: closing unused connection 5 (3wkoutstatfcst_small.dat) 
2: closing unused connection 4 (3wkoutstatfcst_small.dat) 
3: closing unused connection 3 (3wkoutstatfcst_small.dat) 
 test_df
   allgeo area1 zone dist ccust1 whse bindc ccust2 account area2 ccust3
1       A     4    1   37     99 4925  4925     99      99     4     99
2       A     4    1   37     99 4925  4925     99      99     4     99

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Saturday, February 06, 2010 4:28 PM
To: Vadlamani, Satish {FLNA}
Cc: r-help@r-project.org
Subject: Re: [R] Reading large files

The software attempts to read the registry and temporarily augment the
path in case you have Rtools installed, so that the filter can access
all the tools that Rtools provides.  I am not sure why it's failing on
your system, but there are evidently some differences between systems
here, and I have added some code to trap and bypass that portion in
case it fails.  I have added the new version to the svn repository, so
try this:

library(sqldf)
# overwrite with development version
source("http://sqldf.googlecode.com/svn/trunk/R/sqldf.R")
# your code to call read.csv.sql




Re: [R] Reading large files

2010-02-06 Thread Vadlamani, Satish {FLNA}
Gabor:
It did suppress the message and I was able to load the data. One question.

1. test_df <- read.csv.sql(file="3wkoutstatfcst_small.dat", filter="perl 
parse_3wkout.pl") 

In the statement above, should the filename in file= and the file name that the 
perl script uses through the filter= command be the same? I would think not; I 
would say that if filter= is passed to the statement, then the filename should 
be ignored. Is this how it works?

Thanks.
Satish


-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Saturday, February 06, 2010 4:58 PM
To: Vadlamani, Satish {FLNA}
Cc: r-help@r-project.org
Subject: Re: [R] Reading large files

I have uploaded another version which suppresses display of the error
message but otherwise works the same.  Omitting the redundant
arguments we have:

library(sqldf)
# next line is only needed once per session to read in devel version
source("http://sqldf.googlecode.com/svn/trunk/R/sqldf.R")

test_df <- read.csv.sql(file="3wkoutstatfcst_small.dat", filter="perl
parse_3wkout.pl")




[R] dataframe question

2010-02-07 Thread Vadlamani, Satish {FLNA}
Folks:
 Good day. Please see the code below. three_wk_out is a dataframe with columns 
wk1 through wk209. I want to change the format of the columns. I am trying the 
code below but it does not work.  I need $week in the for loop interpreted as 
wk1, wk2, etc. Could you please help? Thanks.
Satish

R code below
week_list <- paste("wk", c(1:209), sep="")
for (week in week_list)
{
three_wk_out$week <- as.numeric(three_wk_out$week)
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] dataframe question

2010-02-07 Thread Vadlamani, Satish {FLNA}
David:
Thanks for the idea. Both the one that you suggested and the one that Bill 
Venables suggested are very good. Unfortunately, this statement is creating 
out-of-memory issues like the ones below (system limitations).

When I had whitespace padding before the numbers, read.csv.sql was correctly 
treating the column as a factor. I am going to take out the padding so that it 
treats it as numeric, and then I can proceed with further steps.

Satish

Out of memory warning
Reached total allocation of 1535Mb: see help(memory.size)
34: In ans[[i]] <- tmp :
  Reached total allocation of 1535Mb: see help(memory.size)

 Bill Venables' suggestion below

week_list <- paste("wk", 1:209, sep="")  
### no need for c(...)

for(week in week_list) 
three_wk_out[[week]] <- as.numeric(three_wk_out[[week]]) 

### no need for '{...}'
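A self-contained sketch of that pattern on a tiny made-up data frame (the column names here are illustrative, not from the real data):

```r
# Character columns converted in place; [[ ]] indexes by the *value* of week,
# whereas three_wk_out$week would look for a column literally named "week".
df <- data.frame(wk1 = c("1", "2"), wk2 = c("3", "4"),
                 stringsAsFactors = FALSE)
week_list <- paste("wk", 1:2, sep = "")
for (week in week_list)
  df[[week]] <- as.numeric(df[[week]])
str(df)  # both wk columns are now numeric
```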

Bill Venables
CSIRO/CMIS Cleveland Laboratories


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Sunday, February 07, 2010 8:51 PM
To: Vadlamani, Satish {FLNA}
Cc: r-help@r-project.org help
Subject: Re: [R] dataframe question


On Feb 7, 2010, at 8:14 PM, David Winsemius wrote:


 On Feb 7, 2010, at 7:51 PM, Vadlamani, Satish {FLNA} wrote:

 Folks:
 Good day. Please see the code below. three_wk_out is a dataframe  
 with columns wk1 through wk209. I want to change the format of the  
 columns. I am trying the code below but it does not work.  I need  
 $week in the for loop interpreted as wk1, wk2, etc. Could you  
 please help? Thanks.
 Satish

 R code below
  week_list <- paste("wk", c(1:209), sep="")


 Or more functionally:

 three_wk_out - as.data.frame( lapply(three_wk_out, some_function) )

Or if you wanted to just change the particular columns that matched  
the wk pattern:

idx <- grep("wk", names(three_wk_out))
three_wk_out[, idx ] <- apply( three_wk_out[, idx ], 2, as.numeric)
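A runnable version of the same idea on a small made-up data frame (the names are hypothetical):

```r
# Convert only the columns whose names match "wk", leaving the rest alone.
three <- data.frame(id  = c("a", "b"),
                    wk1 = c("1", "2"),
                    wk2 = c("3", "4"),
                    stringsAsFactors = FALSE)
idx <- grep("wk", names(three))
three[, idx] <- apply(three[, idx], 2, as.numeric)
str(three)  # id stays character; wk1 and wk2 become numeric
```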


(I probably should have used apply( ___ , 2,  fn) in the prior effort  
rather than coercing a list back to a dataframe.)



 E.g.:
 

  a b c x
 1 1 0 0 1
 2 2 3 2 4
 3 1 2 1 5
 4 2 0 3 2

  df <- as.data.frame(lapply(df, "^", 2))
  df
   a  b  c   x
 1  1  0  0   1
 2 16 81 16 256
 3  1 16  1 625
 4 16  0 81  16


 for (week in week_list)
 {
    three_wk_out$week <- as.numeric(three_wk_out$week)
 }



[R] Contributed packages

2010-02-07 Thread Vadlamani, Satish {FLNA}
Folks:
If you wanted to find out what the contributed packages are and how they are 
classified, how would you go about it? For someone new like me, I would like to 
know what the possibilities are. When I click on install packages in my Windows 
version of R, it gives me a list, but it is hard to figure out from that list 
what the purpose of each package is and to what category it belongs (for 
example, the category of regular expressions).

What is R's equivalent of Perl's CPAN.org, where you can browse modules by 
category? Thanks.
Satish



[R] Comparing R and SAS

2009-06-09 Thread Vadlamani, Satish {FLNA}
Hi:
For those of you who are adept at both SAS and R, I have the following 
questions:

a) What are some reasons / tasks for which you would use R over SAS, and vice 
versa?
b) What are some things for which R is a must-have that SAS cannot fulfill?

I am ramping up on both of them. The general feeling that I am getting by 
following this group is that updates to R come at a much faster pace, and that 
this would therefore be better for someone who wants the bleeding edge (correct 
me if I am wrong). But I am also interested in what is inherently better in R 
that SAS cannot offer, perhaps because of its design.

Thanks.
Satish



[R] 64 bit compiled version of R on windows

2009-03-30 Thread Vadlamani, Satish {FLNA}
Hi:
1) Does anyone have experience with a 64-bit compiled version of R on Windows? 
Is this available, or does one have to compile it oneself?
2) If we do compile the source in 64-bit, would we then need to compile any 
additional modules in 64-bit as well?

I am just trying to prepare for the time when I will get larger datasets to 
analyze. Each of the datasets is about 1 GB in size, and I will try to bring 
about 16 of them into memory at the same time. At least that is the plan.

I asked a related question in the past and someone recommended the product 
RevolutionR - I am looking into this also. If you can think of any other 
options, please mention them. I have not been doing low-level programming for a 
while now, and therefore self-compilation on Windows would be the least 
preferable option (and then I would have to worry about how to compile any 
modules that I need). Thanks.

Thanks.
Satish



[R] Test mail

2009-03-04 Thread Vadlamani, Satish {FLNA}
Hi:
This is a test mail. Thanks.
Satish



[R] Question about the use of large datasets in R

2009-03-04 Thread Vadlamani, Satish {FLNA}
Hi:
Sorry if this is a double post. I posted the same thing this morning and did 
not see it.

I just started using R and am asking the following questions so that I can plan 
for the future, when I may have to analyze high-volume data.

1) What are the limitations of R when it comes to handling large datasets? Say, 
for example, something like a 200M-row, 15-column data frame (between 1.5 and 
2 GB in size)? Will the limitation be based on the specifications of the 
hardware or on R itself?
2) Is R compiled 32-bit or 64-bit (on, say, Windows and AIX)?
3) Are there any other points to note / things to keep in mind when handling 
large datasets?
4) Should I be looking at SAS as well, if only for this reason? (We do have SAS 
in-house, but the problem is that I am still not sure what we have a license 
for, etc.)

Any pointers / thoughts will be appreciated.

Satish
