Re: [R] Stacking several vectors from the list

2010-06-29 Thread Arsenio Starodoumov
Monday, June 28, 2010, 4:40:11 PM, you wrote:

> On Mon, Jun 28, 2010 at 7:30 PM, astar...@uci.edu wrote:
>> Hi everybody,
>>
>> I'm working on the very
>> messy data, I have tried to clean it up in SAS and
>> SAS/IML but there is not enough info on how to handle certain things
>> in SAS so I have turned to R. The thing itself should be rather
>> simple, so i was wondering if someone could help me out.
>>
>> The original .csv has ([1] 7138 6338 ) dimensions with funds with the
>> corresponding dates and observations for each date for around 10 years and
>> 4000+ funds, meaning in COL5 has the next fund's name and so on.
>>
>> COL1             COL2       COL3        COL4
>> HBNNF US Equity  Date       EQY_SH_OUT  PX_VOLUME
>>                  #NAME?     #N/A N/A    135000
>>                  7/7/2008   #N/A N/A    105000
>>                  7/17/2008  #N/A N/A    59
>>                  7/22/2008  #N/A N/A    4
>>
>>
>> so in R this .csv is somehow read as list (using typeof) and not as
>> dataframe, and a lot of stuff like regexpr searches in the
>
> The typeof of a data.frame is list so you do have a data frame --
> not a list.  Perhaps the problem is that you do not want factor
> columns but want character columns instead.  Use read.csv(..., as.is =
> TRUE)

Thanks!! The as.is trick solved the list issue and the whole
indexing problem. Now the table is a true data frame, searchable and
indexable. I'm still reading up on the differences between the list
and data frame types.

Arsenio

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Stacking several vectors from the list

2010-06-28 Thread astarodo
Hi everybody,

I'm working with some very messy data. I tried to clean it up in SAS
and SAS/IML, but there is not enough information on how to handle
certain things in SAS, so I have turned to R. The task itself should be
rather simple, so I was wondering if someone could help me out.

The original .csv has dimensions [1] 7138 6338 (as reported by dim()).
It holds 4000+ funds, each with its dates and the observations for each
date, covering around 10 years. The funds sit side by side, so COL5
holds the next fund's name, and so on.

COL1             COL2       COL3        COL4
HBNNF US Equity  Date       EQY_SH_OUT  PX_VOLUME
                 #NAME?     #N/A N/A    135000
                 7/7/2008   #N/A N/A    105000
                 7/17/2008  #N/A N/A    59
                 7/22/2008  #N/A N/A    4


In R this .csv is somehow read as a list (according to typeof) and not as a
data frame, and a lot of things, such as regexpr searches over the whole file,
do not work or behave strangely. I want to stack the fund data and create a
long dataset with the columns fund name, date, eqy_sh_out and px_volume, with
the fund name present on every date row.
It should look like this:

Fund_name        Date       EQY_SH_OUT  PX_VOLUME
HBNNF US Equity  7/7/2008   #N/A N/A    105000
HBNNF US Equity  7/17/2008  #N/A N/A    59
HBNNF US Equity  7/22/2008  #N/A N/A    4
HBNNF US Equity  7/24/2008  #N/A N/A    3000
HBNNF US Equity  7/31/2008  #N/A N/A    1000
HBNNF US Equity  8/20/2008  #N/A N/A    1000
HBNNF US Equity  8/26/2008  #N/A N/A    2000
HBNNF US Equity  8/27/2008  #N/A N/A    2000
HBNNF US Equity  9/2/2008   #N/A N/A    5000
HND CN Equity    1/17/2008  #N/A N/A    28000
HND CN Equity    1/18/2008  #N/A N/A    25000
HND CN Equity    1/21/2008  #N/A N/A    5000
HND CN Equity    1/22/2008  #N/A N/A    101000
HND CN Equity    1/23/2008  #N/A N/A    122000


Is there a way to accomplish this? There should be an easy one, but I have
never worked with lists, and somehow the file is not read as a data frame,
which gives strange results:

> small_raw[1,1]
[1] HBNNF US Equity
Levels:  0.26 0.46 COL1 HBNNF US Equity

> grep("Equity", as.character(small_raw))
integer(0)
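
A note on the empty result above: as.character() applied to a whole data frame
returns one string per column, not one per cell, and for factor columns that
string appears to be built from the internal integer codes rather than the
labels, so the pattern never matches. Searching column by column avoids this.
A minimal sketch, assuming a made-up two-column stand-in (toy) for small_raw;
the column names and values are hypothetical:

```r
# Toy stand-in for small_raw; the values are made up for illustration.
toy <- data.frame(
  COL1 = c("HBNNF US Equity", "", ""),
  COL2 = c("Date", "7/7/2008", "7/17/2008"),
  stringsAsFactors = TRUE  # mimic factor columns, as in the session above
)

# One string per column, not one per cell.
n_strings <- length(as.character(toy))

# Search each column separately, converting factors to their labels first.
hits <- lapply(toy, function(col) grep("Equity", as.character(col)))
hits$COL1  # row positions in COL1 whose text contains "Equity"
```

Here hits is a named list with one integer vector per column, so a column
without matches simply holds integer(0).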

> small_raw[[1]]
  [1] HBNNF US Equity
  [5]
  [9]
 [13]
 [17]
 [21]
 [25]
 [29]
 [33]
 [37]
 [41]
 [45]
 [49]
 [53]
 [57]
 [61]
 [65]
 [69]
 [73]
 [77]
 [81]
 [85]
 [89]
 [93]
 [97] 0.46 0.46
[101] 0.46 0.26
[105] 0.26 0.26
[109] 0.26 0.26
[113] 0.26 0.26
[117] 0.26 0.26
[121] 0.26 0.26
[125] 0.26 0.26
[129] 0.26 0.26
[133] 0.26 0.26
[137] 0.26 0.26
[141] 0.26 0.26
[145] 0.26 0.26
[149] 0.26   
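
For the stacking itself, one possible approach is to cut the wide table into
per-fund blocks and bind them together row-wise. This is only a sketch under
stated assumptions: every fund is assumed to occupy exactly four adjacent
columns, with the fund name and the labels in the first row of the block and
the data below it. The toy table (wide), block_width and stack_block are
hypothetical names and values, not taken from the real file, and the real data
may need extra cleaning (for example the #NAME? row).

```r
# Toy wide table: two funds, four columns each (hypothetical layout and values).
wide <- data.frame(
  V1 = c("HBNNF US Equity", "", ""),
  V2 = c("Date", "7/7/2008", "7/17/2008"),
  V3 = c("EQY_SH_OUT", "#N/A N/A", "#N/A N/A"),
  V4 = c("PX_VOLUME", "105000", "59"),
  V5 = c("HND CN Equity", "", ""),
  V6 = c("Date", "1/17/2008", "1/18/2008"),
  V7 = c("EQY_SH_OUT", "#N/A N/A", "#N/A N/A"),
  V8 = c("PX_VOLUME", "28000", "25000"),
  stringsAsFactors = FALSE
)

block_width <- 4                                # assumed columns per fund
starts <- seq(1, ncol(wide), by = block_width)  # first column of each block

# Turn one block into a long piece with the fund name repeated on every row.
stack_block <- function(start) {
  block <- wide[-1, start:(start + block_width - 1)]  # drop the label row
  data.frame(
    Fund_name  = wide[1, start],
    Date       = block[[2]],
    EQY_SH_OUT = block[[3]],
    PX_VOLUME  = block[[4]],
    stringsAsFactors = FALSE
  )
}

long <- do.call(rbind, lapply(starts, stack_block))
long <- long[long$Date != "", ]  # drop padding rows where a fund has no date
rownames(long) <- NULL
long
```

When reading the real file, passing na.strings = "#N/A N/A" to read.csv would
turn those placeholders into NA, which may make the numeric columns easier to
convert afterwards.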

Re: [R] Stacking several vectors from the list

2010-06-28 Thread Gabor Grothendieck
On Mon, Jun 28, 2010 at 7:30 PM,  astar...@uci.edu wrote:
> Hi everybody,
>
> I'm working on the very
> messy data, I have tried to clean it up in SAS and
> SAS/IML but there is not enough info on how to handle certain things
> in SAS so I have turned to R. The thing itself should be rather
> simple, so i was wondering if someone could help me out.
>
> The original .csv has ([1] 7138 6338 ) dimensions with funds with the
> corresponding dates and observations for each date for around 10 years and
> 4000+ funds, meaning in COL5 has the next fund's name and so on.
>
> COL1             COL2       COL3        COL4
> HBNNF US Equity  Date       EQY_SH_OUT  PX_VOLUME
>                  #NAME?     #N/A N/A    135000
>                  7/7/2008   #N/A N/A    105000
>                  7/17/2008  #N/A N/A    59
>                  7/22/2008  #N/A N/A    4
>
>
> so in R this .csv is somehow read as list (using typeof) and not as
> dataframe, and a lot of stuff like regexpr searches in the

The typeof of a data.frame is list so you do have a data frame --
not a list.  Perhaps the problem is that you do not want factor
columns but want character columns instead.  Use read.csv(..., as.is =
TRUE)
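
To illustrate the point, here is a small sketch comparing factor and character
columns. The CSV text is made up, and stringsAsFactors = TRUE is set explicitly
only to reproduce the behaviour described in this thread (since R 4.0.0 the
read.csv default is character columns):

```r
# A tiny CSV held in a string so the example needs no file (values are made up).
csv_text <- "COL1,COL2\nHBNNF US Equity,7/7/2008\nHND CN Equity,1/17/2008\n"

# stringsAsFactors = TRUE reproduces the pre-4.0 default of read.csv.
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
as_text   <- read.csv(text = csv_text, as.is = TRUE)

typeof(as_factor)       # "list": the storage type of every data frame
class(as_factor)        # "data.frame"
class(as_factor$COL1)   # "factor"
class(as_text$COL1)     # "character"

# With character columns, pattern searches on a column behave as expected.
grep("Equity", as_text$COL1)  # 1 2
```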



Re: [R] Stacking several vectors from the list

2010-06-28 Thread Gabor Grothendieck
On Mon, Jun 28, 2010 at 7:40 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
> On Mon, Jun 28, 2010 at 7:30 PM, astar...@uci.edu wrote:
>> Hi everybody,
>>
>> I'm working on the very
>> messy data, I have tried to clean it up in SAS and
>> SAS/IML but there is not enough info on how to handle certain things
>> in SAS so I have turned to R. The thing itself should be rather
>> simple, so i was wondering if someone could help me out.
>>
>> The original .csv has ([1] 7138 6338 ) dimensions with funds with the
>> corresponding dates and observations for each date for around 10 years and
>> 4000+ funds, meaning in COL5 has the next fund's name and so on.
>>
>> COL1             COL2       COL3        COL4
>> HBNNF US Equity  Date       EQY_SH_OUT  PX_VOLUME
>>                  #NAME?     #N/A N/A    135000
>>                  7/7/2008   #N/A N/A    105000
>>                  7/17/2008  #N/A N/A    59
>>                  7/22/2008  #N/A N/A    4
>>
>>
>> so in R this .csv is somehow read as list (using typeof) and not as
>> dataframe, and a lot of stuff like regexpr searches in the
>
> The typeof of a data.frame is list so you do have a data frame --
> not a list.  Perhaps the problem is that you do not want factor
> columns but want character columns instead.  Use read.csv(..., as.is =
> TRUE)


Just to be clear: a data frame is a list, so "not a list" means "not just a
list" -- it's also a data frame.
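
A short sketch of that relationship, using made-up objects (df and plain are
hypothetical names): every data frame passes is.list(), but a plain list does
not pass is.data.frame().

```r
df    <- data.frame(x = 1:3, y = c("a", "b", "c"))
plain <- list(x = 1:3, y = c("a", "b", "c"))

is.list(df)           # TRUE
is.data.frame(df)     # TRUE
is.list(plain)        # TRUE
is.data.frame(plain)  # FALSE

# List-style access works on a data frame because it is a list of columns.
df[["x"]]   # the column x
length(df)  # 2, the number of columns
```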
