[R] Unique subsetting question

2010-09-22 Thread AndrewPage

Hi all,

I'm looking at a large data set, and I'm interested in removing rows where
only one variable is duplicated.  Here's an example:

> presidents
     Qtr1 Qtr2 Qtr3 Qtr4
1945   NA   87   82   75
1946   63   50   43   32
1947   35   60   54   55
1948   36   39   NA   NA
1949   69   57   57   51
1950   45   37   46   39
1951   36   24   32   23
1952   25   32   NA   32
1953   59   74   75   60
1954   71   61   71   57
1955   71   68   79   73
1956   76   71   67   75
1957   79   62   63   57
1958   60   49   48   52
1959   57   62   61   66
1960   71   62   61   57
1961   72   83   71   78
1962   79   71   62   74
1963   76   64   62   57
1964   80   73   69   69
1965   71   64   69   62
1966   63   46   56   44
1967   44   52   38   46
1968   36   49   35   44
1969   59   65   65   56
1970   66   53   61   52
1971   51   48   54   49
1972   49   61   NA   NA
1973   68   44   40   27
1974   28   25   24   24

See how in 1954 and 1955, the Qtr1 approval rating is the same?  Let's say I
wanted to return the presidents data frame, but only have unique values for
Qtr1.  It doesn't matter which years are displayed for duplicated values-- it
just matters that each value is not displayed more than once.  Any way I can
do this but still have it be a data frame that shows Qtr2, 3, and 4 values?

Thanks in advance,
Andrew
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550453.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unique subsetting question

2010-09-22 Thread AndrewPage

I understand how duplicated and unique work for a list where all parts of a
given row are duplicated, or how to find duplicated values if I'm just
looking at that first column, but in this case  the rows for 1954 and 1955
are not completely the same; only quarter 1 is duplicated, so I'm not sure
how to apply either duplicated or unique in that case.
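
One possible way to apply duplicated() to a single column is to use it as a
row filter.  This is only a sketch: presidents is a built-in quarterly time
series rather than a data frame, so it is reshaped first, and the names pres
and uniq are made up for illustration.

```r
# Reshape the built-in quarterly series into a year-by-quarter data frame.
pres <- as.data.frame(matrix(as.numeric(presidents), ncol = 4, byrow = TRUE,
                             dimnames = list(as.character(1945:1974),
                                             paste0("Qtr", 1:4))))

# duplicated() flags the second and later occurrences of each Qtr1 value,
# so negating it keeps the first row seen for every distinct value.
uniq <- pres[!duplicated(pres$Qtr1), ]
```

The other three columns come along unchanged because whole rows are selected;
only the first year with a given Qtr1 value is kept (1954, not 1955).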

Thanks,
Andrew
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550651.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Unique subsetting question

2010-09-22 Thread AndrewPage

I just figured that out, but the real data I'm using is a data frame for
sure, so I'll find another example.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550736.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Unique subsetting question

2010-09-22 Thread AndrewPage

How about this:


s = c("aa", "bb", "cc", "", "aa", "dd", "", "aa") 

n = c(2, 3, 5, 6, 7, 8, 9, 3) 

b = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE) 

df = data.frame(n, s, b)   # df is a data frame 


I want to display df with no value in s occurring more than once.  Also, I
want to delete the rows where s contains "".
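
A minimal sketch of one way to do both steps at once, assuming the blank
entries in s are empty strings; the names keep and result are made up for
illustration.

```r
s <- c("aa", "bb", "cc", "", "aa", "dd", "", "aa")
n <- c(2, 3, 5, 6, 7, 8, 9, 3)
b <- c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE)
df <- data.frame(n, s, b, stringsAsFactors = FALSE)

# Keep a row only if s is non-empty and this is the first time s appears.
keep <- df$s != "" & !duplicated(df$s)
result <- df[keep, ]
```

This keeps the first row for each of aa, bb, cc and dd and drops the rest.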
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550769.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Unique subsetting question

2010-09-22 Thread AndrewPage

Thanks-- that works for what I'm trying to do.  I was also wondering, in the
data frame example you gave, if I just wanted to get rid of rows where the
a value is 5, how would I do that?  
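
For reference, a sketch of the usual logical-index approach.  The data frame
from the reply is not shown in this thread, so df2 and its columns are
stand-ins.

```r
# Hypothetical data frame with a column named a.
df2 <- data.frame(a = c(1, 5, 3, 5), b = c("w", "x", "y", "z"),
                  stringsAsFactors = FALSE)

# Keep every row whose a value is not 5.
no_fives <- df2[df2$a != 5, ]
```

If a could contain NA, subset(df2, a != 5) may be the safer choice, since it
drops rows where the condition is NA instead of returning NA-filled rows.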
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550836.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Unique subsetting question

2010-09-22 Thread AndrewPage

Oops, yeah I didn't see that.

Thanks,
Andrew
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Unique-subsetting-question-tp2550453p2550865.html
Sent from the R help mailing list archive at Nabble.com.



[R] Finding the right url for RCurl

2010-08-05 Thread AndrewPage

Hi all,

I am using RCurl to try and download data from a website, but I'm having
trouble finding out what URL to use.  Here is the site:

http://www.invescopowershares.com/products/holdings.aspx?ticker=PGX

See how in the upper right, above the displayed sheet, there's a link to
download the data as a .csv file?  When I hit copy url and paste into
getURL in R, it doesn't work.  That's no surprise because there isn't a URL
in what gets pasted.  I was just wondering if there's any way around this.

Thanks in advance,

Andrew
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Finding-the-right-url-for-RCurl-tp2314163p2314163.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Finding the right url for RCurl

2010-08-05 Thread AndrewPage

Thanks for the help so far-- one interesting thing about this particular page
is that the data displayed on the website actually differs from the data you
can access with the download link.  The XML package command works, but the
table it produces in R has the following column names:



> x1 = readHTMLTable("http://www.invescopowershares.com/products/holdings.aspx?ticker=PGX",
+     which = 13, header = TRUE)
> colnames(x1)
[1]   Coupon Rate   Maturity Date Ratingâ\u0080 % Weight
Warning message:
it is not known that wchar_t is Unicode on this platform 



 whereas the .csv file you can get with the link has 8 columns,
including a PositionDate column, a Shares column, etc. that aren't
present on the page's table.

What makes this even more confusing is that the XML table contains MORE
information than is presented on the page, such as Maturity Date.

What I'm really looking for is a way to access the .csv file, so I doubt
that reading info from the webpage will be sufficient seeing as it seems to
be displaying different data.

--Andrew


-- 
View this message in context: 
http://r.789695.n4.nabble.com/Finding-the-right-url-for-RCurl-tp2314163p2315461.html
Sent from the R help mailing list archive at Nabble.com.



[R] Odd crash with tcl/tk

2010-07-28 Thread AndrewPage

Hi,

Recently, I've been trying to use packages in R that require loading the
Tcl/Tk interface.  However, I get a strange result and a crash that I
haven't been able to find discussion about on these boards (or any others).

When I enter library(tcltk), it reads "Loading Tcl/Tk interface ...", but
then never says "done" or displays some sort of error message.  Looks like
this:

> x11()
> library(tcltk)
Loading Tcl/Tk interface ... 
> 


Now you can type additional commands in, at your peril!  For example, if I
type in the text "library", nothing happens, but "library(" causes R to
freeze up irreparably, with executing:
try(gsub('\\s+','',paste(capture.output(print(args(library,collapse=)),silent=TRUE)
displayed at the bottom.  When this happens, there's nothing you can do but
restart R because it's completely frozen.

I'm running R version 2.11.1 Patched (2010-07-27 r52627)

[R.app GUI 1.35 (5603) i386-apple-darwin9.8.0]

with XQuartz 2.3.5 (xorg-server 1.4.2-apple53)

on a mac (snow leopard)

Thanks for any help/suggestions in advance,

Andrew
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Odd-crash-with-tcl-tk-tp2305032p2305032.html
Sent from the R help mailing list archive at Nabble.com.



[R] Command that is conditional upon file retrieval: is it possible?

2010-07-21 Thread AndrewPage

Hi all,

I'm currently working on an R program where I have to access an FTP server
to download some of the data I need.  However, the people who post up the
files I access are at times inconsistent with regard to time posted, if
they post at all, etc.  Here's some of the code I use:

library(RCurl)

url1 = paste("ftp://user:passw...@a.great.website.com/", "file", num1,
".csv", sep = "")

data1 = getURL(url1)

write(data1, file = paste("inMyFolder", num1, ".csv", sep = ""))


Sometimes this process works perfectly, and sometimes I get an error message
like this attached to data1 = getURL(url1):

Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) : 
  RETR response: 550

That's because that particular file isn't on the FTP server yet.  Now...
let's just imagine that there's another way for me to access the file
elsewhere, and I can drag it into the working directory with the same name
as the file I'm telling R to write immediately after finding it on the FTP
server.

So here's my question: is it possible to write a command that will write the
file if there isn't an error message going along with "data1 =
getURL(url1)", but won't write the file if it can't find it?  As of right
now, if I got the error message, dragged the file into the working directory
and ran the program again, R will overwrite my good file with an empty one
because in all cases, I'm telling it to write a file with that name that
includes the information in data1.
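
One way this could be handled is with tryCatch(), which lets the write step
be skipped when the download raises an error.  This is a sketch, not a tested
solution for that FTP server: fetch_and_write and its fetch argument are made
up for illustration, and fetch defaults to RCurl's getURL.

```r
# Download url and write it to dest only if the download succeeds.
# 'fetch' is the function that retrieves the data; RCurl::getURL by default.
fetch_and_write <- function(url, dest, fetch = RCurl::getURL) {
  # On any error (e.g. "RETR response: 550") return NULL instead of stopping.
  data1 <- tryCatch(fetch(url), error = function(e) NULL)
  if (is.null(data1)) {
    message("Could not retrieve ", url, "; leaving ", dest, " untouched")
    return(invisible(FALSE))
  }
  write(data1, file = dest)
  invisible(TRUE)
}
```

Called as fetch_and_write(url1, "some_file.csv"), an existing file with that
name is only overwritten when getURL actually returns something.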

Thanks in advance,
Andrew
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Command-that-is-conditional-upon-file-retrieval-is-it-possible-tp2297811p2297811.html
Sent from the R help mailing list archive at Nabble.com.



[R] Search and extract string function

2010-07-15 Thread AndrewPage

Hi all,

I'm trying to write a function that will search and extract from a long
character string, but with a twist: I want to use the characters before and
the characters after what I want to extract as reference points.  For
example, say I'm working with data entries that look like this:

Drink=Coffee:Location=Office:Time=Morning:Market=Flat

Drink=Water:Location=Office:Time=Afternoon:Market=Up

Drink=Water:Location=Gym:Time=Evening:Market=Closed

Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed


...

For my function, I'd like to find what's located between "Location=" and
":Time=" in every instance, and extract it, to return something like
"Office", "Office", "Gym", "Restaurant".

In a previous discussion I found
(http://tolstoy.newcastle.edu.au/R/help/05/03/0344.html), someone wrote a
function where you could find and substitute characters in a string, based
on pre and post variables:

interp <- function(x, e = parent.frame(), pre = "\\$", post = "") {
    for (el in ls(e)) {
        tag <- paste(pre, el, post, sep = "")
        if (length(grep(tag, x))) x <- gsub(tag, eval(parse(text = el), e), x)
    }
    x
}

I'm not sure how to modify it, however, to do what I want it to do.  Any
suggestions?
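
A rough sketch of one approach, using sub() with a capture group rather than
a modified interp.  The function name extract_between is made up, and pre and
post are treated as regular expressions, so any special characters in them
would need escaping.

```r
# Return the text between the first 'pre' and the next 'post' in each element
# of x; elements without a match become NA.
extract_between <- function(x, pre, post) {
  pattern <- paste0("^.*?", pre, "(.*?)", post, ".*$")
  out <- sub(pattern, "\\1", x, perl = TRUE)
  out[!grepl(pattern, x, perl = TRUE)] <- NA_character_
  out
}

entries <- c("Drink=Coffee:Location=Office:Time=Morning:Market=Flat",
             "Drink=Water:Location=Office:Time=Afternoon:Market=Up",
             "Drink=Water:Location=Gym:Time=Evening:Market=Closed",
             "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed")
locations <- extract_between(entries, "Location=", ":Time=")
```

With the four entries above, locations holds Office, Office, Gym and
Restaurant.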

Thanks in advance,

Andrew
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Search-and-extract-string-function-tp2290268p2290268.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Search and extract string function

2010-07-15 Thread AndrewPage

Actually I have one more question that's somewhat related-- I'm starting out
by importing a .txt file that isn't divided into vectors and is at times
inconsistent with regards to spacing, indents, etc., so I can't rely on
those.  It looks something like this:


Drink=Coffee:Location=Office:Time=Morning:Market=Flat 

Drink=Water:Location=Office:Time=Afternoon:Market=Up 

Drink=Water:Location=Gym:Time=Evening:Market=Closed 
Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed 
   Drink=Coffee:Location=Office:Time=Morning:Market=Flat 
Drink=Water:Location=Office:Time=Afternoon:Market=Up 

Drink=Water:Location=Gym:Time=Evening:Market=Closed 
Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed
Drink=Coffee:Location=Office:Time=Morning:Market=Flat 

Drink=Water:Location=Office:Time=Afternoon:Market=Up 

Drink=Water:Location=Gym:Time=Evening:Market=Closed 

Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed



How can I take a single string like this and divide it into a character
vector with twelve elements, like this:

> FixedData
 [1] "Drink=Coffee:Location=Office:Time=Morning:Market=Flat"
 [2] "Drink=Water:Location=Office:Time=Afternoon:Market=Up"
 [3] "Drink=Water:Location=Gym:Time=Evening:Market=Closed"
 [4] "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"
 [5] "Drink=Coffee:Location=Office:Time=Morning:Market=Flat"
 [6] "Drink=Water:Location=Office:Time=Afternoon:Market=Up"
 [7] "Drink=Water:Location=Gym:Time=Evening:Market=Closed"
 [8] "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"
 [9] "Drink=Coffee:Location=Office:Time=Morning:Market=Flat"
[10] "Drink=Water:Location=Office:Time=Afternoon:Market=Up"
[11] "Drink=Water:Location=Gym:Time=Evening:Market=Closed"
[12] "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"
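
A sketch of one way to get there, assuming the individual entries never
contain whitespace themselves, so any run of spaces, tabs or newlines can be
treated as a separator.  The variable raw stands in for the imported text.

```r
# Stand-in for the imported text: irregular blank lines and indentation.
raw <- paste("Drink=Coffee:Location=Office:Time=Morning:Market=Flat ",
             "",
             "Drink=Water:Location=Office:Time=Afternoon:Market=Up ",
             "   Drink=Water:Location=Gym:Time=Evening:Market=Closed",
             sep = "\n")

# Split on any run of whitespace, then drop the empty pieces that leading
# whitespace can leave behind.
FixedData <- strsplit(raw, "[[:space:]]+")[[1]]
FixedData <- FixedData[nzchar(FixedData)]
```

If the text is still in a file, scan(filename, what = "", quiet = TRUE) reads
whitespace-separated fields directly and should give the same kind of vector.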

Thanks again for all of the help!

--Andrew

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Search-and-extract-string-function-tp2290268p2290375.html
Sent from the R help mailing list archive at Nabble.com.
