[R] in par(mfrow=c(1, 2)), how to keep one half plot static and the other half changing

2012-11-27 Thread Baoqiang Cao
Hi,

I'm trying to set up the following plot and would appreciate your
help.

I'd like two plots shown in the same plot window. The left one is a
bird's-eye view of the whole data set; the right one keeps changing,
i.e., a different plot is shown on request, so that when I click
somewhere in the left plot, the right plot shows the corresponding
detail.

What I did is:

par(mfrow=c(1,2))
plot(x, y)

while(1) {
...
pxy <- locator(1, type="p")

#select data point (dx,dy) based on pxy for a new plot
..

plot(dx,dy)
}

The result is that the left plot gets overwritten by plot(dx,dy). Is
there any way to keep the left side intact while changing the plot on
the right side?


Thanks a lot!

Baoqiang
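
One possible way to get this behaviour, sketched under assumptions rather than as the definitive answer: instead of keeping the left panel static, redraw both panels after every click. The sketch swaps par(mfrow=c(1,2)) for layout() so that the overview is drawn last, which means locator() reports positions in the overview's coordinates. The helper names nearest_point and draw_pair, the random data, and the "20 nearest neighbours" selection rule are all hypothetical placeholders.

```r
# Index of the data point closest to a clicked position (px, py),
# measured in the plot's user coordinates.
nearest_point <- function(x, y, px, py) {
  which.min((x - px)^2 + (y - py)^2)
}

# Redraw the whole window: detail on the right, overview on the left.
draw_pair <- function(x, y, sel) {
  # layout() cell values are figure numbers: figure 1 goes to the right
  # cell and figure 2 to the left, so the overview is drawn last.
  layout(matrix(c(2, 1), nrow = 1))
  plot(x[sel], y[sel], main = "Detail")
  plot(x, y, main = "Overview")
  points(x[sel], y[sel], col = "red", pch = 19)  # mark the selection
}

if (interactive()) {
  set.seed(1)
  x <- rnorm(200); y <- rnorm(200)   # placeholder data
  draw_pair(x, y, seq_along(x))
  repeat {
    pxy <- locator(1)                # NULL when the user ends the selection
    if (is.null(pxy)) break
    i <- nearest_point(x, y, pxy$x, pxy$y)
    sel <- order((x - x[i])^2 + (y - y[i])^2)[1:20]  # 20 nearest neighbours
    draw_pair(x, y, sel)
  }
}
```

Redrawing the overview costs a little time on large data sets, but it avoids having to save and restore the coordinate system of the left panel by hand.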

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lm on matrix data

2012-10-10 Thread Baoqiang Cao
Hi,

I have a question about using lm on a matrix. I have to admit it is
very trivial, but I couldn't find the answer after searching the
mailing list and other online tutorials. It would be great if you
could help.

I have a matrix trainx of 492 (rows) by 220 (columns) that is my x,
and trainy is 492 by 1. I also have the new data testx, which is 240
(rows) by 220 (columns). Here is what I got:

py <- predict(lm(trainy ~ trainx), data.frame(testx))
Warning message:
'newdata' had 240 rows but variable(s) found have 492 rows

The fitting formula I intended is: trainy ~ trainx[,1] + trainx[,2] +
.. +trainx[,220].

Any help, please?

Best,
Baoqiang
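
A likely explanation for the warning, offered as a sketch: the formula term trainx refers to the matrix in the workspace, and the new data frame contains no variable of that name, so predict falls back on the 492-row training matrix. One common workaround is to put the predictors in a data frame, so that each column becomes a named variable, and fit with y ~ . instead. The data below are simulated placeholders with much smaller dimensions than in the post, and the name train_df is hypothetical.

```r
set.seed(1)
trainx <- matrix(rnorm(50 * 5), nrow = 50)   # placeholder for the 492 x 220 matrix
trainy <- matrix(rnorm(50), ncol = 1)        # placeholder for the 492 x 1 response
testx  <- matrix(rnorm(20 * 5), nrow = 20)   # placeholder for the 240 x 220 matrix

# data.frame() names the columns of an unnamed matrix X1, X2, ...
train_df <- data.frame(y = as.vector(trainy), trainx)
fit <- lm(y ~ ., data = train_df)            # y ~ X1 + X2 + ... + X5

# The test matrix gets the same X1, X2, ... names, so predict() can match them.
py <- predict(fit, newdata = data.frame(testx))
length(py)                                   # one prediction per row of testx
```

If the matrices already carry column names, the training and test names have to agree for this to work.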



Re: [R] tm package: problem of TermDocumentMatrix and minWordLength

2012-05-16 Thread Baoqiang Cao
Try this:

dtm <- DocumentTermMatrix(examplecorpus, control = list(wordLengths=c(1,100)))



On Wed, May 16, 2012 at 6:22 AM, C.H. chainsawti...@gmail.com wrote:
 Dear All,

 The following code illustrates the problem.

 [R code]
 require(tm)
 exampledoc <- c("R is good", "R is really good")
 examplecorpus <- Corpus(VectorSource(exampledoc), encoding = "UTF-8")
 dtm <- DocumentTermMatrix(examplecorpus, control = list(minWordLength = 1))
 as.matrix(dtm)
 [/R code]

 The terms "R" and "is" were not included in the dtm even though the
 control parameter minWordLength was set to 1.

    Terms
 Docs good really
   1    1      0
   2    1      1

 Can you reproduce this problem?

 The following is my sessionInfo

 sessionInfo()
 R version 2.15.0 (2012-03-30)
 Platform: i686-pc-linux-gnu (32-bit)

 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] tm_0.5-7.1

 loaded via a namespace (and not attached):
 [1] compiler_2.15.0 slam_0.1-23     tools_2.15.0

 Regards,

 CH



[R] error while install RCurl

2011-06-21 Thread Baoqiang Cao
Hi,

I got an error when I tried to install RCurl. Here is what I did:

R CMD INSTALL RCurl_1.6-6.tar.gz
* installing to library ‘/home/b/R/i486-pc-linux-gnu-library/2.13’
* installing *source* package ‘RCurl’ ...
checking for curl-config... no
Cannot find curl-config
ERROR: configuration failed for package ‘RCurl’
* removing ‘/home/b/R/i486-pc-linux-gnu-library/2.13/RCurl’


My R version is:

> version
               _
platform       i486-pc-linux-gnu
arch           i486
os             linux-gnu
system         i486, linux-gnu
status
major          2
minor          13.0
year           2011
month          04
day            13
svn rev        55427
language       R
version.string R version 2.13.0 (2011-04-13)

Any help, please?

Baoqiang
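
The configure step fails because it cannot find curl-config, a helper script that ships with the libcurl development files rather than with R or RCurl. A minimal check from within R, assuming a Unix-like system, is sketched below. The helper name has_curl_config is hypothetical, and the package name in the message (libcurl4-openssl-dev, the usual one on Debian and Ubuntu) is an assumption about this particular machine.

```r
# TRUE if the curl-config script is visible on the PATH used by R.
has_curl_config <- function() {
  unname(nzchar(Sys.which("curl-config")))
}

if (has_curl_config()) {
  message("curl-config found at ", Sys.which("curl-config"))
} else {
  message("curl-config not found: install the libcurl development files ",
          "(for example libcurl4-openssl-dev on Debian/Ubuntu), ",
          "then rerun R CMD INSTALL")
}
```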



[R] about spearman and kendal correlation coefficient calculation in cor

2011-05-16 Thread Baoqiang Cao
Hi,

I have the following two measurements stored in mat:

> print(mat)
           [,1]     [,2]
 [1,] -14.80976 -265.786
 [2,] -14.92417  -54.724
 [3,] -13.92087  -58.912
 [4,]  -9.11503 -115.580
 [5,] -17.05970 -278.749
 [6,] -25.23313 -219.513
 [7,] -19.62465 -497.873
 [8,] -13.92087 -659.486
 [9,] -14.24629 -131.680
[10,] -20.81758 -604.961
[11,] -15.32194  -18.735

To calculate the rank correlation, I used cor:
> cor(mat[,1], mat[,2], method="spearman")
[1] 0.2551259
> cor(mat[,1], mat[,2], method="kendall")
[1] 0.1834940

However, when I tried to reproduce the two correlation coefficients by
following their definitions, I got different results from cor.
For Spearman, I got:
 
od1 = order(mat[,1], decreasing=T)
od2= order(mat[,2], decreasing=T)  
 1-6*sum((od1-od2)^2)/(length(od1)^3-length(od1))
[1] -0.2909091

This differs from the cor result, which is 0.255.

For Kendall, I got:

accord=0
disaccord=0

experi=mat[,1]
target=mat[,2]
N= length(experi)
for(i in 1:(N-1)) {
   for(j in (i+1):N) {
      if((target[i] < target[j]) && (experi[i] < experi[j])) {
         accord=accord+1
      } else if ((target[i] > target[j]) && (experi[i] > experi[j])) {
         disaccord=disaccord+1
      }
   }
}

 (accord-disaccord)/(N*(N-1)/2)
[1] -0.2181818

This also differs from the cor result, which is 0.183.

Could anybody help me out by explaining the right answer? Thanks in
advance!

Baoqiang
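
Two things probably explain the mismatch; this is a sketch under stated assumptions, not a definitive diagnosis. First, order() returns the permutation that sorts a vector, not the rank of each element, so rank() is the function the textbook formulas need. Second, rows 3 and 8 tie in the first column, so the shortcut 1 - 6*sum(d^2)/(n^3 - n) is no longer exact. cor computes Spearman's rho as the Pearson correlation of the tie-averaged ranks, and its Kendall coefficient is tau-b: a pair is concordant whenever both variables order it the same way (both increase or both decrease), discordant when they order it oppositely, and the denominator is corrected for ties. The variable names below are illustrative.

```r
x <- c(-14.80976, -14.92417, -13.92087, -9.11503, -17.05970, -25.23313,
       -19.62465, -13.92087, -14.24629, -20.81758, -15.32194)
y <- c(-265.786, -54.724, -58.912, -115.580, -278.749, -219.513,
       -497.873, -659.486, -131.680, -604.961, -18.735)

# Spearman: Pearson correlation of the ranks (rank() averages ties).
rho <- cor(rank(x), rank(y))

# Kendall tau-b: sum of sign products over all pairs, with a
# tie-corrected denominator.
n <- length(x)
s <- 0; nx <- 0; ny <- 0
for (i in 1:(n - 1)) {
  for (j in (i + 1):n) {
    sx <- sign(x[i] - x[j])
    sy <- sign(y[i] - y[j])
    s  <- s + sx * sy      # +1 concordant, -1 discordant, 0 if tied
    nx <- nx + sx^2        # pairs not tied in x
    ny <- ny + sy^2        # pairs not tied in y
  }
}
tau <- s / sqrt(nx * ny)

c(rho = rho, tau = tau)    # compare with cor(x, y, method = ...)
```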



Re: [R] about spearman and kendal correlation coefficient calculation in cor

2011-05-16 Thread Baoqiang Cao
Thank you very much Thomas!
I indeed learned a lot.

Baoqiang

On Tue, 2011-05-17 at 09:46 +1200, Thomas Lumley wrote:
 }



[R] help on compare two ranks

2011-04-14 Thread Baoqiang Cao
Hi,

I have one set of inputs and the observed true value of each input. I
also have a model that takes an input and predicts a value. I only
consider the ranks based on either the observed true values or the
predicted values. My question is: how do I compare these two rankings
in R? That is, how close is the ranking from the predictions to the
ranking from the true observations?

x_1: y_1, p_1,
x_2: y_2, p_2,
...
x_m: y_m, p_m,

R_y is the ranking based on {y_1, y_2,..,y_m}. R_p is the ranking from
{p_1, p_2, ..., p_m}. How do I know in R how good/bad R_p is given
R_y?

I searched r-help and got some clues about using wilcox.test, but I am
still a bit confused about how to compare the predicted ranking
against the true ranking. Any advice will be highly appreciated. I
apologize if this is too much of a statistics question.

Thanks in advance.
Baoqiang
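
One common starting point, given here as a sketch and not as the only reasonable choice, is a rank correlation between the observed values and the predictions: Spearman's rho or Kendall's tau, both available through cor and cor.test. A value near 1 means R_p orders the inputs almost exactly as R_y does. The five observed and predicted values below are made up purely for illustration.

```r
y <- c(3.1, 0.5, 2.2, 4.8, 1.0)   # hypothetical observed values
p <- c(2.9, 0.9, 1.8, 5.1, 0.7)   # hypothetical model predictions

rho <- cor(y, p, method = "spearman")  # agreement of the two rankings
tau <- cor(y, p, method = "kendall")   # based on concordant/discordant pairs

# cor.test() adds a p-value for the null hypothesis of no association.
kt <- cor.test(y, p, method = "kendall")

c(rho = rho, tau = tau, p.value = kt$p.value)
```

Only one pair of inputs (the second and the fifth) is ordered differently by y and p in this toy data set, which is why both coefficients come out high.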



[R] use pcls to solve least square fitting with constraints

2010-12-06 Thread Baoqiang Cao
Hi,

I have a least-squares fitting problem with linear inequality
constraints. pcls seems capable of solving it, so I tried it.
Unfortunately, it gets stuck with the following error:
> M <- list()
> M$y = Dmat[,1]
> M$X = Cmat
> M$Ain = as.matrix(Amat)
> M$bin = rep(0, dim(Amat)[1])
> M$p=qr.solve(as.matrix(Cmat), Dmat[,1])
> M$w = rep(1, length(M$y))
> M$C = matrix(0,0,0)
> p <- pcls(M)
Error in t(qr.qty(qra, t(M$X))[(j + 1):k, ]) :
  error in evaluating the argument 'x' in selecting a method for function 't'

After some searching, I still couldn't find a solution. Any help
and/or advice will be highly appreciated!

Baoqiang



[R] how to know if a file exists on a remote server?

2010-11-30 Thread Baoqiang Cao
Hi,

I'd like to download some data files from a remote server. The problem
is that some of the files don't actually exist, which I can't know
before trying. Is there a function in R that could tell me whether a
file exists on a remote server? I searched this mailing list and,
after reading several mails, I'm still clueless. Any help will be
highly appreciated.

B.C.



Re: [R] how to know if a file exists on a remote server?

2010-11-30 Thread Baoqiang Cao
Thanks Steven!
It is excellent code indeed!

On Tue, Nov 30, 2010 at 11:26 AM, steven mosher mosherste...@gmail.com wrote:
  I would use RCurl.

  if you have, for example, the url of an ftp site you can merely do a
 getURL() and the contents will be returned. That call will return data that
 can be coerced into a data.frame that will look like a directory structure
 listing the file names.

 If you need code just ask, but the RCurl docs are pretty good.





Re: [R] how to know if a file exists on a remote server?

2010-11-30 Thread Baoqiang Cao
Thanks again, Steven!
I have to say that this code is fairly sophisticated for me, but I'm
enjoying using it already!

BC

On Tue, Nov 30, 2010 at 12:02 PM, steven mosher mosherste...@gmail.com wrote:
 No problem, you can also  get the directory with a curlOption of dirlistonly

 see the example code in the package. This will depend on the version of
 libcurl that you have.

 If you have an older version, my code will get you the directory.

 From the RCurl examples:

 the files within a directory.
 url = 'ftp://ftp.wcc.nrcs.usda.gov/data/snow/snow_course/table/history/idaho/'
 filenames = getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE)

   # Deal with newlines as \n or \r\n. (BDR)
   # Or alternatively, instruct libcurl to change \n's to \r\n's for us with crlf = TRUE
   # filenames = getURL(url, ftp.use.epsv = FALSE, ftplistonly = TRUE, crlf = TRUE)
 filenames = paste(url, strsplit(filenames, "\r*\n")[[1]], sep = "")
 con = getCurlHandle(ftp.use.epsv = FALSE)
 contents = sapply(filenames[1:5], getURL, curl = con)
 names(contents) = filenames[1:length(contents)]




Re: [R] how to know if a file exists on a remote server?

2010-11-30 Thread Baoqiang Cao
Hi Georg,

Your code does work: it doesn't give me any error message, which is
critical for me because I need to use it in a loop and I don't know
how to catch error messages. Before your message, I was using
download.file, but the loop stopped with an error whenever a file
didn't exist. So I guess the option method="wget" made the
difference.

To summarize (in case it is useful to others), there are (at least)
two ways to download files:

1) Georg Ruß:
v = download.file(url, destf, method="wget")
if(v!=0) {
#download.file failed
}
#no error message though

2)

Henrique Dallazuanna and Steven Mosher both suggested using RCurl,
here is an example code from Henrique for checking if a file exists on
a server:

library(RCurl)
h = basicHeaderGatherer()
Lines <- getURI("http://www.pdb.org/pdb/files/2J0S.1001",
headerfunction = h$update)
h$value()[['status']]

If the status is 404, the file was not found. If it exists, the status should be 200.


What a productive day!

BC
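
For the part about not knowing how to catch the error: a minimal sketch with tryCatch that keeps a loop going whichever download method is used. The wrapper name safe_download is hypothetical, and treating any error or non-zero status as "file not available" is an assumption that may be too coarse for some servers.

```r
# Try to download `u` to `destfile`; return TRUE on success and FALSE on
# any failure, without stopping the calling loop.
safe_download <- function(u, destfile, ...) {
  status <- tryCatch(
    suppressWarnings(download.file(u, destfile, quiet = TRUE, ...)),
    error = function(e) -1L          # turn an R error into a failure code
  )
  ok <- identical(as.integer(status), 0L)
  if (!ok && file.exists(destfile)) {
    unlink(destfile)                 # do not leave partial files behind
  }
  ok
}

# Usage sketch: `urls` and `dests` would be character vectors of equal length.
# ok <- mapply(safe_download, urls, dests)
```

A TRUE result only says that the transfer finished without an error, so checking the size or content of the downloaded file is still worthwhile.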
On Tue, Nov 30, 2010 at 1:34 PM, Georg Ruß resea...@georgruss.de wrote:
 On 30/11/10 10:10:07, Baoqiang Cao wrote:
 I'd like to download some data files from a remote server, the problem
 here is that some of the files actually don't exist, which I don't
 know before try. Just wondering if a function in R could tell me if a
 file exists on a remote server?

 Hi Baoqiang,

 try downloading the file with R's download.file() function. Then you
 should examine the returned value.

 Citing a part of ?download.file below:

 Value:
 An (invisible) integer code, ‘0’ for success and non-zero for
 failure.  For the ‘wget’ and ‘lynx’ methods this is the status
 code returned by the external program.  The ‘internal’ method can
 return ‘1’, but will in most cases throw an error.

 So if you call your download via

 v <- download.file(url, destfile, method="wget")

 and v is not equal to zero, then the file is likely to be non-existent (at
 least the download failed). Note: the method "internal" doesn't really
 change the value of v, I just tried that. With "wget" it returns 0 for
 success and 2048 (or some other value) for non-success.

 Regards,
 Georg.
 --
 Research Assistant
 Otto-von-Guericke-Universität Magdeburg
 resea...@georgruss.de
 http://research.georgruss.de




Re: [R] how to know if a file exists on a remote server?

2010-11-30 Thread Baoqiang Cao
I must say that your reminder describes exactly what happened to me:
the file is there, but some of the downloaded files are corrupted.
download.file didn't return anything alarming, but I just couldn't
open some files. In my case the problem was solved by setting
method="wget".

Thanks again!

BC

On Tue, Nov 30, 2010 at 5:39 PM, steven mosher mosherste...@gmail.com wrote:
  study tryCatch()

  also, be aware that even with RCurl, you may find the file there and
 then fail or lose the connection.

 worse still you may get a corrupt file on download. So there is a lot of
 checking to do to make bullet proof code that downloads files.







[R] problem with catching error message in open function

2008-09-24 Thread Baoqiang Cao

Hi R-helpers,

I'm extracting data from a public server. Because of a restriction, I
have to submit my entries one by one (I have 10^5 entries).
I partially succeeded using


tf <- open(url, "r")
if(isOpen(tf, "r")) {readLines(tf)}
close(tf)

Some entries were successfully returned and read by readLines; then,
all of a sudden, the program stopped with the following error:


The error message is
unable to open connection
cannot open: HTTP status was '502 Bad Gateway'
Execution halted

Is there any way I could catch the error from the open function?
I'd like the program not to halt when a bad connection happens.
Any help, please?


Best,
Baoqiang
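
A minimal sketch of one way to keep such a loop alive: wrap the read in tryCatch so that a failed connection yields NULL instead of halting the script. The function name read_url_safely is hypothetical, and returning NULL for every kind of failure (a 502, a timeout, a missing page) is an assumption; a real script might want to retry or log the message.

```r
# Read all lines from a URL; return NULL instead of stopping when the
# connection cannot be opened or read.
read_url_safely <- function(u) {
  con <- url(u)
  on.exit(try(close(con), silent = TRUE))   # always release the connection
  tryCatch(
    suppressWarnings(readLines(con)),       # readLines opens the connection
    error = function(e) {
      message("skipping ", u, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Usage sketch: `entry_urls` would be the character vector of query URLs.
# results <- lapply(entry_urls, read_url_safely)
# failed  <- vapply(results, is.null, logical(1))
```

Entries flagged in `failed` can be resubmitted later, which is often enough for transient errors such as a bad gateway.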



[R] function as.dist failed on large matrix on 64bit machine

2008-08-14 Thread Baoqiang Cao

Hi there,

I'm having a problem with as.dist when I try to convert a numerical
matrix to a dist object. The data matrix is 10^4 by 10^4. I got the following:


d <- as.dist(dat)
Error: cannot allocate vector of size 762.9Mb

I need to convert dat to dist because I will use hclust to do some
clustering analysis.


The machine is 64-bit Linux with 2Gb of memory, and it also failed on
a 64-bit Linux machine with 16Gb of memory. Is there any way I can
get around it? Or what is the size limit on my machine? Thanks in
advance.


Best,
Baoqiang
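
As a rough sanity check on the numbers: a 10^4 by 10^4 matrix of doubles takes 8 bytes per element, which works out to the 762.9Mb in the error message. That suggests the failed allocation was one more full-size copy of the matrix, on top of dat itself. The arithmetic can be sketched as follows; the helper name bytes_to_mb is hypothetical, and "Mb" is taken to mean 2^20 bytes, as in R's message.

```r
bytes_to_mb <- function(bytes) bytes / 2^20

n <- 1e4
full_matrix_mb <- bytes_to_mb(n * n * 8)            # one n x n double matrix
dist_object_mb <- bytes_to_mb(n * (n - 1) / 2 * 8)  # lower triangle kept by dist

round(c(full_matrix = full_matrix_mb, dist_object = dist_object_mb), 1)
```

So dat, one temporary copy, and the resulting dist object together need roughly 1.9 Gb, which is tight on the 2Gb machine even before counting anything else in the workspace.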
