Re: [R] own TAB expansion

2010-10-09 Thread Deepayan Sarkar
On Fri, Oct 8, 2010 at 6:19 AM, Sebastian Gibb li...@sebastiangibb.de wrote:
 Hello Duncan,

 thanks for your advice, but it doesn't work as expected:

 setClass(Class="A", representation=representation(slotA="numeric",
 slotB="numeric"));
 setMethod("$", "A", function(x, name) {return(slot(x, name));})
 setGeneric(".DollarNames")
 setMethod(".DollarNames", signature(x="A"), function(x,
 pattern)grep(pattern=pattern, x=c("slotA", "slotB"), value=T))

 a <- new("A", slotA=1, slotB=2)
 a$sl  TAB
 # doesn't print slotA/slotB
 a$

 What am I doing wrong?

There is a namespace issue with making .DollarNames() generic;
basically, the completion code in the utils namespace never sees the
new S4 generic. See a previous discussion at

http://www.mail-archive.com/r-de...@r-project.org/msg20553.html

Defining an S3 method should work (without the need for a dummy S3
class even with inheritance if you are working with R 2.12):

.DollarNames.A <-
function(x, pattern) {
    grep(pattern=pattern, x=c("slotA", "slotB"), value=TRUE)
}
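
For reference, the S3 approach can be tried end to end with a sketch like the one below. It reuses the class from the question; looking the names up with slotNames() instead of hard-coding them is a variation assumed here, not something suggested in the thread.

```r
library(methods)

# Toy S4 class mirroring the one in the question.
setClass("A", representation(slotA = "numeric", slotB = "numeric"))
setMethod("$", "A", function(x, name) slot(x, name))

# S3 method for the utils::.DollarNames generic. slotNames() avoids
# hard-coding the slot names (an assumption of this sketch).
.DollarNames.A <- function(x, pattern = "") {
  grep(pattern, slotNames(x), value = TRUE)
}

a <- new("A", slotA = 1, slotB = 2)

# This is the call the completion code makes when TAB is pressed after a$.
completions <- utils::.DollarNames(a, "^slot")
print(completions)
```

In an interactive session, typing a$sl followed by TAB should then offer the two slot names.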

-Deepayan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] point characters THICKER in xyplot()

2010-10-09 Thread Deepayan Sarkar
On Fri, Oct 8, 2010 at 5:19 PM, array chip arrayprof...@yahoo.com wrote:
 Hi, how can I make the point characters thicker (NOT larger) in xyplot when
 groups= argument is used?

 dat <- data.frame(x=1:100,y=1:100,group=rep(LETTERS[1:5],each=20))

    ### lwd=2 doesn't work here
 xyplot(y~x,groups=group,data=dat,col=1:4,pch=1:4,lwd=2)

    ### lwd=2 works with panel.points(), but grouping is messed up!
 xyplot(y~x,groups=group,data=dat,col=1:4,pch=1:4,
    panel=function(...) {panel.points(...,lwd=2)})

    ### group is correct with panel.superpose(), but lwd=2 doesn't work!
 xyplot(y~x,groups=group,data=dat,col=1:4,pch=1:4,
    panel=function(...) {panel.superpose(...,lwd=2)})

 Any suggestions?

xyplot(y~x,groups=group,data=dat,col=1:4,pch=1:4,lwd=2,
   panel = panel.superpose, panel.groups = panel.points)

panel.xyplot() should also honor lwd at some point (but I haven't
gotten around to it yet).
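
A self-contained version of the call above, as a sketch: the data frame is the one from the question, and using five colours and symbols (one per group) rather than four is an assumption made here so that no group reuses a symbol.

```r
library(lattice)

dat <- data.frame(x = 1:100, y = 1:100,
                  group = rep(LETTERS[1:5], each = 20))

# panel.superpose() splits the data by group and hands each subset,
# together with its graphical parameters, to panel.points().
p <- xyplot(y ~ x, groups = group, data = dat,
            col = 1:5, pch = 1:5, lwd = 2,
            panel = panel.superpose, panel.groups = panel.points)

# print(p) draws the plot on the active graphics device.
```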

-Deepayan



Re: [R] how to retrieve user coordinates in xyplot

2010-10-09 Thread Deepayan Sarkar
On Fri, Oct 8, 2010 at 3:52 PM, array chip arrayprof...@yahoo.com wrote:
 Hi, is there a way to retrieve the extremes of the user coordinates of the
 plotting region, like what par(usr) does in general graphics? I'd like to 
 use
 them to print additional texts at certain place inside each panel. Thanks

?current.panel.limits
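
A minimal sketch of how this might be used inside a panel function. The data frame d, the conditioning variable g and the label text are made up for illustration.

```r
library(lattice)

# Hypothetical two-panel data set.
d <- data.frame(x = rep(1:10, 2),
                y = c(1:10, (1:10)^2),
                g = rep(c("linear", "quadratic"), each = 10))

p <- xyplot(y ~ x | g, data = d,
            scales = list(y = list(relation = "free")),
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              # Limits of the current panel in native (data) coordinates,
              # returned as a list with components xlim and ylim.
              lim <- current.panel.limits()
              # Place a label just inside the top-left corner.
              panel.text(lim$xlim[1], lim$ylim[2], labels = "top left",
                         adj = c(-0.1, 1.5))
            })

# print(p) draws the panels; the limits are looked up separately in each one.
```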

-Deepayan



Re: [R] R: Why this deosn't work?, matrix, rounding error?

2010-10-09 Thread Gavin Simpson
On Fri, 2010-10-08 at 09:30 -0700, skan wrote:
 It's a problem much bigger.
 I use a matrix to store the results of a bigger problem.
 I loop through several variables and store the results of a computation on
 that matrix.
 At the beginning of the problem I initialize the matrix to zeros and I
 calculate its size from some input.
 
 And that seems not to work well maybe because of some rounding error.

Several people have responded with a solution to your Q on
stackoverflow:

matrix(0, ncota*nslope, 4)

As the 0 will get recycled to appropriate length.

G
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] font question on pdf device

2010-10-09 Thread Kari Ruohonen
On Fri, 2010-10-08 at 14:19 +0100, Ted Harding wrote:
 On 08-Oct-10 12:44:12, Kari Ruohonen wrote:
  Hi,
  I wonder if this is something on my machine locally or R in general.
  
  When I do the following:
  plot(c(0,1),c(0,1),main=expression(paste(symbol("D"),"D",sep="")))
  
  I get a plot with a title having uppercase delta followed by D. But
  in the following
  
  pdf(file="deltaTest.pdf")
  plot(c(0,1),c(0,1),main=expression(paste(symbol("D"),"D",sep="")))
  dev.off()
  
  the uppercase delta looks like O with overstrike slash, i.e. Ø.

 snip

 [1] stats graphics  grDevices utils datasets  methods   base
 
 which is the same as yours (except that I'm using a slightly
 earlier version of R, and on i486 rather than x86_64. Debian
 Etch by the way).
 
 Ted.
 
 
 E-Mail: (Ted Harding) ted.hard...@wlandres.net
 Fax-to-email: +44 (0)870 094 0861
 Date: 08-Oct-10   Time: 14:19:48
 -- XFMail --

Hi, and thanks for the suggestions. Based on these I installed acroread and
found that when viewed with acroread the Delta in the PDF file prints
out OK, but when viewed with evince, the document viewer, I get the
error. So it seems not to be an R issue at all. I am running 64-bit Ubuntu
9.10, for those who are interested in testing this.

Many thanks for all help.

Kari



Re: [R] R: Why this deosn't work?, matrix, rounding error?

2010-10-09 Thread skan

Hello

I've seen the answer at stackoverflow.
They also said I must use zapsmall to avoid roundup problems.
I didn't expect this behaviour when division gives an integer number.


Re: [R] R: Why this deosn't work?, matrix, rounding error?

2010-10-09 Thread Peter Ehlers

On 2010-10-09 4:47, skan wrote:


Hello

I've seen the answer at stackoverflow.
They also said I must use zapsmall to avoid roundup problems.
I didn't expect this behaviour when division gives an integer number.


The trouble is that your expectations may not coincide with reality.
That's why people refer you to FAQ 7.31.

Even replacing the rep(0, ) with just 0 will not necessarily
give the expected result:

 eps1 <- 1e-16
 eps2 <- 1e-15

 ## try to generate a 3-by-4 matrix:

 matrix(0, nrow = 3 - eps1, ncol = 4)
 #     [,1] [,2] [,3] [,4]
 #[1,]    0    0    0    0
 #[2,]    0    0    0    0
 #[3,]    0    0    0    0


 matrix(0, nrow = 3 - eps2, ncol = 4)
 #     [,1] [,2] [,3] [,4]
 #[1,]    0    0    0    0
 #[2,]    0    0    0    0


 matrix(0, nrow = zapsmall(3 - eps2), ncol = 4)
 #     [,1] [,2] [,3] [,4]
 #[1,]    0    0    0    0
 #[2,]    0    0    0    0
 #[3,]    0    0    0    0


 ## Note that your calculation did _not_ yield an integer:
  1 + ((1.5 - 0.1) / 0.05) - 29
 #[1] -3.552714e-15

Such are the vagaries of floating-point arithmetic. Play it
safe; use zapsmall.
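
The same point can be checked on the dimension calculation itself. The sketch below uses round() as an alternative to zapsmall(); the name ncota comes from earlier in the thread, while truncated and n_rows are names invented here.

```r
# 1 + (1.5 - 0.1) / 0.05 is 29 on paper but slightly less in floating point.
ncota <- 1 + (1.5 - 0.1) / 0.05

truncated <- as.integer(ncota)        # drops the fractional part
n_rows    <- as.integer(round(ncota)) # rounds to the nearest whole number first

m <- matrix(0, nrow = n_rows, ncol = 4)
print(c(truncated = truncated, n_rows = n_rows))
print(dim(m))
```

Rounding a computed size before using it as a dimension is a reasonable habit whenever the size comes out of a division.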

  -Peter Ehlers



[R] A competition to create a recommendation engine for R packages

2010-10-09 Thread Tal Galili
Hello everyone.

There is a new competition, outlined on the blog dataists
(http://www.dataists.com/2010/10/using-data-tools-to-find-data-tools-the-yo-dawg-of-data-hacking/),
inviting us to analyse statistics on the use of R packages (collected from
52 R users) in order to create an R-package suggestion engine for ourselves.
Since I noticed that several bloggers have already written about it (as I have detailed
here: http://www.r-statistics.com/2010/10/a-competition-to-recommend-relevant-r-packages-and-the-future-of-r/),
I thought it fitting to notify the members of the R-help
mailing list as well.

Best,
Tal

Contact details:
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)



[R] Hausman test for endogeneity

2010-10-09 Thread Holger Steinmetz

Dear folks,

can anybody point me in the right direction on how to conduct a Hausman test
for endogeneity in simultaneous equation models?

Best,
Holger


Re: [R] Package for converting R datasets into SQL Server (create table and insert statements)?

2010-10-09 Thread johannes rara
Thanks, but I'm not looking for a function to save dataframes into a
RDBMS. I'm looking for a function which creates CREATE TABLE and
INSERT statements from a dataframe.

-J

2010/10/5 Eric Lecoutre ericlecou...@gmail.com:
 Hi,

 You can have a look at RODBC and its function sqlSave.

 HTH,

 Eric


 2010/10/3 johannes rara johannesr...@gmail.com

 Hi,

 R contains many good datasets which would be valuable in other
 platforms as well. My intention is to use R datasets on SQL Server as
 sample tables. Is there a package that would do automatic conversion
 from the dataset schema into a SQL Server CREATE TABLE statement
 (and INSERT INTO statements)?

 For example.

  str(cars)
 'data.frame':   50 obs. of  2 variables:
  $ speed: num  4 4 7 7 8 9 10 10 10 11 ...
  $ dist : num  2 10 4 22 16 10 18 26 34 17 ...
 

 would become

 create table dbo.cars (
              id int identity(1,1) not null,
              speed int not null,
              dist int not null,
              constraint PK_id primary key clustered (id ASC)
              on [PRIMARY]
              )

 insert into dbo.cars
    values (N'4', N'2'),
              (N'4', N'10'),
              (N'7', N'4'),
               etc.

 -J




 --
 Eric Lecoutre
 Consultant - Business & Decision
 Business Intelligence & Customer Intelligence




Re: [R] Package for converting R datasets into SQL Server (create table and insert statements)?

2010-10-09 Thread Gabor Grothendieck
On Sat, Oct 9, 2010 at 9:02 AM, johannes rara johannesr...@gmail.com wrote:
 Thanks, but I'm not looking for a function to save dataframes into a
 RDBMS. I'm looking for a function which creates CREATE TABLE and
 INSERT statements from a dataframe.


If the reason you want that is so you can manipulate R data frames in
SQL then the sqldf package does that.  There are no create statements
to issue and no insert statements to issue (although you can).  The
database is automatically created, the create and insert statements
are automatically generated and executed, your SQL statement is run,
the result is automatically retrieved and the database is
automatically destroyed afterwards.  You just specify a select or
other sql statement with the data frame name(s) replacing the table
name(s).  It works with built-in data frames that ship with R and with
data frames you create yourself.  See http://sqldf.googlecode.com for
more.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Memory management in R

2010-10-09 Thread Lorenzo Isella

Hi David,
I am replying to you and to the other people who provided some insight 
into my problems with grepl.

Well, at least we now know that the bug is reproducible.
The sequence I am postprocessing is indeed a strange one, probably 
pathological to some extent; nevertheless, the fact that grepl 
crashes when a long (but not huge) chunk of repeated data is loaded has 
to be acknowledged.
Now, my problem is the following: given a potentially long string (or 
before that a sequence, where every element has been generated via the 
hash function, algo='crc32' of the digest package), how can I, starting 
from an arbitrary position i along the list, calculate the shortest 
substring in the future of i (i.e. the interval i:end of the series) 
that has not occurred in the past of i (i.e. [1:(i-1)])?
Efficiency is not the main point here; I need to run this code only once 
to get what I need, but it cannot crash on a 2000-entry string.
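
One way to approach the question without regular expressions is to compare vector elements directly. The following is only a sketch under stated assumptions: the function name shortest_new_run is invented here, the sequence is taken to be a plain character vector of hashes, and a run counts as having "occurred in the past" only if it lies entirely within positions 1:(i-1).

```r
# Shortest run starting at position i that does not occur as a contiguous
# run inside x[1:(i-1)]. Returns NULL if every run starting at i, up to the
# end of the series, already occurred in the past.
shortest_new_run <- function(x, i) {
  n <- length(x)
  stopifnot(i >= 1, i <= n)
  past_end <- i - 1
  starts <- seq_len(past_end)   # candidate start positions in the past
  for (len in seq_len(n - i + 1)) {
    # Keep candidates whose window of this length still fits in the past ...
    starts <- starts[starts + len - 1 <= past_end]
    # ... and whose newest element matches the newest element of the run.
    starts <- starts[x[starts + len - 1] == x[i + len - 1]]
    if (length(starts) == 0) {
      return(x[i:(i + len - 1)])
    }
  }
  NULL
}

x <- c("a", "b", "a", "b", "c")
print(shortest_new_run(x, 3))
```

Because the candidate start positions are only ever filtered down, a long clump of repeated values costs a pass over the past for each extra element rather than a pattern compilation.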

Cheers

Lorenzo


On 10/09/2010 01:30 AM, David Winsemius wrote:


What puzzles me is that the list is not really long (less than 2000
entries) and I have not experienced the same problem even with longer
lists.


But maybe your loop terminated earlier in them. Someplace between
11*225 and 11*240 the grepping machine gives up:

  eprs <- paste(rep("aa", 225), collapse="#")
  grepl(eprs, eprs)
[1] TRUE

  eprs <- paste(rep("aa", 240), collapse="#")
  grepl(eprs, eprs)
Error in grepl(eprs, eprs) :
invalid regular expression
'aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#aa#a

In addition: Warning message:
In grepl(eprs, eprs) : regcomp error: 'Out of memory'

The complexity of the problem may depend on the distribution of values.
You have a very skewed distribution with the vast majority being in the
same value as appeared in your error message :

  table(x)
x
 12653a6 202fbcc4 48bef8c3 4e084ddc 51f342a4 5d64d58a 78087f5e abddf3d1
    1419      299        1        1        1        3        1        1
ac76183b b955be36 c600173a e96f6bbd e9c56275
       1       30        5        1        9

And you have 1159 of them in one clump (which would seem to be somewhat
improbable under a random null hypothesis):

  max(rle(x)$lengths)
[1] 1159
  which(rle(x)$lengths == 1159)
[1] 123
  rle(x)$values[123]
[1] "12653a6"

HTH (although I think it means you need to construct a different
implementation strategy);

David.



Many thanks

Lorenzo






Re: [R] Package for converting R datasets into SQL Server (create table and insert statements)?

2010-10-09 Thread Michael Bedward
Package RSQLite has a dbBuildTableDefinition function that creates the CREATE
TABLE statement for a given data.frame. I think other db-related
packages for MySQL and PostgreSQL also have such a function.

Michael




Re: [R] Package for converting R datasets into SQL Server (create table and insert statements)?

2010-10-09 Thread Spencer Graves
   Have you considered the DBI and RODBC packages?  I'm trying to 
do something like this myself right now, and a post of my own (to 
R-SIG-DB) produced recommendations for these two packages.  Both have 
vignettes.



  Hope this helps.
  Spencer







--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567



Re: [R] Package for converting R datasets into SQL Server (create table and insert statements)?

2010-10-09 Thread David Winsemius


On Oct 9, 2010, at 9:02 AM, johannes rara wrote:


Thanks, but I'm not looking for a function to save dataframes into a
RDBMS. I'm looking for a function which creates CREATE TABLE and
INSERT statements from a dataframe.


(My first comment is speculation that Eric was intending that you look  
at the _code_ of sqlSave rather than on its output. My reading of the  
code (at the console rather than the source) suggests that it is  
constructing the code and passing it to the external drivers.)


Looking at the web documentation linked from sqldf()'s help page, it  
appears that at least part of this could also be addressed by example 9  
of the current full documentation:


http://code.google.com/p/sqldf/

BOD is a built-in dataframe:

require(sqldf)
?sqldf

# Portion of example 9:

 sqldf("pragma table_info(BOD)")
  cid   name type notnull dflt_value pk
1   0   Time REAL       0         NA  0
2   1 demand REAL       0         NA  0

 sqldf(c("select * from BOD", "select * from sqlite_master"))
   type name tbl_name rootpage
1 table  BOD      BOD        2
                                                    sql
1 CREATE TABLE `BOD` \n( Time REAL,\n\tdemand REAL \n)

There is integration with a variety of SQL databases, although the act of  
table creation may be limited to SQLite, since the primary advertised  
activity is SELECT statements and it does its access through the  
SQLite driver in memory ... at least as I understand it.


--
David.





David Winsemius, MD
West Hartford, CT



Re: [R] Possible Bug in Effects Package

2010-10-09 Thread Peter Ehlers

On 2010-10-02 11:47, Luciano Selzer wrote:

Dear List,
I find Effects package very useful, but  I believe I have found a bug in
allEffects function. Please consider the following code:

test <- data.frame(tries = round(runif(40, 5, 300)),
 tra = gl(4, 10, labels = c("V", "D", "C", "L")),
 prop = runif(40, 0, 1))

test$success <- round(with(test, tries*prop))
test$prop <- with(test, success/tries)

model <- glm( cbind(success, tries) ~ -1 + tra, data = test, family =
binomial)
allEffects(model)

#Error in eval(expr, envir, enclos) : object 'tra' not found

model2 <- glm( prop ~ -1 + tra, weights = tries, data = test, family =
binomial)
allEffects(model2)
#Works

On a quick search on the internet I've found nothing about this. Is this a
bug?



I think that this is indeed a bug, probably due to the use of
the all.vars() function in effects:::analyze.model().

The obvious workaround is to specify your model as in model2
above or, if you want to use the matrix-response version, then
give the matrix a name and use that in your model:

 respmat <- with(test, cbind(success, tries - success))
 ##[correcting your cbind]
 mod <- glm(respmat ~ )
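
Spelled out with the data from the original post, the workaround might look like the sketch below. The seed and the name mod2 are choices made here, and the effects package itself is not called, so this only shows that the two model specifications agree.

```r
set.seed(42)
test <- data.frame(tries = round(runif(40, 5, 300)),
                   tra   = gl(4, 10, labels = c("V", "D", "C", "L")),
                   prop  = runif(40, 0, 1))
test$success <- round(with(test, tries * prop))
test$prop    <- with(test, success / tries)

# Named two-column response: successes and failures (not successes and tries).
respmat <- with(test, cbind(success, tries - success))
mod <- glm(respmat ~ -1 + tra, data = test, family = binomial)

# The proportion-plus-weights form from the original post, for comparison.
mod2 <- glm(prop ~ -1 + tra, weights = tries, data = test, family = binomial)

print(all.equal(coef(mod), coef(mod2)))
```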

   -Peter Ehlers


Thanks for your time


Luciano

[[alternative HTML version deleted]]





[R] Unsubscribe me from mailing list

2010-10-09 Thread craig
Please unsubscribe me from this list.  Thank you.



Marine Biologist
Elasmobranch Bycatch Reduction Scientist
SharkDefense Technologies, LLC
(845) 702-7087



Re: [R] Memory management in R

2010-10-09 Thread David Winsemius


On Oct 9, 2010, at 9:45 AM, Lorenzo Isella wrote:


Hi David,
I am replying to you and to the other people who provided some  
insight into my problems with grepl.

Well, at least we now know that the bug is reproducible.
The sequence I am postprocessing is indeed a strange one,  
probably pathological to some extent; nevertheless, the fact that  
grepl crashes when a long (but not huge) chunk of repeated  
data is loaded has to be acknowledged.
Now, my problem is the following: given a potentially long string  
(or before that a sequence, where every element has been generated  
via the hash function, algo='crc32' of the digest package), how can  
I, starting from an arbitrary position i along the list, calculate  
the shortest substring in the future of i (i.e. the interval i:end  
of the series) that has not occurred in the past of i (i.e. [1:(i-1)])?


Maybe you should work on a less convoluted explanation of the test? Or  
perhaps a couple of compact examples, preferably in R-copy-paste format?


Efficiency is not the main point here; I need to run this code only  
once to get what I need, but it cannot crash on a 2000-entry string.


My suggestion is to explore other alternatives. (I will admit that I  
don't yet fully understand the test that you are applying.) The two  
that have occurred to me are Biostrings, which I have already mentioned,  
and rle(), which I have illustrated the use of but not referenced as an  
avenue. The Biostrings package is part of Bioconductor (part of the R  
universe), although you should be prepared for a coffee break when you  
install it if you haven't got at least biocLite already installed.  
When I installed it last night, 54 other packages it depends on were also  
downloaded and installed. It seems to me that taking advantage of the  
coding resources in the molecular biology domain that are currently  
directed at decoding the information storage mechanism of life might  
be a smart strategy. You have not described the domain you are working  
in but I would guess that the digest package might be biological in  
primary application? So forgive me if I am preaching to the choir.


The rle option also occurred to me but it might take a smarter coder  
than I to fully implement it. (But maybe Holtman would be up to it.  
He's a _lot_ smarter than I.)  In your example the long x string is  
faithfully represented by two aligned vectors, each 197 elements in  
length. The long repeat sequence that broke the grepl mechanism is  
just one pair of values.

 rle(x)
Run Length Encoding
  lengths: int [1:197] 1 1 2 1 1 4 1 9 1 1 ...
  values : chr [1:197] "5d64d58a" "ac76183b" "202fbcc4" "78087f5e" ...

So maybe as soon as you got to a bundle that was greater than 1/2 the  
overall length (as happened in the x case) you could stop, since it  
could not have occurred before.
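
To make the run-length idea concrete, here is a small sketch on a made-up vector (the values are placeholders, not the hashes from the thread). It finds the longest run, where it starts, and whether it covers more than half of the series.

```r
x <- c("a", "b", "b", "c", "c", "c", "c", "c", "b")

r <- rle(x)                     # lengths and values of consecutive runs
longest <- which.max(r$lengths) # index of the longest run

run_value  <- r$values[longest]
run_length <- r$lengths[longest]
# The run starts right after all the runs that precede it.
run_start  <- sum(r$lengths[seq_len(longest - 1)]) + 1

# A run longer than half the series cannot have a full copy earlier on.
more_than_half <- run_length > length(x) / 2

print(list(value = run_value, length = run_length,
           start = run_start, more_than_half = more_than_half))
```

inverse.rle(r) reconstructs the original vector, so nothing is lost by working on the compressed form.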


--
David.




Re: [R] Unsubscribe me from mailing list

2010-10-09 Thread David Winsemius


On Oct 9, 2010, at 10:45 AM, cr...@sharkdefense.com wrote:


Please unsubscribe me from this list.  Thank you.

You need to do that yourself ... none of us can do that for you. Login  
and unsubscribe through the web page where you subscribed. You can  
also just leave yourself subscribed but turn off mailings or convert  
to once daily digests.


https://stat.ethz.ch/mailman/listinfo/r-help

(At the bottom of the page.)

--

David Winsemius, MD
West Hartford, CT



Re: [R] Package for converting R datasets into SQL Server (create table and insert statements)?

2010-10-09 Thread johannes rara
Thanks Michael! dbBuildTableDefinition is something I was looking for
but it does not seem to support SQL Server table definitions (CREATE
TABLE statements may vary between different RDBMS).

Thanks anyway,
-J
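
Failing a ready-made function, the statements can be assembled with base R string functions. What follows is a rough sketch, not an existing package API: the helper names (sql_type, sql_literal, make_create_table, make_inserts) are invented here, the mapping from R classes to SQL Server types is an assumption, and the identity column and primary key from the example in the original question are left out.

```r
# Assumed mapping from R column classes to SQL Server types.
sql_type <- function(col) {
  if (is.integer(col)) {
    "int"
  } else if (is.numeric(col)) {
    "float"
  } else if (is.logical(col)) {
    "bit"
  } else {
    "nvarchar(255)"
  }
}

# Render one column as SQL literals: numbers bare, text as N'...' with
# embedded single quotes doubled, missing values as NULL.
sql_literal <- function(col) {
  out <- if (is.logical(col)) {
    as.character(as.integer(col))
  } else if (is.numeric(col)) {
    as.character(col)
  } else {
    paste0("N'", gsub("'", "''", as.character(col), fixed = TRUE), "'")
  }
  out[is.na(col)] <- "NULL"
  out
}

# One CREATE TABLE statement for the whole data frame.
make_create_table <- function(df, table) {
  cols <- sprintf("  [%s] %s", names(df), vapply(df, sql_type, character(1)))
  sprintf("create table %s (\n%s\n)", table, paste(cols, collapse = ",\n"))
}

# One INSERT statement per row.
make_inserts <- function(df, table) {
  values  <- do.call(paste, c(unname(lapply(df, sql_literal)), sep = ", "))
  columns <- paste(sprintf("[%s]", names(df)), collapse = ", ")
  sprintf("insert into %s (%s) values (%s);", table, columns, values)
}

create_sql <- make_create_table(cars, "dbo.cars")
insert_sql <- make_inserts(head(cars, 3), "dbo.cars")
cat(create_sql, insert_sql, sep = "\n")
```

The type mapping is the part most likely to need adjusting for a given database; dates and factors with many levels, for instance, are not handled specially here.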



2010/10/9 Michael Bedward michael.bedw...@gmail.com:
 Package RSQLite has a dbBuildTableDefinition that creates the CREATE
 TABLE statement for a given a data.frame. I think other db related
 packages for MySQL and PostgreSQL also have such a function.

 Michael


 On 10 October 2010 00:39, Gabor Grothendieck ggrothendi...@gmail.com wrote:
 On Sat, Oct 9, 2010 at 9:02 AM, johannes rara johannesr...@gmail.com wrote:
 Thanks, but I'm not looking for a function to save dataframes into a
 RDBMS. I'm looking for a function which creates CREATE TABLE and
 INSERT statements from a dataframe.


 If the reason you want that is so you can manipulate R data frames in
 SQL then the sqldf package does that.  There are no create statements
 to issue and no insert statements to issue (although you can).  The
 database is automatically created, the create and insert statements
 are automatically generated and executed, your SQL statement is run,
 the result is automatically retrieved and the database is
 automatically destroyed afterwards.  You just specify a select or
 other sql statement with the data frame name(s) replacing the table
 name(s).  It works with built-in data frames that ship with R and with
 data frames you create yourself.  See http://sqldf.googlecode.com for
 more.

 --
 Statistics  Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com



Re: [R] Possible Bug in Effects Package

2010-10-09 Thread John Fox
Dear Peter and Luciano,

I agree that this is a bug, and I'll try to fix it as soon as I have a
chance -- probably the week after next.

I was rather surprised that effect() works in a model without a constant,
but it does seem to:

> model2 <- glm(prop ~ -1 + tra, weights = tries, data = test, family = binomial)
> allEffects(model2)
 model: prop ~ -1 + tra

 tra effect
tra
        V         D         C         L 
0.4129073 0.4731815 0.5454545 0.4548451 

> model3 <- glm(prop ~ tra, weights = tries, data = test, family = binomial)
> allEffects(model3)
 model: prop ~ tra

 tra effect
tra
        V         D         C         L 
0.4129073 0.4731815 0.5454545 0.4548451 

> 1/(1 + exp(-coef(model2)))
     traV      traD      traC      traL 
0.4129073 0.4731815 0.5454545 0.4548451 

I expect that this is peculiar to the one-way classification, and that
effect() will not work in general for a model without a constant (which will
violate marginality).
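
The agreement between the two fits can be checked without the effects package. The following is a minimal sketch on simulated data (the data and the object names m_noint, m_int and pooled are made up for illustration): in a one-way layout both parameterisations reproduce the pooled success proportion of each group, which is why the inverse logit of the no-intercept coefficients matches the effect display.

```r
set.seed(1)
# Simulated one-way binomial data, loosely modelled on the posted example
test <- data.frame(tries = round(runif(40, 5, 300)),
                   tra = gl(4, 10, labels = c("V", "D", "C", "L")))
test$success <- rbinom(40, test$tries, 0.45)
test$prop <- test$success / test$tries

m_noint <- glm(prop ~ -1 + tra, weights = tries, data = test, family = binomial)
m_int   <- glm(prop ~ tra,      weights = tries, data = test, family = binomial)

# Inverse logit of the no-intercept coefficients, same as 1/(1 + exp(-coef))
p_noint <- plogis(coef(m_noint))

# Fitted probability per group from the model with an intercept
newd <- data.frame(tra = factor(levels(test$tra), levels = levels(test$tra)))
p_int <- predict(m_int, newdata = newd, type = "response")

# Pooled proportion of successes in each group
pooled <- with(test, tapply(success, tra, sum) / tapply(tries, tra, sum))

cbind(p_noint, p_int, pooled)
```

With more than one predictor the two parameterisations stop being interchangeable in this way, which is consistent with the caution about marginality above.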

Thanks for bringing the problem to my attention. I'm afraid that I've been
so busy this fall that I've been unable to monitor the r-help list.

John


John Fox
Senator William McMaster 
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox


 -Original Message-
 From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
 Sent: October-09-10 10:20 AM
 To: Luciano Selzer
 Cc: r-help@r-project.org; John Fox
 Subject: Re: [R] Possible Bug in Effects Package
 
 On 2010-10-02 11:47, Luciano Selzer wrote:
  Dear List,
  I find the effects package very useful, but I believe I have found a bug in
  the allEffects function. Please consider the following code:
 
  test <- data.frame(tries = round(runif(40, 5, 300)),
                     tra = gl(4, 10, labels = c("V", "D", "C", "L")),
                     prop = runif(40, 0, 1))
 
  test$success <- round(with(test, tries*prop))
  test$prop <- with(test, success/tries)
 
  model <- glm(cbind(success, tries) ~ -1 + tra, data = test, family = binomial)
  allEffects(model)
 
  # Error in eval(expr, envir, enclos) : object 'tra' not found
 
  model2 <- glm(prop ~ -1 + tra, weights = tries, data = test, family = binomial)
  allEffects(model2)
  # Works
 
  A quick search on the internet turned up nothing about this. Is this a bug?
 
 
 I think that this is indeed a bug, probably due to the use of
 the all.vars() function in effects:::analyze.model().
 
 The obvious workaround is to specify your model as in model2
 above or, if you want to use the matrix-response version, then
 give the matrix a name and use that in your model:
 
   respmat <- with(test, cbind(success, tries - success))
   ## [correcting your cbind]
   mod <- glm(respmat ~ )
 
 -Peter Ehlers
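
The bracketed correction matters because a two-column binomial response in glm() is read as (successes, failures), not (successes, trials). A small sketch with made-up counts (all object names here are hypothetical) shows the difference:

```r
# Made-up counts, for illustration only
d <- data.frame(success = c(30, 45), tries = c(100, 100))

# Two-column response: glm() expects cbind(successes, failures)
resp_ok  <- with(d, cbind(success, tries - success))
resp_bad <- with(d, cbind(success, tries))  # second column is taken as failures

fit_ok  <- glm(resp_ok  ~ 1, family = binomial)
fit_bad <- glm(resp_bad ~ 1, family = binomial)

plogis(coef(fit_ok))   # pooled proportion 75 / 200
plogis(coef(fit_bad))  # 75 / (75 + 200), biased downwards
```

Giving the response matrix its own name, as above, is also what sidesteps the allEffects() error discussed in this thread.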
 
  Thanks for your time
 
 
  Luciano
 


[R] StrSplit

2010-10-09 Thread Santosh Srinivas
Newbie question ... 

I am looking for something equivalent to read.delim that accepts a text line
as a parameter instead of a file input.

Below is my problem: I'm unable to get the exact output, which is a simple data
frame of the data where the delimiter exists ... coming quite close though.

I have a data frame with 10 lines called MF_Data
> MF_Data[1:10]
 [1] Scheme Code;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date
 [2]
 [3] Open Ended Schemes ( Liquid )
 [4]
 [5]
 [6] AIG Global Investment Group Mutual Fund
 [7] 106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.;1001.;1001.;02-Oct-2010
 [8] 106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010
 [9] 106507;AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option;1001.8765;1001.8765;1001.8765;02-Oct-2010
[10] 106503;AIG India Liquid Fund-Retail Plan-DailyDividend Option;1001.;1001.;1001.;02-Oct-2010


Now for the lines below ... they are delimited by ";" ... I am using

> tempTxt <- MF_Data[7]
> MF_Data_F <- unlist(strsplit(tempTxt, ";", fixed = TRUE))
> tempTxt <- MF_Data[8]
> MF_Data_F1 <- unlist(strsplit(tempTxt, ";", fixed = TRUE))
> MF_Data_F <- rbind(MF_Data_F, MF_Data_F1)
 
But MF_Data_F is not a simple 2X6 data frame which is what I want



Re: [R] Hausman test for endogeneity

2010-10-09 Thread Giuseppe Marinelli
On Saturday 09 October 2010 14:37:35 Holger Steinmetz wrote:
 Dear folks,

 can anybody point me in the right direction on how to conduct a Hausman
 test for endogeneity in simultaneous equation models?

 Best,
 Holger

hausman.systemfit [1] should be what you are looking for.
Cheers

Giuseppe

[1] http://cran.r-project.org/web/packages/systemfit/systemfit.pdf



Re: [R] StrSplit

2010-10-09 Thread jim holtman
Is this what you are after:

> x <- c("Scheme Code;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date"
+ , ""
+ , "Open Ended Schemes ( Liquid )"
+ , ""
+ , ""
+ , "AIG Global Investment Group Mutual Fund"
+ , "106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.;1001.;1001.;02-Oct-2010"
+ , "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010"
+ , "106507;AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option;1001.8765;1001.8765;1001.8765;02-Oct-2010"
+ , "106503;AIG India Liquid Fund-Retail Plan-DailyDividend Option;1001.;1001.;1001.;02-Oct-2010")
> myData <- read.table(textConnection(x[7:10]), sep=';')
> closeAllConnections()
> str(myData)
'data.frame':   4 obs. of  6 variables:
 $ V1: int  106506 106511 106507 106503
 $ V2: Factor w/ 4 levels "AIG India Liquid Fund-Institutional Plan-Daily Dividend Option",..: 1 2 3 4
 $ V3: num  1001 1210 1002 1001
 $ V4: num  1001 1210 1002 1001
 $ V5: num  1001 1210 1002 1001
 $ V6: Factor w/ 1 level "02-Oct-2010": 1 1 1 1
> myData
      V1                                                              V2       V3       V4       V5          V6
1 106506  AIG India Liquid Fund-Institutional Plan-Daily Dividend Option 1001.000 1001.000 1001.000 02-Oct-2010
2 106511          AIG India Liquid Fund-Institutional Plan-Growth Option 1210.461 1210.461 1210.461 02-Oct-2010
3 106507 AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option 1001.876 1001.876 1001.876 02-Oct-2010
4 106503          AIG India Liquid Fund-Retail Plan-DailyDividend Option 1001.000 1001.000 1001.000 02-Oct-2010
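
In later versions of R (2.14.0 onwards, as far as I recall) read.table() also accepts the lines directly through its text argument, which avoids opening and closing a connection by hand. A minimal sketch: the two data lines are copied from the post, while the column names are my own assumption based on its header line.

```r
lines <- c(
  "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010",
  "106507;AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option;1001.8765;1001.8765;1001.8765;02-Oct-2010"
)

# text= reads from the character vector as if it were a file;
# the column names below are assumed, not part of the data lines
funds <- read.table(text = lines, sep = ";", stringsAsFactors = FALSE,
                    col.names = c("code", "name", "nav",
                                  "repurchase", "sale", "date"))
str(funds)
```

Setting stringsAsFactors = FALSE keeps the scheme name and date as character; the date column could then be converted with as.Date(), bearing in mind that month abbreviations such as "Oct" are parsed according to the current locale.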








-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Counting unique items in a list of matrices

2010-10-09 Thread Peter Ehlers

On 2010-10-07 10:10, Jim Silverton wrote:

Hello,
I have a list of 2 x 2 matrices called matlist. I have about 5000 2 x 2
matrices. I would like to count how many of each unique 2 x 2 matrix I have.
So I am thinking that I need a list of the unique 2 x 2 matrices and their
counts. Can anyone help?



Here's one way, using the plyr package:

 require(plyr)
 ## make a list of 2x2 matrices
 L <- vector('list', 5000)
 set.seed(4321)
 for(i in 1:5000) L[[i]] <- matrix(round(runif(4), 1), 2, 2)

 ## convert each matrix to a string of 4 numbers, then
 ## form a data frame
 dL <- ldply(L, function(.x) toString(unlist(.x)))

 ## add an index vector
 dL$ind <- seq_len(5000)

 ## count unique strings; return string, frequency, indices
 result <- ddply(dL, .(V1), summarize,
                 freq = length(V1),
                 idx = toString(ind))


   -Peter Ehlers



Re: [R] StrSplit

2010-10-09 Thread Jeffrey Spies
Jim's solution is the ideal way to read in the data: using the sep=";"
argument in read.table.

However, if you do for some reason have a vector of strings like the
following (maybe someone gives you an Rdata file instead of the raw
data file):

MF_Data <- c("106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.;1001.;1001.;02-Oct-2010",
             "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010")

Then you can use this to get a data frame:

as.data.frame(do.call(rbind, lapply(MF_Data, function(x)
  unlist(strsplit(x, ';')))))

Cheers,

Jeff.
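
One caveat with the strsplit() route: every column comes back as character (or factor), so numeric columns need converting afterwards. A hedged sketch of one way to do that, applying type.convert() to each column; parts, mat and df are names made up for this example, and the two input lines are taken from the original post.

```r
MF_Data <- c(
  "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010",
  "106507;AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option;1001.8765;1001.8765;1001.8765;02-Oct-2010"
)

parts <- strsplit(MF_Data, ";", fixed = TRUE)  # list of character vectors

# rbind only gives a clean matrix if every line has the same number of fields
stopifnot(length(unique(lengths(parts))) == 1)

mat <- do.call(rbind, parts)                   # character matrix, one row per line
df <- as.data.frame(mat, stringsAsFactors = FALSE)

# convert each column to the most specific type that fits its values
df[] <- lapply(df, type.convert, as.is = TRUE)
str(df)
```

This is also why the rbind() result in the original question was not yet a data frame: binding character vectors produces a character matrix.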



Re: [R] StrSplit

2010-10-09 Thread David Winsemius


On Oct 9, 2010, at 12:46 PM, Jeffrey Spies wrote:


Jim's solution is the ideal way to read in the data: using the sep=";"
argument in read.table.

However, if you do for some reason have a vector of strings like the
following (maybe someone gives you an Rdata file instead of the raw
data file):

MF_Data <- c("106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.;1001.;1001.;02-Oct-2010",
             "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010")

Then you can use this to get a data frame:

as.data.frame(do.call(rbind, lapply(MF_Data, function(x)
  unlist(strsplit(x, ';')))))



If you are suggesting that Jim's solution would not work here, then I  
would disagree and suggest you try offering your vector (without the  
cr's inserted by our mail clients) to his code. It should work just  
fine and be far more readable.


On the other hand if you were offering this with an explanation that  
strsplit's split argument is more flexible than the sep argument in  
the read functions because it accepts regular expressions and so can  
handle situations where multiple separators exist in the same line,  
then I would applaud you.


--
David.
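
The point about regular expressions can be illustrated with lines that mix separators. The input below is invented for illustration and does not come from the original data:

```r
# Invented lines in which ";", "|" and "," all act as field separators
mixed <- c("106506;AIG Fund|1001.00,02-Oct-2010",
           "106511,AIG Growth;1210.46|02-Oct-2010")

# A character class matches any one of the three separators;
# inside [ ] the "|" is a literal character, not alternation
fields <- strsplit(mixed, "[;|,]")
do.call(rbind, fields)
```

The sep argument of read.table() takes a single character, so a case like this would need either strsplit() as above or a preliminary gsub() to normalise the separators.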






Re: [R] problem with colors

2010-10-09 Thread ANJAN PURKAYASTHA
Hi Phil and Thomas,
Thanks for your helpful feedback.  I must admit my solution to creating the
vector of colors lacked your elegance.
In brief, I saved the output of colors() into a text file, saved all but 47
colours in that file and read it back as a data frame and used the first
column of the dataframe as a vector of 47 colours. This roundabout method
may have caused the  problem because when I chose colours according to the
commands sent by both of you things seemed to work just fine.
Thank you very much for your feedback.
Anjan
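
One plausible explanation for the repeating colours (an assumption, since the original code was not posted): with the defaults of that time, reading a text file into a data frame turned the colour names into a factor, and a factor handed to col= would typically be treated as integer codes indexing the current palette(), which is short and gets recycled. A small sketch, with stringsAsFactors = TRUE set explicitly to mimic the old default:

```r
# Mimic reading colour names back from a text file (old default behaviour)
col_df <- read.table(text = c("red", "cyan", "green"), stringsAsFactors = TRUE)
col1 <- col_df[[1]]

is.factor(col1)    # the column is a factor, not a character vector
as.integer(col1)   # integer codes follow the sorted level order

# Converting back to character restores the colour names
col_ok <- as.character(col1)
col_ok
```

Passing col_ok, or simply colors()[1:47] as suggested in the replies, to boxplot(dataset, col = ...) should then give one distinct colour per box.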

On Thu, Oct 7, 2010 at 3:25 PM, Thomas Stewart tgstew...@gmail.com wrote:

 It would be helpful if you provided a more complete, reproducible example.
 Consider the following code.  It colors the boxes according to the first 47
 colors listed in the colors() vector.

 -tgs

 data <- as.data.frame(matrix(rnorm(47*23), ncol=47))
 boxplot(data, col=colors()[1:47])




 On Thu, Oct 7, 2010 at 2:22 PM, ANJAN PURKAYASTHA 
 anjan.purkayas...@gmail.com wrote:

 Hi,
 I have a data set of 47 columns. I would like to create a boxplot for each
 column, each boxplot of a different colour.
 So I created a vector col1. This vector has a subset of the colors
 returned by colors(): red, cyan, green etc.
 Now I use the command boxplot(dataset, col = col1), expecting to see 47
 boxplots, each of a different colour.
 Here is the problem: the boxplots are drawn correctly, but it seems that only
 the first few colours in col1 are being used in a repeated pattern.
 Does anybody have any ideas on how to tackle this?
 Thanks in advance,
 Anjan






-- 
===
anjan purkayastha, phd.
research associate
fas center for systems biology,
harvard university
52 oxford street
cambridge ma 02138
phone-703.740.6939
===



Re: [R] Counting unique items in a list of matrices

2010-10-09 Thread Jeffrey Spies
If you just want a list of matrices and their counts, you can use
Peter's list of matrices, L, and then:

With plyr:

require(plyr)
count(unlist(lapply(L, toString)))

Without plyr:

as.data.frame(table(unlist(lapply(L, toString))))

Cheers,

Jeff.
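
If the unique matrices themselves are wanted alongside the counts, the same string keys can be used to pick one representative of each. A minimal base-R sketch, assuming all matrices have the same dimensions (toString() alone does not distinguish a 2 x 2 matrix from a 1 x 4 one); keys, uniq and freq are names made up here, and the list is smaller than in the thread to keep the example quick:

```r
set.seed(4321)
L <- lapply(1:200, function(i) matrix(round(runif(4), 1), 2, 2))

keys <- vapply(L, toString, character(1))  # one string per matrix
counts <- table(keys)                      # frequency of each distinct string

first <- !duplicated(keys)
uniq <- L[first]                           # one copy of each distinct matrix
names(uniq) <- keys[first]

# frequencies in the same order as the list of unique matrices
freq <- as.vector(counts[names(uniq)])

uniq[[1]]
freq[1]
```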



Re: [R] Hausman test for endogeneity

2010-10-09 Thread Liviu Andronic
Hello

On Sat, Oct 9, 2010 at 2:37 PM, Holger Steinmetz
holger.steinm...@web.de wrote:
 can anybody point me in the right direction on how to conduct a Hausman test
 for endogeneity in simultaneous equation models?

Try
install.packages('sos')
require(sos)
findFn('hausman')

Here I get these results:
 findFn('hausman')
found 22 matches;  retrieving 2 pages
2

Liviu



Re: [R] own TAB expansion

2010-10-09 Thread Sebastian Gibb
On Saturday, 9 October 2010, 08:39:36, Deepayan Sarkar wrote:
 On Fri, Oct 8, 2010 at 6:19 AM, Sebastian Gibb li...@sebastiangibb.de wrote:
  Hello Duncan,
  
  thanks for your advice, but it doesn't work as expected:
  
  setClass(Class="A", representation=representation(slotA="numeric",
  slotB="numeric"));
  setMethod("$", "A", function(x, name) {return(slot(x, name));})
  setGeneric(".DollarNames")
  setMethod(".DollarNames", signature(x="A"), function(x,
  pattern)grep(pattern=pattern, x=c("slotA", "slotB"), value=T))
  
  a <- new("A", slotA=1, slotB=2)
  a$sl <TAB>
  # doesn't print slotA/slotB
  
  a$
  
  What am I doing wrong?
 
 There is a namespace issue with making .DollarNames() generic;
 basically, the completion code in the utils namespace never sees the
 new S4 generic. See a previous discussion at
 
 http://www.mail-archive.com/r-de...@r-project.org/msg20553.html
 
 Defining an S3 method should work (without the need for a dummy S3
 class even with inheritance if you are working with R 2.12):
 
 .DollarNames.A <-
 function(x, pattern) {
     grep(pattern=pattern, x=c("slotA", "slotB"), value=T)
 }
 
 -Deepayan

Hello Deepayan,

thanks for the link. This solution works for R 2.12.

Bye

Sebastian
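
For the archive, the working approach can be put together as one self-contained sketch. It follows the S3 method suggested above; using slotNames() instead of a hard-coded vector of names is my own variation, not something from the thread.

```r
library(methods)
library(utils)

setClass("A", representation(slotA = "numeric", slotB = "numeric"))
setMethod("$", "A", function(x, name) slot(x, name))

# Plain S3 method: the completion code in utils calls .DollarNames(),
# which dispatches on the class of x and finds this function
.DollarNames.A <- function(x, pattern = "") {
  grep(pattern, slotNames(x), value = TRUE)
}

a <- new("A", slotA = 1, slotB = 2)

# What TAB completion would offer after typing a$sl
.DollarNames(a, "^sl")
a$slotA
```

Calling .DollarNames() directly, as in the last lines, is a convenient way to check the method without relying on an interactive console.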



Re: [R] StrSplit

2010-10-09 Thread Jeffrey Spies
Obviously Jim's solution does work, and I did not intend to imply it
didn't.  In fact, his read.table solution would work both if the OP
had a semi-colon delimited file to begin with (which I was trying to
say was ideal from a workflow standpoint) or a vector of strings (for
use when paired with textConnections).  Using strsplit is merely
another solution for the latter situation.  I thought the OP might
appreciate seeing how to use the function that they indicated they
were having problems with.  Plus, I have a penchant for R-ishly
unreadable code. ;)

Thanks for clarifying,

Jeff.


[R] GPS data!

2010-10-09 Thread Mehdi Zarrei
Hello R-experts,

I have some coordinates that look like this:

lat       long
32 31.85  59 48.74
34 05.7   58 50.79
34 05.7   58 50.79
34 05.7   58 50.79


This was my GPS setting at the time of the field trip. I assume that the second
column is minutes + seconds. Am I right? I am looking for a function to
convert them to decimal degrees.

Appreciate it if I get any help.

All the best,

Mehdi





  


[R] Plot time range with rect or boxplot

2010-10-09 Thread Eric Hu
Hi,
I am trying to use rect (R 2.11) to plot a set of data as follows:

> data
  Company    Pt     Pri     Pub
1       A WO520  8/5/09 2/11/10
2       B WO893 7/30/03 2/24/05
3       A WO258 12/8/08 6/17/10
4       C WO248 1/13/09  9/2/10


pri <- strptime(pri, "%m/%d/%y")
pub <- strptime(pub, "%m/%d/%y")

plot.new()
plot.window(xlim=c(min(pri,pub),max(pri,pub)),ylim=c(0,length(company)-1))
%y <- seq(0,0.5*(length(company)-1),0.5)
%h <- 0.1
%rect(pri, y-h, pub, y+h, col=c("light blue","pink","yellow","red"))

Neither xlim nor rect/boxplot recognizes pri/pub in date format. I wonder if
there is a good way to deal with date plotting so that the x-axis reflects the
actual time range.
 
Thank you,
Eric


Re: [R] Memory management in R

2010-10-09 Thread Lorenzo Isella



My suggestion is to explore other alternatives. (I will admit that I
don't yet fully understand the test that you are applying.)


Hi,
I am trying to partially implement the Lempel Ziv compression algorithm.
The point is that compressibility and entropy of a time series are 
related, hence my final goal is to evaluate the entropy of a time series.

You can find more at

http://bit.ly/93zX4T
http://en.wikipedia.org/wiki/LZ77_and_LZ78
http://bit.ly/9NgIFt




The two that have occurred to me are Biostrings, which I have already
mentioned, and rle(), which I have illustrated the use of but not
referenced as an avenue. The Biostrings package is part of Bioconductor
(part of the R universe), although you should be prepared for a coffee
break when you install it if you haven't got at least biocLite already
installed. When I installed it last night, 54 other packages that it
depends on were also downloaded and installed. It seems to me that taking
advantage of the coding resources in the molecular biology domain that are
currently directed at decoding the information storage mechanism of life
might be a smart strategy. You have not described the domain you are
working in, but I would guess that the digest package might be biological
in primary application? So forgive me if I am preaching to the choir.

The rle option also occurred to me but it might take a smarter coder
than I to fully implement it. (But maybe Holtman would be up to it. He's
a _lot_ smarter than I.) In your example the long x string is
faithfully represented by two aligned vectors, each 197 elements in
length. The long repeat sequence that broke the grepl mechanism is just
one pair of values.
  rle(x)
Run Length Encoding
lengths: int [1:197] 1 1 2 1 1 4 1 9 1 1 ...
values : chr [1:197] 5d64d58a ac76183b 202fbcc4 78087f5e ...

So maybe as soon as you got to a bundle that was greater than 1/2 the
overall length (as happened in the x case) you could stop, since it
could not have occurred before.



I doubt that rle() can be deployed to replace the Lempel-Ziv (LZ) algorithm
in a trivial way. As a less convoluted example, consider the series


x <- c("d","a","b","d","a","b","e","z")

If i=4, so that the i-th element is the second 'd' in the series,
the shortest sequence starting from i=4 that I do not see in the past of
'd' is


"d","a","b","e", whose length is equal to 4, and that is the value
returned by the function below.
The frustrating thing is that I already have the tools I need, but they
crash for reasons beyond my control on relatively short series.
If anyone can make the function below more robust, that would be a big
help for me.

Cheers

Lorenzo

###
entropy_lz <- function(x, i) {
    # everything before position i (empty when i == 1)
    past <- x[seq_len(i - 1)]
    n <- length(x)
    lp <- length(past)
    future <- x[i:n]
    go_on <- 1
    count_len <- 0
    past_string <- paste(past, collapse="#")

    # grow the candidate sequence until it is no longer found in the past
    while (go_on > 0) {
        new_seq <- x[i:(i+count_len)]
        fut_string <- paste(new_seq, collapse="#")
        count_len <- count_len+1
        if (grepl(fut_string, past_string) != 1) {
            go_on <- -1
        }
    }
    return(count_len)
}

x <- c("c","a","b","c","a","b","e","z")

S <- entropy_lz(x,4)
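
One way to sidestep the regular-expression engine altogether might be to compare the vectors element by element instead of pasting them into long strings. The sketch below is not the function above: `occurs_in` and `match_len` are hypothetical helper names, and it assumes the goal is the same as in the example, that is, the length of the shortest sequence starting at position i that does not occur in x[1:(i-1)].

```r
# Does the vector `pattern` occur as a contiguous run inside `past`?
occurs_in <- function(pattern, past) {
  k <- length(pattern)
  if (k > length(past)) return(FALSE)
  starts <- seq_len(length(past) - k + 1)
  any(vapply(starts,
             function(s) all(past[s:(s + k - 1)] == pattern),
             logical(1)))
}

# Length of the shortest sequence starting at position i that does not
# occur in x[1:(i-1)]; the loop stops at the end of x instead of
# indexing past it.
match_len <- function(x, i) {
  past <- x[seq_len(i - 1)]
  n <- length(x)
  len <- 1
  while (i + len - 1 <= n && occurs_in(x[i:(i + len - 1)], past)) {
    len <- len + 1
  }
  len
}

x <- c("c", "a", "b", "c", "a", "b", "e", "z")
match_len(x, 4)   # 4, the same value as in the example above
```

Because nothing is ever compiled into a pattern, the length of the series should not matter to the matching step, although the plain element-wise search is quadratic and may be slow on very long input.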



[R] Loss of precision in read.csv.

2010-10-09 Thread steven mosher
Given a csv file from this location

Airports <- "http://www.ourairports.com/data/airports.csv"

download.file(Airports, basename(Airports))


airports <- read.csv("airports.csv", encoding="UTF-8")

> airports[1,]

id ident type  name latitude_deg longitude_deg
elevation_ft continent iso_country iso_region municipality scheduled_service

1 6523   00A heliport Total Rf Heliport  *40.0708  -74.9336 *
  11  NA  US  US-PA Bensalemno

  gps_code iata_code local_code home_link wikipedia_link keywords

1  00A  00A


And the precision is lost, which we can show by using readLines:


fred <- readLines("airports.csv")

> fred[2]
[1] "6523,\"00A\",\"heliport\",\"Total Rf Heliport\",*40.07080078125,-74.9336013793945*,11,\"NA\",\"US\",\"US-PA\",\"Bensalem\",\"no\",\"00A\",,\"00A\",,,"


I tried various approaches: using colClasses, switching to read.table,
specifying dec="."


I tested read.csv and it does preserve precision on my test case, but not on
this data.


Ideas?



[R] same random numbers in different sessions

2010-10-09 Thread Liviu Andronic
Dear all
I'm using Xubuntu Lucid and I keep getting the same random numbers
whenever I start a new session of R. For example, I keep getting
 sample(1:1000, 1)
[1] 87

or
 rnorm(1:10)
 [1] -1.3618103  0.4241701  1.0720076  0.2208145 -0.5375314 -0.4846588
 [7]  0.7576768  0.6527407 -0.6868786  0.8718527

I expected that some set.seed() instruction would be present in a
config file in
/usr/lib/R/etc/

but after grepping the only reference came out in Rprofile.site and it
was commented out:
# set.seed(1234)

What else could be causing this? Regards
Liviu

 sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] fortunes_1.4-0 sos_1.3-0  brew_1.0-3 IPSUR_1.1


-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail



Re: [R] Loss of precision in read.csv.

2010-10-09 Thread Joshua Wiley
Hi Steven,

As near as I can tell, no precision is lost.  R is just being
courteous and not excessively filling our consoles.  Try:

print(airports[1, "latitude_deg"], digits = 22)

which is the most digits R will print (a double only carries about 15 to
17 significant decimal digits, so the trailing ones are not meaningful).

Alternately, you can convert it to character class:

as.character(airports[1, ])

So in short, this is just a cosmetic feature of presenting the data,
not its actual storage.
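
To illustrate the difference between what is stored and what is displayed, here is a small sketch that uses the latitude value from the file as a stand-in for the data frame column; `lat` and `old` are names made up for the example.

```r
# The value as it appears in the CSV file quoted in this thread
lat <- as.numeric("40.07080078125")

print(lat)                 # default display: 7 significant digits
print(lat, digits = 15)    # same stored number, more digits shown
format(lat, digits = 15)   # character version, e.g. for writing out

# The display precision can also be raised for the whole session
old <- options(digits = 15)
print(lat)
options(old)               # restore the previous setting
```

Raising `digits` only changes how numbers are shown; the stored value is the same in every call above.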

Cheers,

Josh

On Sat, Oct 9, 2010 at 1:33 PM, steven mosher mosherste...@gmail.com wrote:
 Given a csv file from this location

 Airports-http://www.ourairports.com/data/airports.csv;

 download.file(Airports,basename(Airports))


 airports -read.csv(airports.csv,encoding=UTF-8)

 airports[1,]

    id ident     type              name latitude_deg longitude_deg
 elevation_ft continent iso_country iso_region municipality scheduled_service

 1 6523   00A heliport Total Rf Heliport      *40.0708      -74.9336 *
  11      NA          US      US-PA     Bensalem                no

  gps_code iata_code local_code home_link wikipedia_link keywords

 1      00A                  00A


 And the precision is lost which we can show by using readLines:


 fred-readLines(airports.csv)

 fred[2]
 [1] 6523,\00A\,\heliport\,\Total Rf Heliport\,*
 40.07080078125,-74.9336013793945*
 ,11,\NA\,\US\,\US-PA\,\Bensalem\,\no\,\00A\,,\00A\,,,


 I tried various approaches, using colClasses, switching to read.tables,
 specifying dec=.


 I tested read.csv and it does preserve precision on my test case, but not on
 this data.


 Ideas?





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] same random numbers in different sessions

2010-10-09 Thread Daniel Nordlund
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Liviu Andronic
 Sent: Saturday, October 09, 2010 2:15 PM
 To: r-help@r-project.org Help
 Subject: [R] same random numbers in different sessions
 
 Dear all
 I'm using Xubuntu Lucid and I keep getting the same random numbers
 whenever I start a new session of R. For example, I keep getting
  sample(1:1000, 1)
 [1] 87
 
 or
  rnorm(1:10)
  [1] -1.3618103  0.4241701  1.0720076  0.2208145 -0.5375314 -0.4846588
  [7]  0.7576768  0.6527407 -0.6868786  0.8718527
 
 I expected that some set.seed() instruction would be present in a
 config file in
 /usr/lib/R/etc/
 
 but after grepping the only reference came out in Rprofile.site and it
 was commented out:
 # set.seed(1234)
 
 What else could be causing this? Regards
 Liviu
 

Could you be reloading a workspace at start-up that is setting the seed?  What 
happens if you start R using the --vanilla option?
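
A quick way to test the workspace idea from inside R: the generator state lives in an object called `.Random.seed` in the global environment, and it is saved and restored along with everything else in a `.RData` file. The helper names below (`has_saved_seed`, `drop_saved_seed`) are invented for this sketch.

```r
# TRUE if an RNG state is sitting in the global environment, for example
# because it was restored from a .RData file at start-up.
has_saved_seed <- function() {
  exists(".Random.seed", envir = globalenv(), inherits = FALSE)
}

# Drop any restored state so that R seeds itself afresh on the next
# call to a random number function.
drop_saved_seed <- function() {
  if (has_saved_seed()) {
    rm(list = ".Random.seed", envir = globalenv())
  }
  invisible(NULL)
}

drop_saved_seed()
has_saved_seed()    # FALSE until random numbers are drawn again
sample(1:1000, 1)   # creates a fresh .Random.seed
has_saved_seed()    # TRUE
```

If `has_saved_seed()` is already TRUE in a brand-new session, before any random numbers have been drawn, a restored workspace is a likely suspect.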

Dan

Daniel Nordlund
Bothell, WA USA



Re: [R] same random numbers in different sessions

2010-10-09 Thread G. Jay Kerns
Dear Liviu,

On Sat, Oct 9, 2010 at 5:14 PM, Liviu Andronic landronim...@gmail.com wrote:
 Dear all
 I'm using Xubuntu Lucid and I keep getting the same random numbers
 whenever I start a new session of R. For example, I keep getting
 sample(1:1000, 1)
 [1] 87

 or
 rnorm(1:10)
  [1] -1.3618103  0.4241701  1.0720076  0.2208145 -0.5375314 -0.4846588
  [7]  0.7576768  0.6527407 -0.6868786  0.8718527

 I expected that some set.seed() instruction would be present in a
 config file in
 /usr/lib/R/etc/

 but after grepping the only reference came out in Rprofile.site and it
 was commented out:
 # set.seed(1234)

 What else could be causing this? Regards
 Liviu

 sessionInfo()
 R version 2.11.1 (2010-05-31)
 x86_64-pc-linux-gnu

 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] fortunes_1.4-0 sos_1.3-0      brew_1.0-3     IPSUR_1.1




I notice that you have the IPSUR package loaded;  you know, just a
shot in the dark here, but did you try not loading it?

I ask because the vignette is built by making a special choice for
set.seed, and the workspace that ships with the package might be
interacting in an unexpected way.

Please let me know if IPSUR is the culprit.

Regards,
Jay



Re: [R] GPS data!

2010-10-09 Thread jim holtman
No need for a function; you can just write the expression yourself:

> x <- read.table(textConnection("  32 31.85
+ 59 48.74
+  34 05.7
+ 58 50.79
+  34 05.7
+ 58 50.79
+  34 05.7
+ 58 50.79"))
> closeAllConnections()
> x
  V1    V2
1 32 31.85
2 59 48.74
3 34  5.70
4 58 50.79
5 34  5.70
6 58 50.79
7 34  5.70
8 58 50.79
> # convert
> x$V1 + x$V2 / 60
[1] 32.53083 59.81233 34.09500 58.84650 34.09500 58.84650 34.09500 58.84650
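
If a reusable function is still wanted, the same arithmetic can be wrapped up as below. This is only a sketch: `dm_to_decimal` is a made-up name, the optional `sec` argument and the sign handling for southern or western coordinates are assumptions that go beyond the data shown, and the second column is taken to be decimal minutes.

```r
# Convert degrees plus decimal minutes (and optionally seconds) to
# decimal degrees. Negative degrees are treated as south/west, so the
# minutes and seconds are subtracted in that case.
dm_to_decimal <- function(deg, min, sec = 0) {
  sgn <- ifelse(deg < 0, -1, 1)
  sgn * (abs(deg) + min / 60 + sec / 3600)
}

lat <- dm_to_decimal(c(32, 34), c(31.85, 5.7))
lon <- dm_to_decimal(c(59, 58), c(48.74, 50.79))
round(lat, 5)   # 32.53083 34.09500
round(lon, 5)   # 59.81233 58.84650
```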



On Sat, Oct 9, 2010 at 2:49 PM, Mehdi Zarrei gagzar...@yahoo.com wrote:
 Hello R-experts,

 I have some coordinates that look like this:
 lat   long

















                        32 31.85
                        59 48.74


                         34 05.7
                        58 50.79


                         34 05.7
                        58 50.79


                         34 05.7
                        58 50.79




 This was my GPS setting at the time of the field trip. I assume that the second
 column is minutes + seconds. Am I right? I am looking for a function to
 convert them to decimal degrees.

 Appreciate it if I get any help.

 All the best,

 Mehdi











-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] GPS data!

2010-10-09 Thread Spencer Graves

   Have you tried sos:


install.packages('sos') # if not already installed
library(sos)
(gps <- ???GPS)


  This found 63 matches for me right now.  The results open as a 
table in a web browser with the package with the most matches first and 
with hot links to the help page for each match in the right hand column.



  Hope this helps.
  Spencer


On 10/9/2010 2:54 PM, jim holtman wrote:

No need for a function; you can just write the expression yourself:


x- read.table(textConnection(  32 31.85

+59 48.74
+ 34 05.7
+58 50.79
+ 34 05.7
+58 50.79
+ 34 05.7
+58 50.79))

closeAllConnections()
x

   V1V2
1 32 31.85
2 59 48.74
3 34  5.70
4 58 50.79
5 34  5.70
6 58 50.79
7 34  5.70
8 58 50.79

# convert
x$V1 + x$V2 / 60

[1] 32.53083 59.81233 34.09500 58.84650 34.09500 58.84650 34.09500 58.84650

On Sat, Oct 9, 2010 at 2:49 PM, Mehdi Zarreigagzar...@yahoo.com  wrote:

Hello R-experts,

I have some coordinates that look like this:
lat   long

















32 31.85
59 48.74


 34 05.7
58 50.79


 34 05.7
58 50.79


 34 05.7
58 50.79




This was my GPS setting by the time of filed trip. I assume that the second column is 
minute + seconds. Am i right? I am looking for a function to convert them to 
decimal degree.

Appreciate it if I get any help.

All the best,

Mehdi









--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567



Re: [R] same random numbers in different sessions

2010-10-09 Thread jim holtman
You need to call set.seed yourself.  There are some simulations
where I do want the same numbers generated, and I can use set.seed to
set the seed to a known value.  If you want something different each
time, then use the time of day in the call to set.seed.
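
A minimal sketch of both uses, assuming whole seconds since 1970 are a fine enough clock for the purpose; `seed`, `a` and `b` are just names for the example.

```r
# Seed from the clock: differs from one session to the next.
# as.integer(Sys.time()) is the number of whole seconds since 1970-01-01.
seed <- as.integer(Sys.time())
set.seed(seed)
sample(1:1000, 1)

# Keeping the seed lets a particular run be repeated later.
set.seed(seed)
a <- rnorm(3)
set.seed(seed)
b <- rnorm(3)
identical(a, b)   # TRUE
```

Printing or saving `seed` alongside the results keeps a run reproducible even though the seed itself changes every time.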

On Sat, Oct 9, 2010 at 5:14 PM, Liviu Andronic landronim...@gmail.com wrote:
 Dear all
 I'm using Xubuntu Lucid and I keep getting the same random numbers
 whenever I start a new session of R. For example, I keep getting
 sample(1:1000, 1)
 [1] 87

 or
 rnorm(1:10)
  [1] -1.3618103  0.4241701  1.0720076  0.2208145 -0.5375314 -0.4846588
  [7]  0.7576768  0.6527407 -0.6868786  0.8718527

 I expected that some set.seed() instruction would be present in a
 config file in
 /usr/lib/R/etc/

 but after grepping the only reference came out in Rprofile.site and it
 was commented out:
 # set.seed(1234)

 What else could be causing this? Regards
 Liviu

 sessionInfo()
 R version 2.11.1 (2010-05-31)
 x86_64-pc-linux-gnu

 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] fortunes_1.4-0 sos_1.3-0      brew_1.0-3     IPSUR_1.1


 --
 Do you know how to read?
 http://www.alienetworks.com/srtest.cfm
 http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
 Do you know how to write?
 http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Plot time range with rect or boxplot

2010-10-09 Thread jim holtman
Try this.  You also had some typos on the names and weren't using the
dataframe correctly.

x <- read.table(textConnection("  Company    Pt     Pri     Pub
1       A WO520  8/5/09 2/11/10
2       B WO893 7/30/03 2/24/05
3       A WO258 12/8/08 6/17/10
4       C WO248 1/13/09  9/2/10"),
    header = TRUE, as.is = TRUE)
closeAllConnections()
x$Pri <- as.Date(x$Pri, format = '%m/%d/%y')
x$Pub <- as.Date(x$Pub, format = '%m/%d/%y')
y <- seq(0, 0.5*(length(x$Company)-1), 0.5)
h <- 0.1
plot(range(x$Pri, x$Pub), c(0, nrow(x) - 1), type = 'n')
rect(x$Pri, y-h, x$Pub, y+h, col=c("light blue","pink","yellow","red"))
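
If more control over the date labels is wanted, one possible variation is to suppress the default axes and draw them explicitly with axis.Date(). The helper `plot_date_ranges` and its arguments are invented for this sketch, and the month-year label format is only one choice among many.

```r
# Hypothetical helper: one horizontal bar per row, from `start` to `end`.
plot_date_ranges <- function(start, end, labels, h = 0.3) {
  y <- seq_along(start)
  # empty plot spanning the full date range, with both axes suppressed
  plot(range(c(start, end)), c(0.5, length(y) + 0.5),
       type = "n", xaxt = "n", yaxt = "n", xlab = "", ylab = "")
  rect(start, y - h, end, y + h, col = "light blue")
  axis.Date(1, x = c(start, end), format = "%b %Y")  # dated x-axis
  axis(2, at = y, labels = labels, las = 1)          # one label per bar
  invisible(y)
}

pri <- as.Date(c("8/5/09", "7/30/03", "12/8/08", "1/13/09"), format = "%m/%d/%y")
pub <- as.Date(c("2/11/10", "2/24/05", "6/17/10", "9/2/10"), format = "%m/%d/%y")
plot_date_ranges(pri, pub, c("WO520", "WO893", "WO258", "WO248"))
```

Here each bar gets its own integer y position, so the labels on the left line up with the bars.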


On Sat, Oct 9, 2010 at 3:10 PM, Eric Hu eric...@gilead.com wrote:
 Hi,
 I am trying to use rect (R2.11) to plot a set of data as following


   data
  Company        Pt                  Pri                  Pub
 1    A            WO520          8/5/09          2/11/10
 2    B            WO893         7/30/03          2/24/05
 3    A            WO258         12/8/08          6/17/10
 4    C           WO248         1/13/09           9/2/10


 pri- strptime(pri,%m/%d/%y)
 pub - strptime(pub,%m/%d/%y)

 plot.new()
 plot.window(xlim=c(min(pri,pub),max(pri,pub)),ylim=c(0,length(company)-1))
 %y - seq(0,0.5*(length(company)-1),0.5)
 %h - 0.1
 %rect(pri, y-h, pub, y+h, col=c(light blue,pink,yellow,red))

 Neither xlim nor rect/boxplot recognizes pri/pub with date format. I wonder 
 if there is a good way to deal with the date ploting so the x-axis can 
 reflect the actual time range.

 Thank you,
 Eric




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Memory management in R

2010-10-09 Thread David Winsemius


On Oct 9, 2010, at 4:23 PM, Lorenzo Isella wrote:




My suggestion is to explore other alternatives. (I will admit that I
don't yet fully understand the test that you are applying.)


Hi,
I am trying to partially implement the Lempel Ziv compression  
algorithm.
The point is that compressibility and entropy of a time series are  
related, hence my final goal is to evaluate the entropy of a time  
series.

You can find more at

http://bit.ly/93zX4T
http://en.wikipedia.org/wiki/LZ77_and_LZ78
http://bit.ly/9NgIFt




The two that have occurred to me are Biostrings, which I have already
mentioned, and rle(), which I have illustrated the use of but not
referenced as an avenue. The Biostrings package is part of Bioconductor
(part of the R universe), although you should be prepared for a coffee
break when you install it if you haven't got at least biocLite already
installed. When I installed it last night, 54 other packages that it
depends on were also downloaded and installed. It seems to me that taking
advantage of the coding resources in the molecular biology domain that are
currently directed at decoding the information storage mechanism of life
might be a smart strategy. You have not described the domain you are
working in, but I would guess that the digest package might be biological
in primary application? So forgive me if I am preaching to the choir.

The rle option also occurred to me but it might take a smarter coder
than I to fully implement it. (But maybe Holtman would be up to it. He's
a _lot_ smarter than I.) In your example the long x string is
faithfully represented by two aligned vectors, each 197 elements in
length. The long repeat sequence that broke the grepl mechanism is just
one pair of values.
 rle(x)
Run Length Encoding
lengths: int [1:197] 1 1 2 1 1 4 1 9 1 1 ...
values : chr [1:197] 5d64d58a ac76183b 202fbcc4 78087f5e ...

So maybe as soon as you got to a bundle that was greater than 1/2 the
overall length (as happened in the x case) you could stop, since it
could not have occurred before.



I doubt that rle() can be deployed to replace the Lempel-Ziv (LZ) algorithm
in a trivial way. As a less convoluted example, consider the series


x <- c("d","a","b","d","a","b","e","z")

If i=4, so that the i-th element is the second 'd' in the series,
the shortest sequence starting from i=4 that I do not see in the past of
'd' is


"d","a","b","e", whose length is equal to 4, and that is the value
returned by the function below.
The frustrating thing is that I already have the tools I need, but they
crash for reasons beyond my control on relatively short series.
If anyone can make the function below more robust, that would be a big
help for me.


I already offered the Biostrings package. It provides more robust
methods for string matching than grepl does. Is there a reason that
you choose not to use it?


--
David.

Cheers

Lorenzo

###
entropy_lz <- function(x, i) {
    # everything before position i (empty when i == 1)
    past <- x[seq_len(i - 1)]
    n <- length(x)
    lp <- length(past)
    future <- x[i:n]
    go_on <- 1
    count_len <- 0
    past_string <- paste(past, collapse="#")

    # grow the candidate sequence until it is no longer found in the past
    while (go_on > 0) {
        new_seq <- x[i:(i+count_len)]
        fut_string <- paste(new_seq, collapse="#")
        count_len <- count_len+1
        if (grepl(fut_string, past_string) != 1) {
            go_on <- -1
        }
    }
    return(count_len)
}

x <- c("c","a","b","c","a","b","e","z")

S <- entropy_lz(x,4)


David Winsemius, MD
West Hartford, CT



[R] GC verbose=false still showing report

2010-10-09 Thread Robin Jeffries
I must be reading the help file for gc() wrong. I thought it said that
gc(verbose=FALSE) will run the garbage collection without printing the
Ncells/Vcells summary. However, this is what I get:

> gc(verbose = FALSE)
         used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 267097 14.3     531268  28.4   531268  28.4
Vcells 429302  3.3   20829406 159.0 55923977 426.7

I'm embedding this in an Sweave/TeX file, so I *really* can't have
this printing out. Suggestions other than manually editing the TeX
file?

Robin Jeffries
MS, DrPH Candidate
Department of Biostatistics
UCLA
530-624-0428



Re: [R] GC verbose=false still showing report

2010-10-09 Thread Jeff Newmiller
Try

 invisible(gc())

?
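
The reason this should work: the `verbose` argument only controls the messages printed while the collection runs, whereas the Ncells/Vcells table is the ordinary return value of gc() and is auto-printed like any other value. A short sketch, with `mem` as a made-up variable name:

```r
# gc() returns its summary matrix as an ordinary (visible) value, so it is
# auto-printed at top level whatever `verbose` is set to.
withVisible(gc(verbose = FALSE))$visible              # TRUE

# invisible() marks the value as not to be auto-printed.
withVisible(invisible(gc(verbose = FALSE)))$visible   # FALSE

# Assigning the result also avoids printing and keeps the numbers.
mem <- gc(verbose = FALSE)
rownames(mem)   # "Ncells" "Vcells"
```

In an Sweave document, setting the chunk option results=hide is another way to keep a chunk's printed output out of the TeX file.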

Robin Jeffries rjeffr...@ucla.edu wrote:

I must be reading the help file for gc() wrong. I thought it said that
gc(verbose=FALSE) will run the garbage collection without printing the
Ncells/Vcells summary. However, this is what I get:

> gc(verbose = FALSE)
         used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 267097 14.3     531268  28.4   531268  28.4
Vcells 429302  3.3   20829406 159.0 55923977 426.7

I'm embedding this in an Sweave/TeX file, so I *really* can't have
this printing out. Suggestions other than manually editing the TeX
file?

Robin Jeffries
MS, DrPH Candidate
Department of Biostatistics
UCLA
530-624-0428


---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.



Re: [R] svg plot and dashed lines

2010-10-09 Thread Paul Murrell

Hi

On 29/09/2010 11:15 p.m., Ivan Calandra wrote:

   Dear users,

When I boxplot(), the lines of the whiskers are dashed. However, when I
save in an svg file, the dashed lines of the whiskers are not dashed
anymore.
How can I have the dashed lines in the svg file?
I don't have this problem with a ps file, but I cannot edit such file as
easily as an svg file. That's why I'd like to stick to the svg format.


Assuming you're on Windows, you could try something like ...

# Install the 'Cairo' package from CRAN
library(Cairo)
CairoSVG("test.svg")
boxplot(b~a, data=df)
dev.off()

... on a modern Linux system, this should simplify to ...

svg("test.svg")
boxplot(b~a, data=df)
dev.off()

Paul


Thanks in advance,
Ivan


df <- structure(list(a = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("A", "B"), class
= "factor"), b = c(0.904439748839731, -0.855322875817714,
-0.957288625102814, 0.130401502975395, -1.27765131101282,
-2.08861064654457, 1.10234256081394, -2.05533035069656,
-1.04529859053820, -0.0847903566670016, 1.02553030160793,
0.321170740199536, 1.87419854190502, -0.891404432182873,
0.968745913802415, -0.85229752730528, 0.641555656821046,
1.72455661053506, -0.523097596614304, 1.26729031187194)), .Names =
c("a", "b"), row.names = c(NA, -20L), class = "data.frame")

library(RSvgDevice)
devSVG(file="test.svg")
boxplot(b~a, data=df)
dev.off()




--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/



Re: [R] StrSplit

2010-10-09 Thread Santosh Srinivas
Thanks Jim. Exactly what I needed!

-Original Message-
From: jim holtman [mailto:jholt...@gmail.com] 
Sent: 09 October 2010 22:01
To: Santosh Srinivas
Cc: r-help@r-project.org
Subject: Re: [R] StrSplit

Is this what you are after:

> x <- c("Scheme Code;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date"
+ , ""
+ , "Open Ended Schemes ( Liquid )"
+ , ""
+ , ""
+ , "AIG Global Investment Group Mutual Fund"
+ , "106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend Option;1001.;1001.;1001.;02-Oct-2010"
+ , "106511;AIG India Liquid Fund-Institutional Plan-Growth Option;1210.4612;1210.4612;1210.4612;02-Oct-2010"
+ , "106507;AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option;1001.8765;1001.8765;1001.8765;02-Oct-2010"
+ , "106503;AIG India Liquid Fund-Retail Plan-DailyDividend Option;1001.;1001.;1001.;02-Oct-2010")

> myData <- read.table(textConnection(x[7:10]), sep=';')
> closeAllConnections()
> str(myData)
'data.frame':   4 obs. of  6 variables:
 $ V1: int  106506 106511 106507 106503
 $ V2: Factor w/ 4 levels "AIG India Liquid Fund-Institutional Plan-Daily Dividend Option",..: 1 2 3 4
 $ V3: num  1001 1210 1002 1001
 $ V4: num  1001 1210 1002 1001
 $ V5: num  1001 1210 1002 1001
 $ V6: Factor w/ 1 level "02-Oct-2010": 1 1 1 1
> myData
      V1                                                              V2       V3       V4       V5          V6
1 106506  AIG India Liquid Fund-Institutional Plan-Daily Dividend Option 1001.000 1001.000 1001.000 02-Oct-2010
2 106511          AIG India Liquid Fund-Institutional Plan-Growth Option 1210.461 1210.461 1210.461 02-Oct-2010
3 106507 AIG India Liquid Fund-Institutional Plan-Weekly Dividend Option 1001.876 1001.876 1001.876 02-Oct-2010
4 106503          AIG India Liquid Fund-Retail Plan-DailyDividend Option 1001.000 1001.000 1001.000 02-Oct-2010
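
A possible refinement of the same idea is to name the columns from the header row and keep the text columns as character strings. The two sample lines and the column names below are invented for illustration and are not the real fund data.

```r
# Made-up lines in the same semicolon-delimited layout as the question
lines <- c("100001;Fund A-Daily Dividend Option;1001.5;1001.5;1001.5;02-Oct-2010",
           "100002;Fund B-Growth Option;1210.25;1210.25;1210.25;02-Oct-2010")

# Parse the lines as if they were a small file.
con <- textConnection(lines)
funds <- read.table(con, sep = ";", stringsAsFactors = FALSE,
                    col.names = c("Code", "Name", "NAV",
                                  "Repurchase", "Sale", "Date"))
close(con)

# Month abbreviations such as "Oct" are matched using the current locale,
# so this step assumes an English locale.
funds$Date <- as.Date(funds$Date, format = "%d-%b-%Y")
str(funds)
```

Closing the one connection that was opened avoids closeAllConnections(), which would also shut any other connections that happen to be in use.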




On Sat, Oct 9, 2010 at 12:18 PM, Santosh Srinivas
santosh.srini...@gmail.com wrote:
 Newbie question ...

 I am looking for something equivalent to read.delim but which accepts a text
line as a parameter instead of a file input.

 Below is my problem: I'm unable to get the exact output, which is a simple
data frame of the data where the delimiter exists ... coming quite close
though

 I have a data frame with 10 lines called MF_Data
 MF_Data [1:10]
  [1] Scheme Code;Scheme Name;Net Asset Value;Repurchase Price;Sale
Price;Date
  [2] 
  [3] Open Ended Schemes ( Liquid )
  [4] 
  [5] 
  [6] AIG Global Investment Group Mutual Fund
  [7] 106506;AIG India Liquid Fund-Institutional Plan-Daily Dividend
Option;1001.;1001.;1001.;02-Oct-2010
  [8] 106511;AIG India Liquid Fund-Institutional Plan-Growth
Option;1210.4612;1210.4612;1210.4612;02-Oct-2010
  [9] 106507;AIG India Liquid Fund-Institutional Plan-Weekly Dividend
Option;1001.8765;1001.8765;1001.8765;02-Oct-2010
 [10] 106503;AIG India Liquid Fund-Retail Plan-DailyDividend
Option;1001.;1001.;1001.;02-Oct-2010


 Now for the lines below .. they are delimited by ; ... I am using

  tempTxt <- MF_Data[7]
  MF_Data_F <- unlist(strsplit(tempTxt, ";", fixed = TRUE))
  tempTxt <- MF_Data[8]
  MF_Data_F1 <- unlist(strsplit(tempTxt, ";", fixed = TRUE))
  MF_Data_F <- rbind(MF_Data_F, MF_Data_F1)

 But MF_Data_F is not a simple 2x6 data frame, which is what I want





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] GC verbose=false still showing report

2010-10-09 Thread Robin Jeffries
invisible(gc())

worked perfectly. Thanks Jeff.

@ Josh: I know how to toggle showing/hiding command echoes, but I
haven't figured out how to toggle any printed output on or off.
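
For the archive, a short sketch of why invisible() helps here: gc() returns its summary matrix visibly, so at top level R auto-prints it regardless of verbose. The names res and printed are only illustrative. In an Sweave chunk the results=hide option is another route, mentioned here as a general Sweave feature rather than something tested in this thread.

```r
# Assigning the result is enough to stop top-level auto-printing.
res <- gc()

# invisible() marks the value as "do not auto-print".
invisible(gc())

# capture.output() shows what would have reached the console: nothing.
printed <- capture.output(invisible(gc()))
length(printed)

# In Sweave, a chunk header such as  <<echo=FALSE, results=hide>>=
# hides both the command and anything it prints.
```

The summary is still available in res if the Ncells/Vcells figures are wanted later.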




On Sat, Oct 9, 2010 at 5:10 PM, Robin Jeffries rjeffr...@ucla.edu wrote:
 I must be reading the help file for gc() wrong. I thought it said that
 gc(verbose=FALSE) will run the garbage collection without printing the
 Ncells/Vcells summary. However, this is what I get:

 gc(verbose = FALSE)
         used (Mb) gc trigger  (Mb) max used  (Mb)
 Ncells 267097 14.3     531268  28.4   531268  28.4
 Vcells 429302  3.3   20829406 159.0 55923977 426.7

 I'm embedding this in an Sweave/TeX file, so I *really* can't have
 this printing out. Suggestions other than manually editing the TeX
 file?

 Robin Jeffries
 MS, DrPH Candidate
 Department of Biostatistics
 UCLA
 530-624-0428




Re: [R] Loss of precision in read.csv.

2010-10-09 Thread steven mosher
Ha Thanks,

  That was it.

On Sat, Oct 9, 2010 at 2:38 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:

 Hi Steven,

 As near as I can tell, no precision is lost.  R is just being
 courteous and not excessively filling our consoles.  Try:

 print(airports[1, "latitude_deg"], digits = 22)

 which is the most digits R will print (although internally it can
 store more I believe).

 Alternately, you can convert it to character class:

 as.character(airports[1, ])

 So in short, this is just a cosmetic feature of presenting the data,
 not its actual storage.

 Cheers,

 Josh

 On Sat, Oct 9, 2010 at 1:33 PM, steven mosher mosherste...@gmail.com
 wrote:
  Given a csv file from this location
 
  Airports <- "http://www.ourairports.com/data/airports.csv"
 
  download.file(Airports, basename(Airports))
 
  airports <- read.csv("airports.csv", encoding="UTF-8")
 
  airports[1,]
 
      id ident     type              name latitude_deg longitude_deg
  1 6523   00A heliport Total Rf Heliport      40.0708      -74.9336
    elevation_ft continent iso_country iso_region municipality scheduled_service
  1           11        NA          US      US-PA     Bensalem                no
    gps_code iata_code local_code home_link wikipedia_link keywords
  1      00A                  00A
 
  And the precision is lost, which we can show by using readLines:
 
  fred <- readLines("airports.csv")
 
  fred[2]
  [1] "6523,\"00A\",\"heliport\",\"Total Rf Heliport\",40.07080078125,-74.9336013793945,11,\"NA\",\"US\",\"US-PA\",\"Bensalem\",\"no\",\"00A\",,\"00A\",,,"
 
 
 
  I tried various approaches: using colClasses, switching to read.table,
  specifying dec=".".
 
 
  I tested read.csv and it does preserve precision on my test case, but not
  on this data.
 
 
  Ideas?
 
 



 --
 Joshua Wiley
 Ph.D. Student, Health Psychology
 University of California, Los Angeles
 http://www.joshuawiley.com/




[R] read.table issue

2010-10-09 Thread Santosh Srinivas
Dear R-Group,

I am getting the error message "incomplete final line found by
readTableHeader" in the code below.

It seems to me that the error is caused by the quote character in the text
data. Is there an easy way to handle this, or should I substitute it out?


> tempTxt <- "100589;Canara Robeco Expo-Income Plan;18.92;18.92;19.35;02-Apr-2007
+ "
> read.table(textConnection(tempTxt), sep=';')
      V1                             V2    V3    V4    V5          V6
1 100589 Canara Robeco Expo-Income Plan 18.92 18.92 19.35 02-Apr-2007
> tempTxt <- "103272;Canara Robeco Fortune '94;30.07;30.07;30.75;02-Apr-2007
+ "
> read.table(textConnection(tempTxt), sep=';')
Error in read.table(textConnection(tempTxt), sep = ";") : 
  incomplete final line found by readTableHeader on 'tempTxt'

Thanks,
Santosh



[R] Help needed for getYahooData in TTR package writing the Yahoo data to excel

2010-10-09 Thread missvanilla

Dear all, 

I'm totally new to R. Recently I've been trying to use getYahooData in TTR
package in order to download stock index daily open/high/low/close. The
downloaded data is in the format of 

               Open     High      Low    Close Volume
2000-01-04 18937.45 19187.61 18937.45 19002.86      0
2000-01-05 19003.51 19003.51 18221.82 18542.55      0
2000-01-06 18574.01 18582.74 18168.27 18168.27      0
2000-01-07 18194.05 18285.73 18068.10 18193.41      0
2000-01-11 18246.10 18887.56 18246.10 18850.92      0
2000-01-12 18780.17 18811.87 18626.92 18677.42      0
2000-01-13 18667.18 18845.03 18667.18 18833.29      0
2000-01-14 18882.99 19058.02 18733.83 18956.55      0
2000-01-17 19025.62 19442.58 19025.62 19437.23      0
2000-01-18 19412.47 19412.47 19145.17 19196.57      0

However, when I attempted to write the data to Excel using write.table,
the dates in the first column became 1, 2, 3, 4 in the Excel file. The same
problem happened when write.csv was used.

If you run these two lines of code you'll see what I mean. Before running
the code, the TTR package needs to be loaded.

N225 <- getYahooData("^N225", 2101, )
write.table(N225, "Nikkei.xls", sep='\t', row.name = TRUE, col.name = NA)

Appreciate your kind assistance! Thanks a lot in advance. 



Re: [R] Help needed for getYahooData in TTR package writing the Yahoo data to excel

2010-10-09 Thread David Winsemius


On Oct 9, 2010, at 10:54 PM, missvanilla wrote:



Dear all,

I'm totally new to R. Recently I've been trying to use getYahooData in TTR
package in order to download stock index daily open/high/low/close. The
downloaded data is in the format of

               Open     High      Low    Close Volume
2000-01-04 18937.45 19187.61 18937.45 19002.86      0
2000-01-05 19003.51 19003.51 18221.82 18542.55      0
2000-01-06 18574.01 18582.74 18168.27 18168.27      0
2000-01-07 18194.05 18285.73 18068.10 18193.41      0
2000-01-11 18246.10 18887.56 18246.10 18850.92      0
2000-01-12 18780.17 18811.87 18626.92 18677.42      0
2000-01-13 18667.18 18845.03 18667.18 18833.29      0
2000-01-14 18882.99 19058.02 18733.83 18956.55      0
2000-01-17 19025.62 19442.58 19025.62 19437.23      0
2000-01-18 19412.47 19412.47 19145.17 19196.57      0

However, when I attempted to write the data to Excel using write.table,
the dates in the first column became 1, 2, 3, 4 in the Excel file. The same
problem happened when write.csv was used.

If you run these two lines of code you'll see what I mean. Before running
the code, the TTR package needs to be loaded.

N225 <- getYahooData("^N225", 2101, )
write.table(N225, "Nikkei.xls", sep='\t', row.name = TRUE, col.name = NA)


There is a well-described problem with write.table files going into  
Excel. There is no leading item or tab on the first row. You need to  
insert an extra cell and move the header over one position. Then you  
won't be misinterpreting your row.names as dates.
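
One way to sidestep the header offset altogether is to write the dates as an ordinary named column and switch row names off. The sketch below is only an illustration under assumptions: it uses a small hand-made data frame with the dates as row names (two rows and two columns copied from the post), and prices, out and path are made-up names. If the downloaded object is a time-series class rather than a plain data frame, the dates may live in an index instead of in row names, and that case is not covered here.

```r
# Stand-in for the downloaded data: dates held as row names.
prices <- data.frame(
  Open  = c(18937.45, 19003.51),
  Close = c(19002.86, 18542.55),
  row.names = c("2000-01-04", "2000-01-05")
)

# Promote the row names to a real column so the header lines up.
out <- data.frame(Date = rownames(prices), prices, row.names = NULL)

# Write without row names; every column now has its own header cell.
path <- tempfile(fileext = ".csv")
write.csv(out, path, row.names = FALSE)
readLines(path)
```

The first line of the file then carries three header cells (Date, Open, Close) for three data columns, so a spreadsheet has nothing to shift.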


--
David


Appreciate your kind assistance! Thanks a lot in advance.



[R] read.xls??

2010-10-09 Thread Matt Curcio
Greeting all,
I am having a little trouble finding the 'right' package that will
read in .xls Excel spreadsheets. My Ubuntu base does not seem to have
the ability to read them.

Any suggestions?
Cheers,
M



[R] How to add a new column to a matrix?

2010-10-09 Thread Lakshmi Kastury

Hi -
I am a beginner to the R language. I have written the following matrix: 
Z.mat=matrix(c(2,2,2,1,1,1,3,2,1,6,5,4,9,1,1,2,3,2), nrow=6)
I would like to add a 4th column consisting of: 6, 9, 8, 15, 16, 17

I would also like to name each column a, b, c, d as well.

Thanks!
  


Re: [R] read.xls??

2010-10-09 Thread Gabor Grothendieck
On Sat, Oct 9, 2010 at 11:56 PM, Matt Curcio matt.curcio...@gmail.com wrote:
 Greeting all,
 I am having a little trouble finding the 'right' package that will
 read in .xls Excel spreadsheets. My Ubuntu base does not seem to have
 the ability to read them.

For various alternatives see:
http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows&s=excel

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] How to add a new column to a matrix?

2010-10-09 Thread Joshua Wiley
Hi,

This should do it.  I tried to comment to explain things.

Z.mat <- matrix(c(2,2,2,1,1,1,3,2,1,6,5,4,9,1,1,2,3,2), nrow=6)

# column bind data together
Z.mat <- cbind(Z.mat, c(6,9,8,15,16,17))

# add names to the 2 dimensions of Z.mat
# the first element of the list is the row names, left as empty
# the second element is the column names
# 'letters' is a built in vector of the lower case letters of
# the Latin alphabet
dimnames(Z.mat) <- list(NULL, letters[1:4])

# Another way would be to use
colnames(Z.mat) <- letters[1:4]

# For documentation see especially
? cbind
? dimnames

Hope that helps,

Josh

On Sat, Oct 9, 2010 at 8:16 PM, Lakshmi Kastury
skast...@students.poly.edu wrote:

 Hi -
 I am a beginner to the R language. I have written the following matrix: 
 Z.mat=matrix(c(2,2,2,1,1,1,3,2,1,6,5,4,9,1,1,2,3,2), nrow=6)
 I would like to add a 4th column consisting of: 6, 9, 8, 15, 16, 17

 I would also like to name each column a, b, c, d as well.

 Thanks!





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



[R] Mapping the coordinates!

2010-10-09 Thread Mehdi Zarrei










Hello,

I have a series of coordinates (latitudes and longitudes), each one or
several associated with a code (from 1 to 28). I used the function
points(latitude, longitudes) to transfer them to a pre-prepared map.

1- I wonder how I might be able to automatically add the codes (1-28) to
the map too?

2- Moreover, in most cases a few codes share identical coordinates. What
is the function to avoid overlapping of codes on the map?

3- I want to draw a closed line around some geographical areas to define
the habitats.

Your help in any way (introducing manuals, codes, etc.) is appreciated.

All the best,

Mehdi



  


Re: [R] How to add a new column to a matrix?

2010-10-09 Thread David Winsemius


On Oct 9, 2010, at 11:16 PM, Lakshmi Kastury wrote:



Hi -
I am a beginner to the R language. I have written the following matrix:
Z.mat=matrix(c(2,2,2,1,1,1,3,2,1,6,5,4,9,1,1,2,3,2), nrow=6)

I would like to add a 4th column consisting of: 6, 9, 8, 15, 16, 17


?cbind


I would also like to name each column a, b, c, d as well.


The help page for matrix seems to be perfectly clear on this point.

?matrix
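
For readers of the archive, the part of ?matrix being referred to is presumably the dimnames argument. A minimal sketch using the numbers from the question, with the fourth column appended by cbind() and the names set afterwards, next to the one-step form that passes dimnames to matrix() directly (Z.new is a made-up name):

```r
# Original 6 x 3 matrix from the question.
Z.mat <- matrix(c(2,2,2,1,1,1,3,2,1,6,5,4,9,1,1,2,3,2), nrow = 6)

# Append the fourth column, then name the columns.
Z.mat <- cbind(Z.mat, c(6, 9, 8, 15, 16, 17))
colnames(Z.mat) <- c("a", "b", "c", "d")

# One-step alternative: supply all 24 values and the names at creation.
Z.new <- matrix(c(2,2,2,1,1,1,3,2,1,6,5,4,9,1,1,2,3,2, 6,9,8,15,16,17),
                nrow = 6,
                dimnames = list(NULL, c("a", "b", "c", "d")))

identical(Z.mat, Z.new)
```

The NULL in the dimnames list leaves the rows unnamed, so only the columns get labels.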

--
David


Thanks!



Re: [R] read.table issue

2010-10-09 Thread jim holtman
The problem is that you have an unbalanced quote (') in your input.
You need to specify quote = '' in read.table:

> tempTxt <- "103272;Canara Robeco Fortune '94;30.07;30.07;30.75;02-Apr-2007
+ "
> read.table(textConnection(tempTxt), sep=';', quote = '')
      V1                        V2    V3    V4    V5          V6
1 103272 Canara Robeco Fortune '94 30.07 30.07 30.75 02-Apr-2007

The quote is '94 in the string.
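
A self-contained version of the fix, as a sketch: the input line is the one from the question, while the names con and fund, the explicit close() and stringsAsFactors = FALSE are illustrative additions rather than part of the reply above.

```r
line <- "103272;Canara Robeco Fortune '94;30.07;30.07;30.75;02-Apr-2007"

# quote = "" switches quote processing off, so the apostrophe in '94
# is read as an ordinary character instead of opening a quoted field.
con <- textConnection(line)
fund <- read.table(con, sep = ";", quote = "", stringsAsFactors = FALSE)
close(con)

fund$V2
```

The same setting is worth considering for any free-text field, such as fund names, where apostrophes can turn up unpaired.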

On Sat, Oct 9, 2010 at 10:05 PM, Santosh Srinivas
santosh.srini...@gmail.com wrote:
 Dear R-Group,

 I am getting the error message "incomplete final line found by
 readTableHeader" in the code below.

 It seems to me that the error is caused by the quote character in the text
 data. Is there an easy way to handle this, or should I substitute it out?


 > tempTxt <- "100589;Canara Robeco Expo-Income Plan;18.92;18.92;19.35;02-Apr-2007
 + "
 > read.table(textConnection(tempTxt), sep=';')
       V1                             V2    V3    V4    V5          V6
 1 100589 Canara Robeco Expo-Income Plan 18.92 18.92 19.35 02-Apr-2007
 > tempTxt <- "103272;Canara Robeco Fortune '94;30.07;30.07;30.75;02-Apr-2007
 + "
 > read.table(textConnection(tempTxt), sep=';')
 Error in read.table(textConnection(tempTxt), sep = ";") :
   incomplete final line found by readTableHeader on 'tempTxt'

 Thanks,
 Santosh





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Mapping the coordinates!

2010-10-09 Thread Johannes Huesing
Mehdi Zarrei gagzar...@yahoo.com [Sun, Oct 10, 2010 at 06:11:23AM CEST]:
 
 
 
   
   
   
   
   
   
 
 Hello,
 
 I have a series of coordinates (latitudes and longitudes), each one or
 several associated with a code (from 1 to 28). I used the function
 points(latitude, longitudes) to transfer them to a pre-prepared map.
 
 1- I wonder how I might be able to automatically add the codes (1-28) to
 the map too?

Type ?text at the R prompt.

 
 2- Moreover, in most cases a few codes share identical coordinates. What
 is the function to avoid overlapping of codes on the map?
 

jitter() adds some noise; I don't know if this is sufficient for you.

 
 3- I want to draw a closed line around some geographical areas to define
 the habitats.

If you search for convex hull in rseek.org, you may find something 
relevant for you. I did this and the third result was

http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=61
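
The three hints can be put together in base R roughly as follows. This is a sketch on invented coordinates, not the poster's data: lon, lat and code are made-up vectors, the jitter amount of 0.05 degrees is an arbitrary choice, and a convex hull is only one possible reading of "a closed line around an area".

```r
# Invented example data: six sites, the first two at the same spot.
lon  <- c(10.1, 10.1, 10.4, 10.9, 11.2, 10.6)
lat  <- c(50.2, 50.2, 50.8, 50.1, 50.6, 50.4)
code <- 1:6

set.seed(1)                       # make the jitter reproducible
plot(lon, lat, pch = 19, xlab = "Longitude", ylab = "Latitude")

# 1 and 2: label each point, nudging labels so coincident ones separate.
text(jitter(lon, amount = 0.05), jitter(lat, amount = 0.05),
     labels = code, pos = 3)

# 3: chull() returns the indices of the points on the convex hull;
# polygon() joins them into a closed outline.
hull <- chull(lon, lat)
polygon(lon[hull], lat[hull], border = "red")
```

On a prepared map the plot() call would be replaced by whatever draws the map, with text() and polygon() added on top in the same way as points().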

-- 
Johannes Hüsing               There is something fascinating about science.
                              One gets such wholesale returns of conjecture
mailto:johan...@huesing.name  from such a trifling investment of fact.
http://derwisch.wikidot.com   (Mark Twain, "Life on the Mississippi")



Re: [R] Loss of precision in read.csv.

2010-10-09 Thread Joshua Wiley
On Sat, Oct 9, 2010 at 2:38 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:
 Hi Steven,

 As near as I can tell, no precision is lost.  R is just being
 courteous and not excessively filling our consoles.  Try:

 print(airports[1, "latitude_deg"], digits = 22)

 which is the most digits R will print (although internally it can
 store more I believe).

Dr. Heiberger was kind enough to point out to me that the maximum is 53
binary digits, as stated in the R FAQ 7.31:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

Slides by the same from the recent UseR! 2010 conference also provide
further explanation:
http://user2010.org/slides/Heiberger.pdf

One library that allows further precision is Rmpfr based on:
http://www.mpfr.org/

To give a small example borrowing the sprintf() display from Dr.
Heiberger's slides:

> library(Rmpfr)
> sprintf("%+17.17f", 2/3)
[1] "+0.66666666666666663"
> mpfr(2, 260)/3
1 'mpfr' number of precision  260   bits
[1] 
0.6685
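
The display-versus-storage point from the original question can also be checked in base R without extra packages. A minimal sketch, using a two-line stand-in for the airports file (the csv vector and the reduced set of columns are assumptions made for illustration; the latitude value is the one quoted from the real file):

```r
# Tiny stand-in for airports.csv with the latitude from the thread.
csv <- c("name,latitude_deg",
         "Total Rf Heliport,40.07080078125")
con <- textConnection(csv)
airports <- read.csv(con, stringsAsFactors = FALSE)
close(con)

x <- airports$latitude_deg

# Default printing uses 7 significant digits, hence the short display.
print(x)

# Asking for more digits shows that the stored value is intact.
print(x, digits = 15)

# Direct comparison with the value in the file.
x == 40.07080078125
```

Raising options(digits = ) changes the default display for a whole session in the same way.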


My sincerest apologies for the previous misinformation.

Josh

snip
