Re: [R] Integrating R and Textmate

2008-07-25 Thread Hans-Jörg Bibiko


On 25.07.2008, at 17:06, Rob Goedman wrote:


Art,

Could it be the case that TextMate is activating the wrong version of R
(2.6 vs. 2.7.1)?

I do not believe this. TextMate is using the normal R Terminal.

I invoked the following command in an Rdaemon environment:
install.packages("ade4", repos="http://cran.r-project.org",
contriburl = contrib.url("http://streaming.stat.iastate.edu/CRAN/",
type="mac.binary"), dependencies=TRUE, installWithVers=TRUE)


with Mac OSX 10.4.11, R 2.7.0
and I got this

Package: ade4
Version: 1.4-9
Date: 2008/5/23
...
URL: http://pbil.univ-lyon1.fr/ADE-4, Mailing list:
http://listes.univ-lyon1.fr/wws/info/adelist
Packaged: Fri May 23 16:28:17 2008; penel
Built: R 2.7.0; universal-apple-darwin8.10.1; 2008-05-26 03:43:26; unix

Which version of TextMate are you using, and especially which R bundle?

--Hans

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] variable as part of file name

2008-07-02 Thread Hans-Jörg Bibiko


On 02.07.2008, at 15:31, Laura Poggio wrote:


Dear all,
sorry for this very basic question, but I did not find any good  
example yet.


I would like to set up a variable that can be recalled later to
substitute a part of a file name.
As example:

var_filename = as.name("aaa")

jpeg("var_filename.jpg")
plot()
dev.off()


try:
var_filename <- 'foo'
jpeg( paste( var_filename, '.jpg', sep='' ) )

See ?paste
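
A slightly fuller sketch of the same idea. The names 'foo' and 'bar' and the use of tempdir() are only placeholders for illustration:

```r
# Build file names from a variable; 'foo' and 'bar' are placeholder names.
var_filename <- c("foo", "bar")

# paste() is vectorised, so several names can be built in one call.
file_names <- paste(var_filename, ".jpg", sep = "")

# file.path() joins directory and file name with the platform separator.
full_paths <- file.path(tempdir(), file_names)

# sprintf() is an alternative when a number has to be formatted (zero-padded here).
numbered <- sprintf("%s_%03d.jpg", "foo", 1:2)

print(file_names)
print(numbered)
```

Any of these strings can then be handed to jpeg() as the file argument.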

--Hans



Re: [R] Convert character string to number

2008-06-21 Thread Hans-Jörg Bibiko


On 21.06.2008, at 01:36, Ken Liu wrote:

I would like to convert a character vector

xxx <- c("1/2", "1/4")

to

yyy <- c(0.5, 0.25)


, but as.numeric didn't work for me.  Could anyone give me a hint  
please?


There are many, many ways, and they depend on the structure of
xxx. If you only have such fractions you can use this naïve approach:


as.numeric( gsub("(\\d+)/(\\d+)", "\\1", xxx, perl=T) ) /
as.numeric( gsub("(\\d+)/(\\d+)", "\\2", xxx, perl=T) )
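
Spelled out step by step, as a minimal sketch that assumes every element has the form "digits/digits" (the anchors ^ and $ and the variable names are additions for illustration):

```r
xxx <- c("1/2", "1/4")

# One pattern with two capture groups: numerator and denominator.
pattern <- "^(\\d+)/(\\d+)$"

# Keep only the first group, then only the second, and convert each to numeric.
numerator   <- as.numeric(sub(pattern, "\\1", xxx, perl = TRUE))
denominator <- as.numeric(sub(pattern, "\\2", xxx, perl = TRUE))

yyy <- numerator / denominator
print(yyy)
```

Elements that do not match the pattern are passed through unchanged by sub() and would then become NA (with a warning) in as.numeric().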


--Hans



Re: [R] cutting out numbers from vectors

2008-06-21 Thread Hans-Jörg Bibiko


On 20.06.2008, at 19:27, calundergrad wrote:



i have a vector with a string of number
e.g

[1] 0113001 001130011000 001130012000 001130013000 001130016000
[6] 001130018000

i want a vector with the same numbers except with the last three
digits of every factor cut off.
e.g

[1] 01130010 001130011 001130012 001130013 001130016
[6] 001130018

how do i accomplish this?


One way:

gsub("(.*)(.{3})", "\\1", YOUR_VECTOR)
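
A small sketch with made-up sample values, assuming the numbers are stored as character strings (for a factor, wrap it in as.character() first). substr() is shown as a regex-free alternative:

```r
# Sample IDs, invented for illustration; kept as character to preserve leading zeros.
ids <- c("001130011000", "001130012000", "001130013000")

# Regex variant: capture everything except the last three characters.
trimmed_regex <- sub("(.*)(.{3})$", "\\1", ids)

# Same result without a regular expression.
trimmed_substr <- substr(ids, 1, nchar(ids) - 3)

print(trimmed_regex)
```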


--Hans



Re: [R] Convert character string to number

2008-06-21 Thread Hans-Jörg Bibiko

On 21.06.2008, at 17:25, Marc Schwartz wrote:

on 06/21/2008 09:18 AM Gabor Grothendieck wrote:

On Sat, Jun 21, 2008 at 8:31 AM, Wacek Kusnierczyk
[EMAIL PROTECTED] wrote:

Hans-Jörg Bibiko wrote:

On 21.06.2008, at 01:36, Ken Liu wrote:

I would like to convert a character vector
xxx <- c("1/2", "1/4")
to
yyy <- c(0.5, 0.25)

as.numeric( gsub("(\\d+)/(\\d+)", "\\1", xxx, perl=T) ) /
as.numeric( gsub("(\\d+)/(\\d+)", "\\2", xxx, perl=T) )

or:

library(gsubfn)
as.numeric(
   gsubfn("([0-9]+)/([0-9]+)",
   numerator+denominator~as.numeric(numerator)/as.numeric(denominator),
   xxx, backref=-2)
)

strapply, also in the gsubfn package, could be used here in a
similar way too:

library(gsubfn)
strapply(xxx, "([0-9]+)/([0-9]+)", ~ as.numeric(x) / as.numeric(y),
   backref = -2, simplify = c)

[1] 0.50 0.25
or

fn$sapply(strapply(xxx, "[0-9]+", as.numeric), ~ x[1]/x[2])

[1] 0.50 0.25


I may have missed this one in the thread someplace, but I did not  
note any solutions based upon using strsplit(). So:


 sapply(strsplit(xxx, split = "/"),
 function(x) as.numeric(x[1]) / as.numeric(x[2]))
[1] 0.50 0.25
or
 apply(sapply(strsplit(xxx, split = "/"), as.numeric),
 2, function(x) x[1] / x[2])
[1] 0.50 0.25


OK. Here is another one:
* for UNIX/Mac only, and meant as a joke [to give R a rest ;) ]

sapply(as.list(xxx),
	function(x) as.numeric(system(paste("echo 'scale=10;", x, "' | bc", sep=''),
	intern=T)))


--Hans



Re: [R] (no subject)

2008-06-19 Thread Hans-Jörg Bibiko


On 19.06.2008, at 07:24, Paul Adams wrote:


Hello everyone,
I am wanting to replace an element in a matrix with NA. I have used
the following code
dat<-read.table(file="C:\\Documents and Settings\\txt",header=T,row.names=1)

file.show(file="C:\\Documents and Settings\\txt")
Z.matrix<-as.matrix(dat)
Y<-dat[,46:63]
X<-dat[1,51]
dat[1,51]<-NA
Whenever I use this code I get the original value when I type
show(X). I run the script and type
show(X) and the original value is still there. What am I doing wrong?


Well, actually nothing.
R works line by line. You set X<-dat[1,51] with, let's say, '4711'.
Fine, X is now '4711'. Then you change the cell with dat[1,51]<-NA. Fine.
If you type show(dat) you'll see that the cell 1,51 is now NA. But X
is still the same, because X is NOT bound to the content of the cell
dat[1,51]; it holds a copy of the value taken at assignment time.
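
A tiny sketch of this behaviour, with an invented two-column data frame standing in for the real data:

```r
# Invented stand-in for the data read from file.
dat <- data.frame(a = c(4711, 2), b = c(3, 4))

X <- dat[1, 1]     # X receives a copy of the current cell value
dat[1, 1] <- NA    # changing the cell afterwards does not touch X

print(X)           # still the original value
print(dat[1, 1])   # the cell itself is now NA

# To see the change in X, assign again after modifying dat.
X_after <- dat[1, 1]
```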


--Hans



Re: [R] Pattern Matching Replacement

2008-06-19 Thread Hans-Jörg Bibiko


On 19.06.2008, at 20:17, ppatel3026 wrote:



I would like to replace \r\n with "" in a character string, where
\r\n exists only between < and >, how could I do that?

Initial:
characterString = "<XML><tag1 id=\"F\r\n2\"></t\r\nag1>\r\n<tag\r\n2></tag2></XML>"

Result:
characterString = "<XML><tag1 id=\"F2\"></tag1>\r\n<tag2></tag2></XML>"


Tried with sub (below) but it only replaces the first instance and I
am not sure how to pattern match so that it only replaces \r\n that exist
within tags (< and >).

sub("\r\n", "", charStream)


It's only a very first idea:

gsub("(?<=<)([^>]*?)\\r\\n([^>]*?)(?=>)", "\\1\\2", characterString,
perl=T)
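
A different way to approach it, sketched with gregexpr() and regmatches<- from base R: first locate every <...> tag, then delete \r\n only inside those matches. The helper name strip_crlf_in_tags is made up, and the sketch assumes that ">" never occurs inside an attribute value:

```r
# Hypothetical helper: remove "\r\n" only inside <...> tags.
strip_crlf_in_tags <- function(s) {
  # Positions of all tags; [^>] also matches line breaks.
  m <- gregexpr("<[^>]*>", s, perl = TRUE)
  # Clean each matched tag and write it back in place.
  regmatches(s, m) <- lapply(regmatches(s, m), function(tags) {
    gsub("\r\n", "", tags, fixed = TRUE)
  })
  s
}

characterString <- "<XML><tag1 id=\"F\r\n2\"></t\r\nag1>\r\n<tag\r\n2></tag2></XML>"
cleaned <- strip_crlf_in_tags(characterString)
```

Because each tag is handled as a whole, several \r\n inside the same tag are removed as well, while the \r\n between tags is left alone.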



--Hans



Re: [R] Editor for Mac OSX

2008-06-18 Thread Hans-Jörg Bibiko


On 18.06.2008, at 18:09, Graham Smith wrote:


Have a look at TextMate  http://macromates.com/


There are three extensions (bundles) available dealing with R for  
TextMate [up to now in the Review repository]:

[in very short terms]
1) R
- writing R scripts and executing them (plots are available inside as PDFs)
- one can execute more than one R script
- fast syntax highlighting, snippets (macros) for fast writing etc.
- command completion
- GUI for inserting function parameters
- displaying the command signature
- tidy function
- drag'n'drop facilities (drag a csv to the window and it inserts
  read.csv('filename'))



2) R Console (R.app)
- remote-controls R.app via AppleScript, but with all the goodies of TextMate


3) R Console (Rdaemon) - an ESS-like extension
- R runs hidden inside of TextMate
- it combines a text editor and the R console in a sophisticated way
- easily extensible
- many GUI-like elements (Graphics/Package Manager ...)
- written only in scripting languages
- safer against crashes: if R/TextMate/Mac crashes one has at least the
  entire last session as a text file to reconstruct from



and many many many more

If you want to know more, let me know.

--Hans



2008/6/18 Sebastian Leuzinger [EMAIL PROTECTED]:


Dear R-list
I am (forced) to change from Linux to Mac and am now looking for a new
editor for R. I would like one that features a split window (console +
editor) as well as syntax highlighting. Can anyone help? Especially the
split-window feature does not seem to be easily available in the editors
described on the R-help site, except Emacs, which I am reluctant to start
using.




Re: [R] gsub and multiple replacements

2008-06-02 Thread Hans-Jörg Bibiko


On 02.06.2008, at 17:27, Ng Stanley wrote:

I would like to replace "A B" by "A-B" and "AA (DD)" by
"AA" using a single gsub. Is that possible besides using two gsub ?




Could you be a bit more precise?

If you are dealing with two fixed strings then you can write

ifelse(theString == "A B", "A-B", "AA")

if not, one could find a regexp to solve that problem, but one could  
also use gsub in a cascade:


gsub('regexp1', 'replace1', gsub('regexp2', 'replace2', theString) )  
etc.
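
A concrete version of such a cascade, as a sketch. The two patterns are guesses at what the original poster needs and are only meant to illustrate the nesting:

```r
theString <- c("A B", "AA (DD)")

# Inner call: drop a trailing " (DD)"; outer call: turn the exact string "A B" into "A-B".
# Both patterns are assumptions made for this illustration.
result <- gsub("^A B$", "A-B",
               gsub(" \\(DD\\)$", "", theString))

print(result)
```

The inner gsub() runs first, so the order matters whenever the first replacement could create or destroy a match for the second.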


--Hans



Re: [R] Unicode characters (R 2.7.0 on Windows XP SP3 and Hardy Heron)

2008-06-01 Thread Hans-Jörg Bibiko

On 31.05.2008, at 00:11, Prof Brian Ripley wrote:


On Fri, 30 May 2008, Duncan Murdoch wrote:
But I think with Brian Ripley's work over the last while, R for  
Windows actually handles utf-8 pretty well.  (It might not guess  
at that encoding, but if you tell it that's what you're using...)


Yes. I already mentioned that there was a big step from R 2.6 to R  
2.7 for Windows regarding the support of UTF-8.


R passes around, prints and plots UTF-8 character data pretty well,  
but it translates to the native encoding for almost all character- 
level manipulations (and not just on Windows).  ?Encoding spells  
out the exceptions (and I think the original poster had not read  
it).  As time goes on we may add more, but it is really tedious  
(and somewhat error-prone) to have multiple paths through the code  
for different encodings (and different OSes do handle these  
differently -- Windows' use of UTF-16 means that one character may  
not be one wchar_t).
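
To make the difference between characters and bytes concrete, here is a small sketch using only base functions documented under ?Encoding, ?nchar and ?iconv. The example string is arbitrary:

```r
# "ä" as a UTF-8 string; enc2utf8() makes sure it is stored and marked as UTF-8.
x <- enc2utf8("\u00e4")

print(Encoding(x))              # the declared encoding of the string
print(nchar(x, type = "chars")) # one character ...
print(nchar(x, type = "bytes")) # ... stored in two bytes in UTF-8

# The same character converted to Latin-1 needs a single byte.
y <- iconv(x, from = "UTF-8", to = "latin1")
print(nchar(y, type = "bytes"))
```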



R is becoming more and more popular amongst philologists, linguists,
etc. It is very nice to have one software environment to gather,
analyze, and visualize data based on texts. But linguists, for example,
very often deal with more than one language at the same time.
That's why they have to use a Unicode encoding.
In R they have to use functions dealing with characters, such as
nchar, strsplit, grep/gsub, tolower/toupper, etc.
These functions are, more or less, based on the underlying locale
settings. But why?


It is a very, very painful task to write functions for different
encodings on different platforms. Thus I wonder whether it would be
possible to switch internally to one Unicode encoding. Considering
memory usage, for example, UTF-8 would be an option. Of course,
such a change would REALLY be a BIG challenge in terms of effort,
speed, compatibility, etc. It would also mean avoiding the use of
system libraries.
Maybe this would be a task for R 4.0, or it will remain my eternal private
dream :)


OK. Let me be a bit more realistic.
Another issue is the regular expression engine that is used. On a Mac or
UNIX machine one can set a UTF-8 locale. Fine. But these locales
aren't available under Windows (yet?). Maybe it's worth having a
look at other regexp engines like Oniguruma
( http://www.geocities.jp/kosako3/oniguruma/ ). It supports, among others,
all Unicode encodings. It is used in many applications. I do not know how
difficult it would be to implement such a library in R. But this would
solve, I guess, 80% of the problems of R users who are interested in
text analysis. nchar, strsplit, grep etc. could make use of it.
Maybe one could write such a package for Windows (maybe also for Mac/UNIX,
because Oniguruma has some very nice additional features). Of
course, a string should be piped as a UTF-8 byte stream to the
Oniguruma lib, and I do not know whether this is easily possible in R
for Windows.


Once again, thanks for all the effort done to set up such a wonderful  
piece of software.


--Hans



Re: [R] Unicode characters (R 2.7.0 on Windows XP SP3 and Hardy Heron)

2008-05-30 Thread Hans-Jörg Bibiko

Hi,

to put it simply: Windows cannot handle UTF-8 data. There is no UTF-8
locale available.
If your corpus only contains Russian data, maybe with English glosses etc.,
you can try to set the language of Rgui.exe to Russian.
Then at least you can use grep and strsplit, because they depend
on the chosen locale.



On 30.05.2008, at 17:14, Stefan Th. Gries wrote:



# I can do that all on Linux, but this arises in a context where
# many other character processing issues are explained for Mac,
# Linux, *and* Windows, and I'd hate to have to say this one
# thing, you can't do on Windows


Unfortunately I have to say this quite often :)

Cheers,

--Hans



Re: [R] data.frame indexing

2008-04-28 Thread Hans-Jörg Bibiko


On 28.04.2008, at 16:40, Georg Ehret wrote:


E.g.:

a<-as.data.frame(matrix(rnorm(100),nrow=10,ncol=10))
b<-which(a$V1>0.8)
b

[1]  1  4  6 10

a_indexb<-a[b,]
a_notIndexB<-a[!b,]
nrow(a_notIndexB)

[1] 0

Indexing a on b is not a problem (a_indexb), but how can I get only the
elements left if I take out the elements indexed with b?


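The ! operator only works on logical (BOOLEAN) values; applied to the
integer indices in b, every non-zero index becomes FALSE, so no row is
selected.

ONE possible way to set a_notIndexB is:

a_notIndexB <- a[-1*b, ]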
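
Both routes side by side, as a sketch on a small invented data frame. The logical mask is worth knowing because negative indexing returns no rows at all when b happens to be empty:

```r
# Small invented data frame instead of random numbers, so the result is predictable.
a <- data.frame(V1 = c(0.1, 0.9, 0.5, 1.2), V2 = 1:4)

keep <- a$V1 > 0.8      # logical mask
b <- which(keep)        # integer indices of the TRUE entries

a_indexb        <- a[b, ]      # rows selected by b
a_notIndexB     <- a[-b, ]     # negative indices drop those rows (b must not be empty)
a_notIndexB_alt <- a[!keep, ]  # negating the logical mask also works for an empty selection

print(a_notIndexB)
```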

--Hans



Re: [R] Regular Expressions Help

2008-04-19 Thread Hans-Jörg Bibiko
On 19.04.2008, at 06:46, maud wrote:

 I am having some trouble learning regular expressions. Let me describe
 the general problem I am dealing with. Consider the following setup:

 Joe <- c(1,2,3)
 Bob <- c(2,4,6)
 Alice <- c(9,8,7)

 Matrix <- cbind(Joe, Bob, Alice)
 St <- c("Bob", "Alice", "Alice:Bob")
 [...]
 I have been reading over various posts on regular expressions, but
 really haven't made any progress. As far as I can tell there aren't
 standard string functions in R. (Also, as an aside, is there a
 wildcard character in R? I'd want something so if x="Bob" a statement
 of the form x=="B*b" would evaluate true where * is the wildcard.)


I'm not really sure if I understood you correctly.
If you're looking for a way to match St against something like "B*b"
you can make use of normal regular expressions (e.g. via grep()).

For details begin with ?regexp and ?grep

Then you can try for instance:

grep("B.b", St)

to get
[1] 1 3

(the indices of the vector St)

or directly

St[grep("B.b", St)]
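
For the wildcard question specifically, a short sketch: glob2rx() from the utils package translates a shell-style wildcard into a regular expression, and grepl() returns TRUE/FALSE per element, which is close to the x == "B*b" test that was asked for.

```r
St <- c("Bob", "Alice", "Alice:Bob")

# "." matches exactly one arbitrary character anywhere in the string.
idx <- grep("B.b", St)

# Shell-style wildcard translated to an anchored regular expression.
rx <- utils::glob2rx("B*b")

# TRUE only where the whole string matches the wildcard pattern.
whole_match <- grepl(rx, St)

print(idx)
print(whole_match)
```

Note that the translated pattern is anchored at both ends, so "Alice:Bob" does not match it, whereas the unanchored "B.b" does.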

Cheers,
--Hans



Re: [R] How to replace German umlauts in strings?

2008-04-10 Thread Hans-Jörg Bibiko
On 10.04.2008, at 18:03, Hofert Marius wrote:
 I have a file containing names of German students. These names
 contain the characters ä, ö or ü (German umlauts). I use
 read.table() to read the file and let's assume the table is then
 stored in a variable called data. The names are then contained in
 the first column, i.e. data[,1]. Now if I simply display the variable
 data, I see, that ä is replaced by \x8a, ö is replaced by \x9a
 and so forth. Now, I would like to have these characters replaced by
 their LaTeX (or TeX) equivalents, meaning \x8a should be replaced by
 \"a, \x9a should be replaced by \"o and so forth. I tried a lot,
 especially with gsub(), however, the backslashes cause problems and I
 do not know how to get this to work. The data.frame should then be
 written to a file without destroying the replaced substrings (so that
 indeed \"a appears in the file for \x8a). Is this possible?

 Here is a minimal example:
 data=data.frame(names=c("Bj\x9arn","S\x9aren"),points=c(10,20),
 stringsAsFactors=F)
 data[1,1]=gsub('\\x9a','\\"o',data[1,1]) #does not work! (neither do
 similar calls)

Try this:

gsub('\\x9a','\\"o',m, perl = TRUE, useBytes = TRUE)
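
One thing to watch: in a regular-expression replacement a backslash escapes the character that follows it, so a literal backslash has to be doubled there. A sketch that sidesteps this with fixed = TRUE, where the replacement is taken literally (the sample names are the ones from the question; the variable names are made up):

```r
# Names as raw bytes, as in the question (0x9a stands for the umlaut o there).
nms <- c("Bj\x9arn", "S\x9aren")

# fixed = TRUE: pattern and replacement are literal strings, no regex escaping.
# useBytes = TRUE: match byte-wise, so the current locale does not matter.
# "\\\"o" in R source is the three characters backslash, double quote, o.
tex <- gsub("\x9a", "\\\"o", nms, fixed = TRUE, useBytes = TRUE)

# cat() shows the real content; print() would display the backslash escaped.
cat(tex, sep = "\n")
```

The result can be written out unchanged with writeLines() or write.table(..., quote = FALSE).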

Cheers,

--Hans


Re: [R] for loop help

2008-04-10 Thread Hans-Jörg Bibiko

On 11.04.2008, at 05:38, [EMAIL PROTECTED]  
[EMAIL PROTECTED] wrote:

 ?`break`
 ?`next`



 for(i in 1:13) {

 if(i < 13) next
 print("Hello!\n")
 }
 [1] "Hello!\n"




 I am trying to find a solution in R for the following C++ code that
 allows
 one to skip ahead in the loop:

 for (x = 0; x <= 13; x++){
  x=12;
  cout << "Hello World";
 }

 Note that "Hello World" was printed only twice using this C++ loop. I
 tried to do the same in R:

 for(i in 1:13){
  i=12
  print("Hello World")
 }
 It doesn't work as I expected, i.e., this R loop prints "Hello World"
 13 times.


Maybe to understand this try:
for(i in 1:13){i <- rnorm(1);print(i)}

You cannot change the seq in a for loop as you can in C, Perl, etc.
The i set by for() and the i assigned within the statement are not
linked: at the start of each iteration i is set to the next element of
the seq, whatever you assigned to it before. Thus you can only query i
(see Bill's hint).

see also ?Control; last paragraph in 'Details'
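
If the counter really has to jump ahead, a while loop gives that control. A minimal sketch; the condition i < 12 and the counter names are choices made for this example, not a translation of the C++ code:

```r
# for(): assigning to the loop variable has no effect on the next iteration.
seen <- integer(0)
for (j in 1:3) {
  seen <- c(seen, j)  # record the value for() handed us
  j <- 12             # overwritten again at the start of the next iteration
}

# while(): the counter is an ordinary variable, so jumping ahead works.
i <- 1
hello_count <- 0
while (i <= 13) {
  if (i < 12) i <- 12          # skip ahead once
  print("Hello World")
  hello_count <- hello_count + 1
  i <- i + 1
}
```

Here the for loop still visits 1, 2 and 3, while the while loop prints only for i = 12 and i = 13.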

--Hans



Re: [R] Number of words in a string

2008-04-09 Thread Hans-Jörg Bibiko

On 09.04.2008, at 17:46, Shubha Vishwanath Karanth wrote:
 To put it simple,

 C=c("My Dog", "Its really good", "Beautiful")

 Now,
 SOMEFUNCTION(C) should give: c("My", "Its really", "")

SOMEFUNCTION <- function(x) gsub(" *\\w+$", "", x)

But be aware that this won't work, for instance, for combining diacritics.
If you have this:

C <- c("My Dog", "Its really good", "Beautiful", "Tuli faŝda")

(in "faŝda" above, the ŝ is an s followed by a combining circumflex ^)

it would give

[1] "My"         "Its really" ""           "Tuli faŝ"

Then one should use the strsplit approach.
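
A sketch of that strsplit approach. The function name drop_last_word is made up, and words are assumed to be separated by one or more spaces:

```r
# Hypothetical helper: remove the last space-separated word of each string.
drop_last_word <- function(x) {
  words <- strsplit(x, " +")
  # head(w, -1) drops the last word; a single word leaves an empty string.
  vapply(words, function(w) paste(head(w, -1), collapse = " "), character(1))
}

# "fas\u0302da" contains an s followed by a combining circumflex.
C <- c("My Dog", "Its really good", "Beautiful", "Tuli fas\u0302da")
res <- drop_last_word(C)
print(res)
```

Since the split only looks at spaces, it does not matter which characters a word is made of.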

Cheers,

--Hans


Re: [R] How to create a legend without plot, and to use scientific notation for axes label ?

2008-04-09 Thread Hans-Jörg Bibiko

On 10.04.2008, at 05:18, Ng Stanley wrote:
 Also, is there any way to specify scientific notation for axes label ?

Scientific notation à la 3E-4 is set by default. Or did you mean
engineering notation?
Maybe this could help:
http://wiki.r-project.org/rwiki/doku.php?id=tips:data-strings:formatengineering
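
For illustration only, and not the code from the wiki page: format() can force scientific notation for tick labels, and engineering notation (exponents in multiples of three) can be sketched with a small helper. The name format_engineering and its digits argument are made up:

```r
ticks <- c(1e-4, 2e-4, 3e-4)

# Scientific notation, forced via format().
labels_sci <- format(ticks, scientific = TRUE)

# Hypothetical helper: exponent restricted to multiples of 3.
format_engineering <- function(x, digits = 3) {
  exponent <- ifelse(x == 0, 0, 3 * floor(log10(abs(x)) / 3))
  mantissa <- signif(x / 10^exponent, digits)
  paste0(mantissa, "e", exponent)
}

labels_eng <- format_engineering(c(3e-4, 12345))
print(labels_sci)
print(labels_eng)
```

Such label vectors can be passed to axis(side, at = ..., labels = ...) after drawing the plot with xaxt = "n" or yaxt = "n".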

--Hans



[R] distance matrix as text file - how to import?

2008-04-08 Thread Hans-Jörg Bibiko
Dear all,

I have -hopefully- a tiny problem.

I was sent a text file containing a distance matrix à la:

1
2 3
4 5 6

Now I wanted to import these data into a dist object to, let's say,  
do 'plot(hclust(v))'.

My first naïve approach was to scan the text file in order to get a  
vector v. Then I did:

class(v) <- "dist"
attr(v, "Size") <- 4

But, of course, I got this:

1
2 4
3 5 6

I wonder if there's an elegant way to do it. The only way I know of
is a very, very stony one.

I'd be very grateful for any hint.
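
One idea, sketched under the assumption that the file holds the lower triangle row by row: filling the upper triangle of a matrix column by column visits the cells in exactly that order, so transposing afterwards gives the wanted lower triangle. The vector v stands in for the result of scan():

```r
# Values in the order they appear in the file (row by row); stand-in for scan().
v <- c(1, 2, 3, 4, 5, 6)

# Number of objects n, from length(v) == n * (n - 1) / 2.
n <- (1 + sqrt(1 + 8 * length(v))) / 2

m <- matrix(0, n, n)
m[upper.tri(m)] <- v   # column-wise fill of the upper triangle
d <- as.dist(t(m))     # transpose: the values now sit row-wise in the lower triangle

print(d)
```

The object d can then be passed on, e.g. to hclust().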

Cheers,

--Hans
