Hello!
 
Please accept my sincere apologies for annoying the R development team with my 
post this week. If I were required to register as "a developer" before 
submission, this would not have happened. To rehabilitate myself, please find 
at the bottom of this mail two R-functions, 'string2vector' and 
'vector2string', with "comments and tests". Both functions may go a long way 
towards assisting a number of R-users to make their R-programming more 
productive. I am a novice R-programmer: I started dabbling in R less than two 
months ago, heavily influenced by examples of code I see, including within the 
R.org documents (monkey does what monkey sees). Before posting the two functions, I 
would really appreciate constructive edits where they may be needed, as well as 
their posting by someone-in-the-know so that they will be conveniently accessible 
to R users.

I am very impressed with the potential of R and the community supporting it. I just 
wish I had come to R sooner: I am looking to R to better support my work in 
"designed experiments to assess the statistically significant performance of 
combinatorial optimization algorithms on instance isomorphs of NP-hard 
problems" -- for better context of this mouthful, see the few postings under
  http://www.cbl.ncsu.edu:16080/xBed/publications/
I am working on a tutorial paper where I expect R to play a significant role in 
better explaining and illustrating, code-wise and graphically, the concepts 
discussed in the publications above. I would welcome a co-author with 
experience in R-programming as well as statistics and interests in the 
experimental methods addressed in these publications.

As I elaborate in the notes that follow, I was looking at a variety of 
"R-documents" before my "bug" submission. I would appreciate it very much if some 
of you could take the time to scan through these notes and respond briefly with 
useful pointers. Here are the headlines:

    (1) why I still think there may be a bug with 'noquote' vs 'as.integer'

    (2) search on "split string" and "join string"; the missing package "stringr"

    (3) a take on "Tcl" commands 'split', 'join', 'string', 'append', 'foreach'

    (4) a take on "R" functions 'string2vector' and 'vector2string'

    (5) code and comments for "R" functions 'string2vector' and 'vector2string'

(1) why I still think there may be a bug with 'noquote' vs 'as.integer'
--------------------------------------------------------------------------------
> # MacOSX 10.6.2, R 2.9.1 GUI 1.28 Tiger build 32-bit (5444)
> qvector
[1] "0" "0" "0" "1" "1" "0" "1"
> qvector[1]
[1] "0"
> tmp = noquote(qvector[1])
> tmp
[1] 0
> tmp = as.integer(qvector[1])
> tmp
[1] 0
> 
When embedded in the function as per my "bug" report, 'noquote' and 
'as.integer' are no longer equivalent whereas in the example above they appear 
to be equivalent!! I submitted the "function" with print/cat statements for 
the sake of illustration.
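
One way to check whether the two results are really equivalent, rather than 
merely printed the same way, is to compare their classes. The sketch below uses 
only base R; the variable names 'a' and 'b' are illustrative and are not taken 
from the "bug" report. It suggests that 'noquote' changes how a value is 
printed while the value itself stays a character string, which may explain the 
different behavior inside the function.

```r
qvector <- c("0", "0", "0", "1", "1", "0", "1")

a <- noquote(qvector[1])      # prints as 0, but see the class below
b <- as.integer(qvector[1])   # a genuine integer

print(class(a))               # class of the noquote result
print(class(b))               # class of the as.integer result
print(is.character(a))        # is the noquote result still a character string?
print(identical(unclass(a), qvector[1]))  # same underlying value as the input?
print(b + 1L)                 # arithmetic works on the integer
```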

(2) search on "split string" and "join string"; the missing package "stringr"
--------------------------------------------------------------------------------
http://search.r-project.org/ reveals
   on the order of 850 messages for a search on "split string"
   on the order of 160 messages for a search on "join string"

http://finzi.psych.upenn.edu/search.html reveals
    for search on "split string"
        • Rhelp08:   [ split: 890 ] [ string: 1676 ] [ TOTAL: 77 ]
        • functions: [ split: 954 ] [ string: 6453 ] [ TOTAL: 204 ]
    for search on "join string"
        • Rhelp08:   [ join: 176 ] [ string: 1676 ] [ TOTAL: 8 ]
        • functions: [ join: 192 ] [ string: 6453 ] [ TOTAL: 36 ]
    This site also provides a link to the package "stringr"
    http://finzi.psych.upenn.edu/R/library/stringr/html/00Index.html
However, the download does not deliver ...
> install.packages("stringr")
  ....
   package ‘stringr’ is not available

There are a lot of hard-to-understand and not-so-relevant code snippets in all 
these 1000's of postings. I would argue that had robust functions such as 
'string2vector' and 'vector2string' been included in the R-package, many 
R-programmers could take longer vacations, spend their time more productively,
and significantly reduce duplication of coding efforts on basically the same
problems.

Since the vector is such an important "primitive" in R, I argue that functions 
such as 'string2vector' and 'vector2string' should be made to play a role 
similar to commands 'split', 'join', 'string', and 'append' that support 
programmers in Tcl. See my take on Tcl in the section below.

(3) a take on "Tcl" commands 'split', 'join', 'string', 'append', 'foreach'
--------------------------------------------------------------------------------
I have been using Tcl to "wrap" a number of combinatorial solvers and automate 
workflows that implement and execute a number of my experiments on instance 
isomorphs. I even used Tcl to prototype a few combinatorial optimization 
algorithms and to write code for statistical analysis -- a task for 
which I now find R much better suited.

I intend to alert my Tcl colleagues in-the-know about the wonderful 
infrastructure provided in R when it comes to the R-shell (at least under 
MacOSX), and the ability to name and initialize function variable defaults 
explicitly, and the ability to install new packages so transparently. Before 
coming across R, I already took the trouble to create Tcl wrapper programs with 
command lines that feature the same order-independent syntax as that used 
in R. This being said, what I miss in R is a single page that gathers all 
commands, such as
   http://www.tcl.tk/man/tcl8.5/TclCmd/contents.htm
Note that once you click on any of the commands, a number of classes that 
extend each command become visible, including the example section(s). 

Here I illustrate my use of just five Tcl commands that subsequently guided my 
"design" of the functions 'string2vector' and 'vector2string' in "R".

# a few "Tcl" examples before designing the function 'string2vector' in "R"
% set binS "10011"
% join [split $binS ""] ", "
1, 0, 0, 1, 1
%
% set strS "I \t am\tdone" 
% foreach item [split $strS "\t"] {append strSQ \"$item\",}
% set strSQ [string trimright $strSQ ,]
"I "," am","done"
# 
# a few "Tcl" examples before designing the function 'vector2string' in "R"
% set strV "1,0,0,1"
1,0,0,1
% split $strV ","
1 0 0 1
% join [split $strV ","] ":"
1:0:0:1
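
For comparison, the Tcl snippets above map roughly onto the base R functions 
'strsplit' and 'paste'. This is only a sketch: the variable names are 
illustrative, and 'fixed = TRUE' is my assumption that the separator should be 
treated as a literal string rather than a regular expression.

```r
# split $binS ""  followed by  join ... ", "
binS   <- "10011"
bits   <- strsplit(binS, "")[[1]]            # one element per character
joined <- paste(bits, collapse = ", ")
print(joined)

# split $strS "\t", then wrap every item in double quotes, comma-separated
strS   <- "I \t am\tdone"
items  <- strsplit(strS, "\t", fixed = TRUE)[[1]]
quoted <- paste('"', items, '"', sep = "", collapse = ",")
cat(quoted, "\n")

# split $strV ","  followed by  join ... ":"
strV   <- "1,0,0,1"
parts  <- strsplit(strV, ",", fixed = TRUE)[[1]]
colons <- paste(parts, collapse = ":")
print(colons)
```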

(4) a take on "R" functions 'string2vector' and 'vector2string'
--------------------------------------------------------------------------------
> # a few tests of the function 'string2vector' in "R"
> binS = "10011"
> binV = string2vector(binS, SS="", type="int")
> binV[2] ; binV[5]
[1] 0
[1] 1
> strS = "I am done" 
> vecS = string2vector(strS, SS=" ", type="char")
> vecS[1] ; vecS[3]
[1] "I"
[1] "done"
> 
> # a few tests of the function 'vector2string' in "R"
> binV = c(1,0,0,1) 
> vector2string(binV, type="int")
[1] "1001"
> vector2string(binV, SS=" ", type="char")
[1] "1 0 0 1"
> subsV = c("I", "am", "done")  
> vector2string(subsV, SS=":", type="char")
[1] "I:am:done"
> 

(5) code and comments for "R" functions 'string2vector' and 'vector2string'
--------------------------------------------------------------------------------

string2vector = function(string="ch-2 \t sec-7\tex-5", SS="\t", type="char")
#
# This procedure splits a string and assigns substrings to an R-vector.
# The split is controlled by the string separator SS (default value:  SS="\t").
# Here we convert  a binary string into a binary vector:
#   let  binS = "10011"  
#   then binV = string2vector(binS, SS="", type="int")
# Here we convert a string into a vector of substrings:
#   let  strS = "I am done" 
#   then vecS = string2vector(strS, SS=" ", type="char")
#
# LIMITATION: The function interprets all substrings either as of type 
#             "int" or "char".  A function that interprets the type of each
#             substring dynamically may one day be written by an R-guru.
#              
# Franc Brglez, Wed Dec  9 14:19:16 EST 2009
{
    # strsplit returns a list with one element per input string; take the first
    qvector = strsplit(string, SS)[[1]]
    # as.integer is vectorized, so no element-by-element loop is needed;
    # this also copes with an empty split result, where 1:n would misbehave
    if (type == "int") {
        xvector = as.integer(qvector)
    } else {
        xvector = qvector
    }
    return(xvector)
} # string2vector
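
Regarding the LIMITATION noted in the comments above: a possible starting point 
for guessing the type is 'type.convert' from the 'utils' package. The sketch 
below is hedged accordingly: the name 'string2vector_auto' is hypothetical, and 
because an R vector holds a single type, the guess applies to the vector as a 
whole rather than to each substring separately.

```r
# Hypothetical variant of string2vector without the 'type' argument
string2vector_auto <- function(string, SS = "\t") {
    parts <- strsplit(string, SS)[[1]]
    # as.is = TRUE keeps non-numeric input as character instead of factor
    utils::type.convert(parts, as.is = TRUE)
}

binV <- string2vector_auto("10011", SS = "")
vecS <- string2vector_auto("I am done", SS = " ")
print(class(binV))
print(class(vecS))
```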

vector2string = function(vector=c("ch-2", "sec-7", "ex-5"), SS="_", 
type="char") 
#
# This procedure converts values from a vector to a concatenation of substrings 
# separated by user-specified string separator SS (default value:  SS="_").
# Each substring represents a vector component value, either as a numerical 
# value or as an alphanumeric string. 
# Here we convert a binary vector to a binary string representing an integer:
#   let  binV = c(1,0,0,1)  
#   then strS = vector2string(binV, type="int")
# Here we convert a binary vector to string representing a binary sequence:
#   let  binV = c(1,0,0,1)  
#   then seqS = vector2string(binV, SS=" ", type="char")
# Here we convert a vector of substrings to colon-separated string:
#   let subsV = c("I", "am", "done")  
#   then strS = vector2string(subsV, SS=":", type="char")
#
# LIMITATION: The function interprets all substrings in the vector either as of 
#             type "int" or "char".  A function that interprets the type of each
#             substring dynamically may one day be written by an R-guru.
#
# Franc Brglez, Wed Dec  9 15:43:59 EST 2009
{
    # paste with 'collapse' joins all elements in a single call and is safe
    # for vectors of length 0 or 1, where a loop over 1:(n-1) would misbehave
    if (type == "int") {
        # values are concatenated directly; SS is ignored for type "int"
        string = paste(vector, collapse="")
    } else {
        string = paste(vector, collapse=SS)
    }
    return(string)
} # vector2string
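
A general R detail worth keeping in mind when code of this kind walks over a 
vector with an explicit loop: '1:n' counts downward when n is 0, so the loop 
body still runs. The sketch below, with illustrative variable names, contrasts 
it with 'seq_len', which yields an empty sequence in that case.

```r
n <- 0

count_colon <- 0
for (i in 1:n) count_colon <- count_colon + 1        # iterates over 1, then 0

count_seqlen <- 0
for (i in seq_len(n)) count_seqlen <- count_seqlen + 1  # never iterates

print(1:n)
print(seq_len(n))
print(c(count_colon, count_seqlen))
```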

----------------
Dr. Franc Brglez                                        email: brg...@ncsu.edu 
Department of Computer Science, Box 8206     http://sitta.csc.ncsu.edu/~brglez
North Carolina State University                            TEL: (919) 515-9675
Raleigh NC 27695-8206 USA  

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
