Re: [R] making chains from pairs

2013-11-09 Thread David Winsemius

On Nov 8, 2013, at 10:56 AM, Hermann Norpois wrote:

 Hello,
 
 having a data frame like test with pairs of characters, I would like to
 create chains. For instance, from the pairs A/B and B/I you get the vector A
 B I. It is like jumping from one pair to the next related pair. So for my
 example test you should get:
 A B F G H I
 C F I K
 D L M N O P

 second <- with(test, tapply(V2, V1, FUN=function(x) test[test$V2 == x, ] ) )
Warning messages:
1: In test$V2 == x :
  longer object length is not a multiple of shorter object length
2: In test$V2 == x :
  longer object length is not a multiple of shorter object length
3: In test$V2 == x :
  longer object length is not a multiple of shorter object length

 third <- sapply(names(second), function(df) c(df, second[[df]][ , "V2" ]) )
 third
$A
[1] "A" "B" "F" "G" "H"

$B
[1] "B" "F" "I" "F" "I"

$C
[1] "C" "F" "I" "K"

$D
[1] "D" "L" "M" "N"

$L
[1] "L" "O" "P"

 fourth <- sapply(names(third), function(d) unique( c(third[[d]],
   unlist(third[ names(third) %in% third[[d]] ]) ) ) )
 fourth
$A
[1] "A" "B" "F" "G" "H" "I"

$B
[1] "B" "F" "I"

$C
[1] "C" "F" "I" "K"

$D
[1] "D" "L" "M" "N" "O" "P"

$L
[1] "L" "O" "P"
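The chaining idea above translates readily to other languages; here is a hypothetical Python sketch of the same two passes (data hard-coded from the example below; like `fourth`, it follows linked keys one level deep):

```python
# Build chains from pairs: start one chain per left-hand key ("third"),
# then splice in the chains of any member that is itself a key ("fourth").
pairs = [("A", "B"), ("A", "F"), ("A", "G"), ("A", "H"), ("B", "F"),
         ("B", "I"), ("C", "F"), ("C", "I"), ("C", "K"), ("D", "L"),
         ("D", "M"), ("D", "N"), ("L", "O"), ("L", "P")]

chains = {}
for a, b in pairs:
    chains.setdefault(a, [a]).append(b)

expanded = {}
for start, members in chains.items():
    seen = list(members)
    for m in members:
        if m != start and m in chains:
            seen.extend(chains[m])
    # drop duplicates while keeping first occurrences, like unique() in R
    expanded[start] = list(dict.fromkeys(seen))

print(expanded["A"])  # ['A', 'B', 'F', 'G', 'H', 'I']
print(expanded["D"])  # ['D', 'L', 'M', 'N', 'O', 'P']
```

A fully transitive version would loop until no chain grows, but a single pass already reproduces the output above.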



 
 
 test
   V1 V2
 1   A  B
 2   A  F
 3   A  G
 4   A  H
 5   B  F
 6   B  I
 7   C  F
 8   C  I
 9   C  K
 10  D  L
 11  D  M
 12  D  N
 13  L  O
 14  L  P
 
 Thanks
 Hermann
 
 dput(test)
 structure(list(V1 = c("A", "A", "A", "A", "B", "B", "C", "C",
 "C", "D", "D", "D", "L", "L"), V2 = c("B", "F", "G", "H", "F",
 "I", "F", "I", "K", "L", "M", "N", "O", "P")), .Names = c("V1",
 "V2"), row.names = c(NA, -14L), class = "data.frame")
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA



[R] Standard errors in regression models with interactions terms

2013-11-09 Thread rm
In a rather simple regression, I'd like to ask the question, for high trees,
whether it makes a difference (for volume) whether a tree is thick.

If my interpretation is correct, for low trees, i.e. those for which
trees$isHigh == FALSE, the answer is yes.

The problem is how to merge the standard errors. Code follows.

data(trees)
trees$isHigh <- trees$Height > 76
trees$isThick <- trees$Girth > 13
m <- lm(trees$Volume ~ trees$isHigh + trees$isThick +
    trees$isHigh:trees$isThick)
summary(m)

I might be mistaken, but a workaround is to rewrite the model as follows,
which shows that the answer is yes. However, I would very much like to know
how to answer the question with the original model.

data(trees)
trees$isLow <- trees$Height <= 76
trees$isThick <- trees$Girth > 13
m <- lm(trees$Volume ~ trees$isLow + trees$isThick +
    trees$isLow:trees$isThick)
summary(m)
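The "merging" of standard errors the OP asks about is linear algebra on the coefficient covariance matrix (vcov(m) in R): for high trees the thickness effect is b_isThick + b_interaction, with variance Var(b2) + Var(b3) + 2*Cov(b2, b3). A Python sketch of just that arithmetic — the covariance matrix below is made up for illustration, not taken from the trees data:

```python
import numpy as np

def combined_se(cov, idx):
    """Standard error of the sum of the coefficients at positions idx,
    given the coefficient covariance matrix V: sqrt(c' V c) where c is
    a 0/1 contrast vector selecting the summed terms."""
    c = np.zeros(cov.shape[0])
    c[list(idx)] = 1.0
    return float(np.sqrt(c @ cov @ c))

# made-up covariance matrix for (Intercept, isHigh, isThick, isHigh:isThick)
cov = np.array([[4.0, 0.0, 0.0, 0.0],
                [0.0, 9.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 2.0],
                [0.0, 0.0, 2.0, 16.0]])

# SE of beta_isThick + beta_interaction: sqrt(1 + 16 + 2*2) = sqrt(21)
print(combined_se(cov, [2, 3]))
```

In R the same quantity comes out of sqrt(t(c) %*% vcov(m) %*% c), which is why re-parameterizing with isLow gives the answer directly: the summed contrast becomes a single coefficient.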








--
View this message in context: 
http://r.789695.n4.nabble.com/Standard-errors-in-regression-models-with-interactions-terms-tp4680104.html
Sent from the R help mailing list archive at Nabble.com.



[R] `level' definition in `computeContour3d' (misc3d package)

2013-11-09 Thread j. van den hoff
I'd very much appreciate some help here: I'm in the process of clarifying  
whether I can use `computeContour3d' to derive estimates of the surface  
area of a single closed isosurface (and prospectively the enclosed  
volume). getting the surface area from the list of triangles returned by  
`computeContour3d' is straightforward but I've stumbled over the precise  
meaning of `level' here. looking into the package, ultimately the level is  
used in the namespace function `faceType' which reads:


function (v, nx, ny, level, maxvol)
{
    if (level == maxvol)
        p <- v >= level
    else p <- v > level
    v[p] <- 1
    v[!p] <- 0
    v[-nx, -ny] + 2 * v[-1, -ny] + 4 * v[-1, -1] + 8 * v[-nx,
        -1]
}

my question: is the discrimination of the special case `level == maxvol'  
(or rather of everything else) really desirable? I would argue
that always testing for `v >= level' would be better. if I feed data with
discrete values (e.g. integer-valued) defined
on a coarse grid into `computeContour3d' it presently makes a big  
difference whether there is a single data point (e.g.) with a value larger

than `level' or not. consider the 1D example:

data1 <- c(0, 0, 1, 1, 1, 1, 1, 0, 0)
data2 <- c(0, 0, 1, 2, 1, 1, 1, 0, 0)

and level = 1

this defines the isocontour `level = 1' as lying at positions 3 and 7 for data1
but at position 4 in data2. actually I would like (and expect) to get
the same isosurface for `data2' with this `level' setting. in short: the  
meaning/definition of `level' changes depending on whether or not it is  
equal to `maxvol'. this is neither stated in the manpage nor is it
desirable in my view. but maybe I am missing something here. any clarification
would be appreciated.
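The difference is easy to demonstrate without misc3d. A Python reconstruction of just the thresholding step of `faceType' (a sketch of the logic quoted above, not the package's actual code):

```python
def threshold(v, level, maxvol):
    """Mimic the mask computed in misc3d's internal faceType():
    the comparison is inclusive only when `level` equals the data
    maximum (a reconstruction for illustration)."""
    if level == maxvol:
        return [int(x >= level) for x in v]
    return [int(x > level) for x in v]

data1 = [0, 0, 1, 1, 1, 1, 1, 0, 0]
data2 = [0, 0, 1, 2, 1, 1, 1, 0, 0]

# data1: level 1 is the maximum, so the whole plateau of ones is inside;
# data2: a single 2 makes the test strict, and the plateau drops out.
print(threshold(data1, 1, max(data1)))  # [0, 0, 1, 1, 1, 1, 1, 0, 0]
print(threshold(data2, 1, max(data2)))  # [0, 0, 0, 1, 0, 0, 0, 0, 0]
```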


j.



--



[R] S4 vs S3. New Package

2013-11-09 Thread daniel schnaider
Hi,

I am working on a new credit portfolio optimization package. My question is
if it is more recommended to develop in S4 object oriented or S3.

It would be more natural to develop in an object-oriented paradigm, but
there are many concerns regarding S4.

1) Performance of S4 could be an issue, as a setter function actually
replaces the whole object behind the scenes.

2) Documentation. It has been really hard to find examples in S4. Most
books and articles consider straightforward S3 examples.

Thanks,



[R] S4; Setter function is not changing slot value as expected

2013-11-09 Thread daniel schnaider
It is my first time programming with S4 and I can't get the setter function
to actually change the value of the slot created by the constructor.

I guess it has to do with local copy, global copy, etc. of the variable -
but I couldn't find anything relevant in the documentation.

Tried to copy examples from the internet, but they had the same problem.

# The code
setClass("Account",
    representation(
        customer_id = "character",
        transactions = "matrix")
)


Account <- function(id, t) {
    new("Account", customer_id = id, transactions = t)
}


setGeneric("CustomerID<-", function(obj,
    id){standardGeneric("CustomerID<-")})
setReplaceMethod("CustomerID", "Account", function(obj, id){
    obj@customer_id <- id
    obj
})

ac <- Account("12345", matrix(c(1,2,3,4,5,6), ncol=2))
ac
CustomerID <- "54321"
ac

#Output
> ac
An object of class "Account"
Slot "customer_id":
[1] "12345"

Slot "transactions":
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

# CustomerID's value has changed to "54321", but as you can see ac's slot doesn't
> CustomerID <- "54321"
> ac
An object of class "Account"
Slot "customer_id":
[1] "12345"

Slot "transactions":
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6


Help!



Re: [R] `level' definition in `computeContour3d' (misc3d package)

2013-11-09 Thread Duncan Murdoch

On 13-11-09 8:50 AM, j. van den hoff wrote:

I'd very much appreciate some help here: I'm in the process of clarifying
whether I can use `computeContour3d' to derive estimates of the surface
area of a single closed isosurface (and prospectively the enclosed
volume). getting the surface area from the list of triangles returned by
`computeContour3d' is straightforward but I've stumbled over the precise
meaning of `level' here. looking into the package, ultimately the level is
used in the namespace function `faceType' which reads:

function (v, nx, ny, level, maxvol)
{
    if (level == maxvol)
        p <- v >= level
    else p <- v > level
    v[p] <- 1
    v[!p] <- 0
    v[-nx, -ny] + 2 * v[-1, -ny] + 4 * v[-1, -1] + 8 * v[-nx,
        -1]
}

my question: is the discrimination of the special case `level == maxvol'
(or rather of everything else) really desirable? I would argue
that always testing for `v >= level' would be better. if I feed data with
discrete values (e.g. integer-valued) defined
on a coarse grid into `computeContour3d' it presently makes a big
difference whether there is a single data point (e.g.) with a value larger
than `level' or not. consider the 1D example:

data1 <- c(0, 0, 1, 1, 1, 1, 1, 0, 0)
data2 <- c(0, 0, 1, 2, 1, 1, 1, 0, 0)

and level = 1

this defines the isocontour `level = 1' as lying at positions 3 and 7 for data1
but at position 4 in data2. actually I would like (and expect) to get
the same isosurface for `data2' with this `level' setting. in short: the
meaning/definition of `level' changes depending on whether or not it is
equal to `maxvol'. this is neither stated in the manpage nor is this
desirable in my view. but maybe I miss something here. any clarification
would be appreciated.


I don't see why you'd expect the same output from those vectors, but 
since they aren't legal input to computeContour3d, maybe I don't know 
what you mean by them.  Could you put together a reproducible example 
that shows bad contours?


Duncan Murdoch



j.



--



Re: [R] S4; Setter function is not changing slot value as expected

2013-11-09 Thread Simon Zehnder
If you want to set a slot you have to refer to it:

ac@customer_id <- "54321"

or you use your setter (a replacement function is called on the left-hand side):

CustomerID(ac) <- "54321"

What you did was create a new symbol CustomerID referring to the string
"54321":
CustomerID <- "54321"
CustomerID
[1] "54321"

Best

Simon

On 09 Nov 2013, at 15:31, daniel schnaider dschnai...@gmail.com wrote:

 It is my first time programming with S4 and I can't get the setter function
 to actually change the value of the slot created by the constructor.
 
 I guess it has to do with local copy, global copy, etc. of the variable -
 but I couldn't find anything relevant in the documentation.
 
 Tried to copy examples from the internet, but they had the same problem.
 
# The code
setClass("Account",
    representation(
        customer_id = "character",
        transactions = "matrix")
)


Account <- function(id, t) {
    new("Account", customer_id = id, transactions = t)
}


setGeneric("CustomerID<-", function(obj,
    id){standardGeneric("CustomerID<-")})
setReplaceMethod("CustomerID", "Account", function(obj, id){
    obj@customer_id <- id
    obj
})

ac <- Account("12345", matrix(c(1,2,3,4,5,6), ncol=2))
ac
CustomerID <- "54321"
ac
 
#Output
> ac
An object of class "Account"
Slot "customer_id":
[1] "12345"

Slot "transactions":
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

# CustomerID's value has changed to "54321", but as you can see ac's slot doesn't
> CustomerID <- "54321"
> ac
An object of class "Account"
Slot "customer_id":
[1] "12345"

Slot "transactions":
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
 
 
 Help!
 


Re: [R] S4 vs S3. New Package

2013-11-09 Thread Simon Zehnder
This depends very often on the developer and what he is comfortable with. I
like S4 classes, as I come from C++ and S4 classes approximate C++ classes at
least more than S3 classes do (Reference Classes would do so even more, and I
know very good R programmers who like these most).

1) I wrote a package for MCMC simulation with S4 classes carrying all simulated
values - fast enough for me: in less than 1.5 secs I have my sample of 100,000
values together with several other 100,000-value vectors like log-likelihoods,
posterior hyperparameters, etc. I watch out not to copy an object too often,
but sometimes it is unavoidable.

2) That is not true: 

Books:
http://www.amazon.de/Software-Data-Analysis-Programming-Statistics/dp/0387759352/ref=sr_1_1?ie=UTF8qid=1384014486sr=8-1keywords=John+chambers+data
http://www.amazon.de/Programming-Data-Language-John-Chambers/dp/0387985034/ref=sr_1_4?ie=UTF8qid=1384014486sr=8-4keywords=John+chambers+data

Online:
https://www.rmetrics.org/files/Meielisalp2009/Presentations/Chalabi1.pdf
https://www.stat.auckland.ac.nz/S-Workshop/Gentleman/S4Objects.pdf

And for a bunch of packages look into the Bioconductor packages.

Best

Simon

On 09 Nov 2013, at 16:22, daniel schnaider dschnai...@gmail.com wrote:

 Hi,
 
 I am working on a new credit portfolio optimization package. My question is
 if it is more recommended to develop in S4 object oriented or S3.
 
 It would be more natural to develop in an object-oriented paradigm, but
 there are many concerns regarding S4.
 
 1) Performance of S4 could be an issue, as a setter function actually
 replaces the whole object behind the scenes.
 
 2) Documentation. It has been really hard to find examples in S4. Most
 books and articles consider straightforward S3 examples.
 
 Thanks,
 


Re: [R] `level' definition in `computeContour3d' (misc3d package)

2013-11-09 Thread j. van den hoff
On Sat, 09 Nov 2013 17:16:28 +0100, Duncan Murdoch  
murdoch.dun...@gmail.com wrote:



On 13-11-09 8:50 AM, j. van den hoff wrote:
I'd very much appreciate some help here: I'm in the process of  
clarifying

whether I can use `computeContour3d' to derive estimates of the surface
area of a single closed isosurface (and prospectively the enclosed
volume). getting the surface area from the list of triangles returned by
`computeContour3d' is straightforward but I've stumbled over the precise
meaning of `level' here. looking into the package, ultimately the level  
is

used in the namespace function `faceType' which reads:

function (v, nx, ny, level, maxvol)
{
    if (level == maxvol)
        p <- v >= level
    else p <- v > level
    v[p] <- 1
    v[!p] <- 0
    v[-nx, -ny] + 2 * v[-1, -ny] + 4 * v[-1, -1] + 8 * v[-nx,
        -1]
}

my question: is the discrimination of the special case `level == maxvol'
(or rather of everything else) really desirable? I would argue
that always testing for `v >= level' would be better. if I feed data
with

discrete values (e.g. integer-valued) defined
on a coarse grid into `computeContour3d' it presently makes a big
difference whether there is a single data point (e.g.) with a value  
larger

than `level' or not. consider the 1D example:

data1 <- c(0, 0, 1, 1, 1, 1, 1, 0, 0)
data2 <- c(0, 0, 1, 2, 1, 1, 1, 0, 0)

and level = 1

this defines the isocontour `level = 1' as lying at positions 3 and 7 for data1
but at position 4 in data2. actually I would like (and expect) to get

the same isosurface for `data2' with this `level' setting. in short: the
meaning/definition of `level' changes depending on whether or not it is
equal to `maxvol'. this is neither stated in the manpage nor is this
desirable in my view. but maybe I miss something here. any clarification
would be appreciated.


I don't see why you'd expect the same output from those vectors, but  
since they aren't legal input to computeContour3d, maybe I don't know  
what you mean by them.  Could you put together a reproducible example  
that shows bad contours?


it's not bad contours, actually. my question only concerns the different  
meaning

of `level' depending on whether `level == maxvol' or not.

here is a real example:

--8<--
library(misc3d)

dim <- 21
cnt <- (dim+1)/2
wid1 <- 5
wid2 <- 1
rng1 <- (cnt-wid1):(cnt+wid1)
rng2 <- (cnt-wid2):(cnt+wid2)

v <- array(0, rep(dim, 3))

# put an 11x11x11 box of ones at the center
v[rng1, rng1, rng1] <- 1

con1 <- computeContour3d(v, level = 1)
drawScene(makeTriangles(con1))
dum <- readline("CR for next plot")

# put an additional 3x3x3 box of twos at the center
v[rng2, rng2, rng2] <- 2
con2 <- computeContour3d(v, level = 1)
drawScene(makeTriangles(con2))
--8<--

this first puts an 11x11x11 box of Ones at the center of the
zero-initialized array and computes `con1' for `level=1'. in the second step
it puts a further 3x3x3 box of Twos at the center and computes the
`level=1' contour again, which this time does not delineate
the box of Ones but lies somewhere between the two non-zero boxes, since
now the test in `faceType' is for `> level'. this is not immediately
obvious from the plots (no scale) but obvious from looking at `con1' and
`con2': the `con2' isosurface is shrunk by 3 voxels at each
side relative to `con1' (so my initial mail was wrong here: `con2' does
not jump to the next discrete isocontour but rather to
a point about halfway between both plateaus). I also (for my own problem
at hand) computed the total surface area, which is
(not surprisingly...) 600 for `con1' and 64.87 for `con2'. so if one is
interested in such surfaces (I am) this makes a big difference in such
data.


the present behavior is not wrong per se but I would much prefer if the
test were always for `>= level' (so that in the present example the
resulting isosurface would in both cases delineate the box of Ones -- as
is the case when using `level = 1 - 1e-6' instead of `level=1').


I believe the isosurface for a given value of `level' should have an
unambiguous meaning independent of what the data further inside look
like.


is this clearer now?



Duncan Murdoch



j.



--







--



Re: [R] `level' definition in `computeContour3d' (misc3d package)

2013-11-09 Thread Duncan Murdoch

On 13-11-09 11:57 AM, j. van den hoff wrote:

On Sat, 09 Nov 2013 17:16:28 +0100, Duncan Murdoch
murdoch.dun...@gmail.com wrote:


On 13-11-09 8:50 AM, j. van den hoff wrote:

I'd very much appreciate some help here: I'm in the process of
clarifying
whether I can use `computeContour3d' to derive estimates of the surface
area of a single closed isosurface (and prospectively the enclosed
volume). getting the surface area from the list of triangles returned by
`computeContour3d' is straightforward but I've stumbled over the precise
meaning of `level' here. looking into the package, ultimately the level
is
used in the namespace function `faceType' which reads:

function (v, nx, ny, level, maxvol)
{
    if (level == maxvol)
        p <- v >= level
    else p <- v > level
    v[p] <- 1
    v[!p] <- 0
    v[-nx, -ny] + 2 * v[-1, -ny] + 4 * v[-1, -1] + 8 * v[-nx,
        -1]
}

my question: is the discrimination of the special case `level == maxvol'
(or rather of everything else) really desirable? I would argue
that always testing for `v >= level' would be better. if I feed data
with
discrete values (e.g. integer-valued) defined
on a coarse grid into `computeContour3d' it presently makes a big
difference whether there is a single data point (e.g.) with a value
larger
than `level' or not. consider the 1D example:

data1 <- c(0, 0, 1, 1, 1, 1, 1, 0, 0)
data2 <- c(0, 0, 1, 2, 1, 1, 1, 0, 0)

and level = 1

this defines the isocontour `level = 1' as lying at positions 3 and 7 for data1
but at position 4 in data2. actually I would like (and expect) to get
the same isosurface for `data2' with this `level' setting. in short: the
meaning/definition of `level' changes depending on whether or not it is
equal to `maxvol'. this is neither stated in the manpage nor is this
desirable in my view. but maybe I miss something here. any clarification
would be appreciated.


I don't see why you'd expect the same output from those vectors, but
since they aren't legal input to computeContour3d, maybe I don't know
what you mean by them.  Could you put together a reproducible example
that shows bad contours?


it's not bad contours, actually. my question only concerns the different
meaning
of `level' depending on whether `level == maxvol' or not.

here is a real example:

--8<--
library(misc3d)

dim <- 21
cnt <- (dim+1)/2
wid1 <- 5
wid2 <- 1
rng1 <- (cnt-wid1):(cnt+wid1)
rng2 <- (cnt-wid2):(cnt+wid2)

v <- array(0, rep(dim, 3))

# put an 11x11x11 box of ones at the center
v[rng1, rng1, rng1] <- 1

con1 <- computeContour3d(v, level = 1)
drawScene(makeTriangles(con1))
dum <- readline("CR for next plot")

# put an additional 3x3x3 box of twos at the center
v[rng2, rng2, rng2] <- 2
con2 <- computeContour3d(v, level = 1)
drawScene(makeTriangles(con2))
--8<--

this first puts an 11x11x11 box of Ones at the center of the
zero-initialized array and computes `con1' for `level=1'. in the second step
it puts a further 3x3x3 box of Twos at the center and computes the
`level=1' contour again which this time does not delineate
the box of Ones but lies somewhere between the two non-zero boxes since
now the test in `faceType' is for `> level'. this is not immediately
obvious from the plots (no scale) but obvious from looking at `con1' and
`con2': the `con2' isosurface is shrunk by 3 voxels at each
side relative to `con1' (so my initial mail was wrong here: `con2' does
not jump to the next discrete isocontour but rather to
a point about halfway between both plateaus ). I also (for my own problem
at hand) computed the total surface area which is
(not surprisingly...) 600 for `con1' and 64.87 for `con2'. so if one is
interested in such surfaces (I am) this makes a big difference in such
data.

the present behavior is not wrong per se but I would much prefer if the
test were always for `>= level' (so that in the present example the
resulting isosurface would in both cases delineate the box of Ones -- as
is the case when using `level = 1 - 1e-6' instead of `level=1').

I believe the isosurface for a given value of `level' should have an
unambiguous meaning independent of what the data further inside are
looking like.



I think it does, but your data make the determination of its location 
ambiguous.


The definition is the estimated location where the continuous field 
sampled at v crosses level.


You have a field with a discontinuity (or two).  You have whole volumes 
of space where the field is equal to the level.  The marching cubes 
algorithm is designed to detect crossings, not solid regions.


For example, going back to one dimension, if your data looked like your 
original vector


data1 <- c(0, 0, 1, 1, 1, 1, 1, 0, 0)

then it is ambiguous where it crosses 1:  it could be at 3 and 7, or 
there could be multiple crossings in that range.  I believe the 
analogous situation in misc3d would treat this as a crossing at 3 and 7.
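Duncan's crossing-based reading can be sketched in a few lines; a hypothetical Python sign-change detector (0-based indices, not the marching-cubes code itself) makes the ambiguity visible as extra sign changes:

```python
def crossings(v, level):
    """Return 0-based indices i where the sampled field changes its
    relation to `level` between samples i and i+1 — a sketch of the
    crossing-detection idea, not misc3d's implementation."""
    sign = lambda x: (x > level) - (x < level)  # -1 below, 0 on, +1 above
    s = [sign(x) for x in v]
    return [i for i in range(len(v) - 1) if s[i] != s[i + 1]]

data1 = [0, 0, 1, 1, 1, 1, 1, 0, 0]
data2 = [0, 0, 1, 2, 1, 1, 1, 0, 0]

print(crossings(data1, 1))  # [1, 6]: the plateau meets the level at both edges
print(crossings(data2, 1))  # [1, 2, 3, 6]: extra sign changes around the 2
```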


Duncan Murdoch


Re: [R] `level' definition in `computeContour3d' (misc3d package)

2013-11-09 Thread j. van den hoff
On Sat, 09 Nov 2013 18:18:23 +0100, Duncan Murdoch  
murdoch.dun...@gmail.com wrote:



On 13-11-09 11:57 AM, j. van den hoff wrote:

On Sat, 09 Nov 2013 17:16:28 +0100, Duncan Murdoch
murdoch.dun...@gmail.com wrote:


On 13-11-09 8:50 AM, j. van den hoff wrote:

I'd very much appreciate some help here: I'm in the process of
clarifying
whether I can use `computeContour3d' to derive estimates of the  
surface

area of a single closed isosurface (and prospectively the enclosed
volume). getting the surface area from the list of triangles returned  
by
`computeContour3d' is straightforward but I've stumbled over the  
precise
meaning of `level' here. looking into the package, ultimately the  
level

is
used in the namespace function `faceType' which reads:

function (v, nx, ny, level, maxvol)
{
    if (level == maxvol)
        p <- v >= level
    else p <- v > level
    v[p] <- 1
    v[!p] <- 0
    v[-nx, -ny] + 2 * v[-1, -ny] + 4 * v[-1, -1] + 8 * v[-nx,
        -1]
}

my question: is the discrimination of the special case `level ==  
maxvol'

(or rather of everything else) really desirable? I would argue
that always testing for `v >= level' would be better. if I feed data
with
discrete values (e.g. integer-valued) defined
on a coarse grid into `computeContour3d' it presently makes a big
difference whether there is a single data point (e.g.) with a value
larger
than `level' or not. consider the 1D example:

data1 <- c(0, 0, 1, 1, 1, 1, 1, 0, 0)
data2 <- c(0, 0, 1, 2, 1, 1, 1, 0, 0)

and level = 1

this defines the isocontour `level = 1' as lying at positions 3 and 7 for data1
but at position 4 in data2. actually I would like (and expect) to get
the same isosurface for `data2' with this `level' setting. in short:  
the
meaning/definition of `level' changes depending on whether or not it  
is

equal to `maxvol'. this is neither stated in the manpage nor is this
desirable in my view. but maybe I miss something here. any  
clarification

would be appreciated.


I don't see why you'd expect the same output from those vectors, but
since they aren't legal input to computeContour3d, maybe I don't know
what you mean by them.  Could you put together a reproducible example
that shows bad contours?


it's not bad contours, actually. my question only concerns the  
different

meaning
of `level' depending on whether `level == maxvol' or not.

here is a real example:

--8<--
library(misc3d)

dim <- 21
cnt <- (dim+1)/2
wid1 <- 5
wid2 <- 1
rng1 <- (cnt-wid1):(cnt+wid1)
rng2 <- (cnt-wid2):(cnt+wid2)

v <- array(0, rep(dim, 3))

# put an 11x11x11 box of ones at the center
v[rng1, rng1, rng1] <- 1

con1 <- computeContour3d(v, level = 1)
drawScene(makeTriangles(con1))
dum <- readline("CR for next plot")

# put an additional 3x3x3 box of twos at the center
v[rng2, rng2, rng2] <- 2
con2 <- computeContour3d(v, level = 1)
drawScene(makeTriangles(con2))
--8<--

this first puts an 11x11x11 box of Ones at the center of the
zero-initialized array and computes `con1' for `level=1'. in the second step
it puts a further 3x3x3 box of Twos at the center and computes the
`level=1' contour again which this time does not delineate
the box of Ones but lies somewhere between the two non-zero boxes since
now the test in `faceType' is for `> level'. this is not immediately
obvious from the plots (no scale) but obvious from looking at `con1' and
`con2': the `con2' isosurface is shrunk by 3 voxels at each
side relative to `con1' (so my initial mail was wrong here: `con2' does
not jump to the next discrete isocontour but rather to
a point about halfway between both plateaus ). I also (for my own  
problem

at hand) computed the total surface area which is
(not surprisingly...) 600 for `con1' and 64.87 for `con2'. so if one is
interested in such surfaces (I am) this makes a big difference in such
data.

the present behavior is not wrong per se but I would much prefer if the
test were always for `>= level' (so that in the present example the
resulting isosurface would in both cases delineate the box of Ones -- as
is the case when using `level = 1 - 1e-6' instead of `level=1').

I believe the isosurface for a given value of `level' should have an
unambiguous meaning independent of what the data further inside are
looking like.



I think it does, but your data make the determination of its location  
ambiguous.


I was imprecise: what I meant is: the isosurface should not change in my  
example between both cases.




The definition is the estimated location where the continuous field  
sampled at v crosses level.


understood/agreed.



You have a field with a discontinuity (or two).  You have whole volumes  
of space where the field is equal to the level.  The marching cubes  
algorithm is designed to detect crossings, not solid regions.


For example, going back to one dimension, if your data looked like your  
original vector


data1 <- c(0, 0, 1, 1, 1, 1, 1, 0, 0)

then it is ambiguous 

Re: [R] Date handling in R is hard to understand

2013-11-09 Thread Carl Witthoft
I agree w/ lubridate. I also would like to mention that date handling
is amazingly difficult in ALL computer languages, not just R. Take a stroll
through sites like thedailywtf.com to see how quickly people get into
tarpits full of thorns when trying to deal with leap years, weeks vs. month
ends, etc.
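On the "no need to tell the format" wish in the quoted post below: one robust pattern is simply trying each known format in turn. A Python sketch, stdlib only — the format list is an assumption matching the two formats quoted, not an exhaustive parser:

```python
from datetime import datetime, timedelta

FORMATS = ("%m/%d/%Y %I:%M %p", "%m/%d/%Y %H:%M")  # AM/PM first, then 24-hour

def parse_any(text):
    """Try each known timestamp format in turn, analogous to needing a
    separate as.POSIXct(format=...) call per format in R."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            pass
    raise ValueError(f"unrecognized date: {text!r}")

# both formats land on the same instant, and shifting by 3600 s is one line
for s in ("11/08/2013 11:41 PM", "11/08/2013 23:41"):
    print(parse_any(s) + timedelta(seconds=3600))  # both: 2013-11-09 00:41:00
```

The same try-each-format loop is what libraries like lubridate's parse_date_time() do for you, given a vector of candidate orders.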



Bert Gunter wrote
 Have a look at the lubridate package. It claims to try to make
 dealing with dates easier.
 
 
 -- Bert
 
 On Fri, Nov 8, 2013 at 11:41 AM, Alemu Tadesse <alemu.tadesse@...> wrote:
 Dear All,

 I usually work with time series data. The data may come in AM/PM date
 format or on a 24-hour time basis. R cannot recognize the two differences
 automatically - at least for me. I have to specifically tell R which
 time format the data is in. It seems that Pandas knows how to handle dates
 without being told the format. The problem arises when I try to shift time
 by a certain amount. Say adding 3600 to shift it forward; in that case I
 have to use something like:
 Measured_data$Date <- as.POSIXct(as.character(Measured_data$Date),
 tz = "", format = "%m/%d/%Y %I:%M %p") + 3600
 or Measured_data$Date <- as.POSIXct(as.character(Measured_data$Date),
 tz = "", format = "%m/%d/%Y %H:%M") + 3600, depending on the format. The date
 also attaches MDT or MST and so on. When merging two data frames with
 dates of different formats that may create a problem (I think). When I get
 data from Excel it could be in any/random format and I needed to customize
 the date to use in R in one of the above formats. Any TIPS for automatic
 processing with no need to specifically tell the data format?

 Another problem I saw was that when using rbind to bind data frames, if
 one column of one of the data frames is character data (say for example
 "none" - coming from MySQL), R doesn't know how to concatenate a numeric
 column from the other data frame to it. I needed to change the numeric to
 character, and later, after binding takes place, I had to re-convert it to
 numeric. But this causes problems in an automated environment. Any
 suggestion?

 Thanks
 Mihretu

 
 
 
 -- 
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 
 (650) 467-7374
 








Re: [R] C50 Node Assignment

2013-11-09 Thread Carl Witthoft

Just to clarify:  I'm guessing the OP is referring to the CRAN package C50
here.   A quick skim suggests the rules are a list element of a C5.0-class
object, so maybe that's where to start?


David Winsemius wrote
 In my role as a moderator I am attempting to bypass the automatic mail
 filters that are blocking this posting. Please reply to the list and to:
 =
 Kevin Shaney <kevin.shaney@>
 
 C50 Node Assignment
 
 I am using C50 to classify individuals into 5 groups / categories (factor
 variable).  The tree / set of rules has 10 rules for classification.  I am
 trying to extract the RULE for which each individual qualifies (a number
 between 1 and 10), and cannot figure out how to do so.  I can extract the
 predicted group and predicted group probability, but not the RULE to which
 an individual qualifies.  Please let me know if you can help!
 
 Kevin
 =
 
 
 -- 
 David Winsemius
 Alameda, CA, USA
 








Re: [R] S4; Setter function is not changing slot value as expected

2013-11-09 Thread Martin Morgan

On 11/09/2013 06:31 AM, daniel schnaider wrote:

It is my first time programming with S4 and I can't get the setter function
to actually change the value of the slot created by the constructor.

I guess it has to do with local copies, global copies, etc. of the variable -
but I couldn't find anything relevant in the documentation.

Tried to copy examples from the internet, but they had the same problem.

# The code
 setClass("Account",
     representation(
         customer_id = "character",
         transactions = "matrix")
 )

 Account <- function(id, t) {
     new("Account", customer_id = id, transactions = t)
 }

 setGeneric("CustomerID<-", function(obj,
     id) standardGeneric("CustomerID<-"))


Replacement methods (in R in general) require that the final argument (the 
replacement value) be named 'value', so


setGeneric("CustomerID<-",
    function(x, ..., value) standardGeneric("CustomerID<-"))

setReplaceMethod("CustomerID", c("Account", "character"),
    function(x, ..., value)
{
    x@customer_id <- value
    x
})

use this as

   CustomerID(ac) <- "54321"



 setReplaceMethod("CustomerID", "Account", function(obj, id){
     obj@customer_id <- id
     obj
 })

 ac <- Account("12345", matrix(c(1,2,3,4,5,6), ncol=2))
 ac
 CustomerID(ac) <- "54321"
 ac

#Output
  ac
 An object of class "Account"
 Slot "customer_id":
 [1] "12345"

 Slot "transactions":
      [,1] [,2]
 [1,]    1    4
 [2,]    2    5
 [3,]    3    6

# CustomerID's value should have changed to "54321", but as you can see below it doesn't
  CustomerID(ac) <- "54321"



  ac
 An object of class "Account"
 Slot "customer_id":
 [1] "12345"

 Slot "transactions":
      [,1] [,2]
 [1,]    1    4
 [2,]    2    5
 [3,]    3    6


Help!






--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



[R] Tables Package Grouping Factors

2013-11-09 Thread Jeff Newmiller
Visually, the elimination of duplicates in hierarchical tables in the 
tabular function from the tables package is very nice. I would like to do 
the same thing with non-crossed factors, but am perhaps missing some 
conceptual element of how this package is used. The following code 
illustrates my goal (I hope):


library(tables)
sampledf <- data.frame( Sex=rep(c("M","F"),each=6)
   , Name=rep(c("John","Joe","Mark","Alice","Beth","Jane"),each=2)
   , When=rep(c("Before","After"),times=6)
   , Weight=c(180,190,190,180,200,200,140,145,150,140,135,135)
   )
sampledf$SexName <- factor( paste( sampledf$Sex, sampledf$Name ) )

# logically, this is the layout
tabular( Name ~ Heading()* When * Weight * Heading()*identity,
data=sampledf )


# but I want to augment the Name with the Sex, and visually group the
# Sex, like
#   tabular( Sex*Name ~ Heading()*When * Weight * Heading()*identity,
#            data=sampledf )
# would, except that there really is no crossing between sexes.
tabular( SexName ~ Heading()*When * Weight * Heading()*identity,
data=sampledf )

# this repeats the Sex category excessively.


---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k



Re: [R] `level' definition in `computeContour3d' (misc3d package)

2013-11-09 Thread Duncan Murdoch

On 13-11-09 12:53 PM, j. van den hoff wrote:

On Sat, 09 Nov 2013 18:18:23 +0100, Duncan Murdoch
murdoch.dun...@gmail.com wrote:


On 13-11-09 11:57 AM, j. van den hoff wrote:

On Sat, 09 Nov 2013 17:16:28 +0100, Duncan Murdoch
murdoch.dun...@gmail.com wrote:


On 13-11-09 8:50 AM, j. van den hoff wrote:

I'd very much appreciate some help here: I'm in the process of
clarifying
whether I can use `computeContour3d' to derive estimates of the
surface
area of a single closed isosurface (and prospectively the enclosed
volume). getting the surface area from the list of triangles returned
by
`computeContour3d' is straightforward but I've stumbled over the
precise
meaning of `level' here. looking into the package, ultimately the
level
is
used in the namespace function `faceType' which reads:

function (v, nx, ny, level, maxvol)
{
    if (level == maxvol)
        p <- v >= level
    else p <- v > level
    v[p] <- 1
    v[!p] <- 0
    v[-nx, -ny] + 2 * v[-1, -ny] + 4 * v[-1, -1] + 8 * v[-nx,
        -1]
}

my question: is the discrimination of the special case `level ==
maxvol' (or rather of everything else) really desirable? I would argue
that always testing for `v >= level' would be better. if I feed data
with discrete values (e.g. integer-valued) defined
on a coarse grid into `computeContour3d' it presently makes a big
difference whether there is a single data point (e.g.) with a value
larger than `level' or not. consider the 1D example:

data1 <- c(0, 0, 1, 1, 1, 1, 1, 0, 0)
data2 <- c(0, 0, 1, 2, 1, 1, 1, 0, 0)

and level = 1

this defines the isocontour `level = 1' to lie at pos 3 and 7 for
data1 but as lying at pos 4 in data2. actually I would like (and expect)
to get the same isosurface for `data2' with this `level' setting. in
short: the meaning/definition of `level' changes depending on whether or
not it is equal to `maxvol'. this is neither stated in the manpage nor is
it desirable in my view. but maybe I am missing something here. any
clarification would be appreciated.


I don't see why you'd expect the same output from those vectors, but
since they aren't legal input to computeContour3d, maybe I don't know
what you mean by them.  Could you put together a reproducible example
that shows bad contours?


it's not bad contours, actually. my question only concerns the
different
meaning
of `level' depending on whether `level == maxvol' or not.

here is a real example:

--------8<--------
library(misc3d)

dim <- 21
cnt <- (dim+1)/2
wid1 <- 5
wid2 <- 1
rng1 <- (cnt-wid1):(cnt+wid1)
rng2 <- (cnt-wid2):(cnt+wid2)

v <- array(0, rep(dim, 3))

# put 11x11x11 box of ones at center
v[rng1, rng1, rng1] <- 1

con1 <- computeContour3d(v, level = 1)
drawScene(makeTriangles(con1))
dum <- readline("CR for next plot")

# put an additional 3x3x3 box of twos at center
v[rng2, rng2, rng2] <- 2
con2 <- computeContour3d(v, level = 1)
drawScene(makeTriangles(con2))
--------8<--------

this first puts an 11x11x11 box of Ones at the center of the
zero-initialized array and computes `con1' for `level=1'. in the second
step it puts a further, 3x3x3 box of Twos at the center and computes the
`level=1' contour again, which this time does not delineate
the box of Ones but lies somewhere between the two non-zero boxes, since
now the test in `faceType' is for `> level'. this is not immediately
obvious from the plots (no scale) but obvious from looking at `con1' and
`con2': the `con2' isosurface is shrunk by 3 voxels at each
side relative to `con1' (so my initial mail was wrong here: `con2' does
not jump to the next discrete isocontour but rather to
a point about halfway between both plateaus). I also (for my own problem
at hand) computed the total surface area, which is
(not surprisingly...) 600 for `con1' and 64.87 for `con2'. so if one is
interested in such surfaces (I am) this makes a big difference in such
data.

the present behavior is not wrong per se, but I would much prefer if the
test were always for `>= level' (so that in the present example the
resulting isosurface would in both cases delineate the box of Ones -- as
is the case when using `level = 1 - 1e-6' instead of `level=1').
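
the surface-area computation mentioned above can be sketched like this.
one assumption of mine about the layout: `computeContour3d' returns a
three-column matrix of vertex coordinates in which each consecutive group
of three rows forms one triangle; `triangle_mesh_area' is a hypothetical
helper name.

```r
# Sum 0.5 * |(B - A) x (C - A)| over the triangles of the mesh.
triangle_mesh_area <- function(tri) {
  cross <- function(u, v) c(u[2] * v[3] - u[3] * v[2],
                            u[3] * v[1] - u[1] * v[3],
                            u[1] * v[2] - u[2] * v[1])
  starts <- seq(1, nrow(tri), by = 3)
  sum(vapply(starts, function(i) {
    a <- tri[i, ]; b <- tri[i + 1, ]; d <- tri[i + 2, ]
    0.5 * sqrt(sum(cross(b - a, d - a)^2))
  }, numeric(1)))
}

# one unit right triangle in the z = 0 plane: area 0.5
triangle_mesh_area(rbind(c(0, 0, 0), c(1, 0, 0), c(0, 1, 0)))
```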

I believe the isosurface for a given value of `level' should have an
unambiguous meaning independent of what the data further inside are
looking like.



I think it does, but your data make the determination of its location
ambiguous.


I was imprecise: what I meant is: the isosurface should not change in my
example between both cases.



The definition is the estimated location where the continuous field
sampled at v crosses level.


understood/agreed.



You have a field with a discontinuity (or two).  You have whole volumes
of space where the field is equal to the level.  The marching cubes
algorithm is designed to detect crossings, not solid regions.

For example, going back to one dimension, if your data looked like your
original vector

data1 <- c(0, 0, 1, 1, 1, 1, 1, 0, 0)

then it is 

Re: [R] Tables Package Grouping Factors

2013-11-09 Thread Duncan Murdoch

On 13-11-09 1:23 PM, Jeff Newmiller wrote:

Visually, the elimination of duplicates in hierarchical tables in the
tabular function from the tables package is very nice. I would like to do
the same thing with non-crossed factors, but am perhaps missing some
conceptual element of how this package is used. The following code
illustrates my goal (I hope):

library(tables)
sampledf <- data.frame( Sex=rep(c("M","F"),each=6)
  , Name=rep(c("John","Joe","Mark","Alice","Beth","Jane"),each=2)
  , When=rep(c("Before","After"),times=6)
  , Weight=c(180,190,190,180,200,200,140,145,150,140,135,135)
  )
sampledf$SexName <- factor( paste( sampledf$Sex, sampledf$Name ) )

# logically, this is the layout
tabular( Name ~ Heading()* When * Weight * Heading()*identity,
data=sampledf )

# but I want to augment the Name with the Sex, and visually group the
# Sex, like
#   tabular( Sex*Name ~ Heading()*When * Weight * Heading()*identity,
#            data=sampledf )
# would, except that there really is no crossing between sexes.
tabular( SexName ~ Heading()*When * Weight * Heading()*identity,
data=sampledf )
# this repeats the Sex category excessively.


I don't think it's easy to get what you want.  The basic assumption is 
that factors are crossed.


One hack that would get you what you want in this case is to make up a 
new variable representing person within sex (running from 1 to 3), then 
treating the Name as a statistic.  Of course, this won't work if you 
don't have equal numbers of each sex.


A better solution is more cumbersome, and only works in LaTeX (and maybe 
HTML).  Draw two tables, first for the female subset, then for the male 
subset.  Put out the headers only on the first one and the footer only 
on the second, and it will be typeset as one big table.
You'll have to fight with the fact that the factors Sex and Name 
remember their levels whether they are present or not, but it should 
work.  For example,


sampledf$Sex <- as.character(sampledf$Sex)
sampledf$Name <- as.character(sampledf$Name)
females <- subset(sampledf, Sex == "F")
males <- subset(sampledf, Sex == "M")

latex( tabular( Factor(Sex)*Factor(Name) ~ Heading()*When * Weight *
Heading()*identity, data=females),
    options = list(doFooter=FALSE, doEnd=FALSE) )

latex( tabular( Factor(Sex)*Factor(Name) ~ Heading()*When * Weight *
Heading()*identity, data=males),
    options = list(doBegin=FALSE, doHeader=FALSE) )

It would probably make sense to support nested factor notation using 
%in% to make this easier, but currently tables doesn't do that.


Duncan Murdoch



[R] Custom Numeric type in R

2013-11-09 Thread Christofer Bogaso
Hi again,

In R, there are various numeric values like NA, Inf, or simple integers.
However, I want to introduce one custom value, TBD, which R should treat
as numeric, not character.

TBD should have the same properties as Inf, except for this: TBD -
TBD = 0.

In the future I am planning to add a few more properties to TBD.

Can somebody tell me whether this is possible in R, and also give some
pointers on how it can be done?
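
A minimal sketch of the idea (my own illustration, not a full custom
numeric type): represent TBD as an Inf carried in an S3-classed numeric
vector, and special-case subtraction of identical values via the Ops
group generic. `as.tbd` and the class name "tbd" are assumptions.

```r
as.tbd <- function(x) structure(x, class = "tbd")
TBD <- as.tbd(Inf)

# Group generic: apply ordinary numeric semantics, then override the one
# property asked for above (TBD - TBD = 0 instead of Inf - Inf = NaN).
Ops.tbd <- function(e1, e2) {
  res <- get(.Generic)(unclass(e1), unclass(e2))
  if (.Generic == "-" && identical(unclass(e1), unclass(e2))) res <- 0
  as.tbd(res)
}

unclass(TBD - TBD)  # 0
unclass(TBD + 1)    # Inf, same as Inf + 1
```

A production version would also need methods for print(), is.na(), c()
and so on; S4 with setClass()/setMethod() on the Arith group generic
would be the more rigorous route.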

Thanks and regards,




Re: [R] Tables Package Grouping Factors

2013-11-09 Thread Duncan Murdoch

On 13-11-09 1:23 PM, Jeff Newmiller wrote:

Visually, the elimination of duplicates in hierarchical tables in the
tabular function from the tables package is very nice. I would like to do
the same thing with non-crossed factors, but am perhaps missing some
conceptual element of how this package is used. The following code
illustrates my goal (I hope):

library(tables)
sampledf <- data.frame( Sex=rep(c("M","F"),each=6)
  , Name=rep(c("John","Joe","Mark","Alice","Beth","Jane"),each=2)
  , When=rep(c("Before","After"),times=6)
  , Weight=c(180,190,190,180,200,200,140,145,150,140,135,135)
  )
sampledf$SexName <- factor( paste( sampledf$Sex, sampledf$Name ) )

# logically, this is the layout
tabular( Name ~ Heading()* When * Weight * Heading()*identity,
data=sampledf )

# but I want to augment the Name with the Sex, and visually group the
# Sex, like
#   tabular( Sex*Name ~ Heading()*When * Weight * Heading()*identity,
#            data=sampledf )
# would, except that there really is no crossing between sexes.
tabular( SexName ~ Heading()*When * Weight * Heading()*identity,
data=sampledf )
# this repeats the Sex category excessively.


I forgot, there's a simpler way to do this.  Build the full table with 
the junk values, then take a subset:


full <- tabular( Sex*Name ~ Heading()*When * Weight *
Heading()*identity, data=sampledf )

full[c(1:3, 10:12), ]

Figuring out which rows you want to keep can be a little tricky, but
doing something like this might be good:

counts <- tabular( Sex*Name ~ 1, data=sampledf )
full[ as.logical(counts), ]

Duncan Murdoch



Re: [R] S4 vs S3. New Package

2013-11-09 Thread Rolf Turner



For my take on the issue see fortune("strait jacket").

cheers,

Rolf Turner

P. S.  I said that quite some time ago and I have seen nothing
in the intervening years to change my views.

R. T.


On 11/10/13 04:22, daniel schnaider wrote:

Hi,

I am working on a new credit portfolio optimization package. My question is
whether it is more advisable to develop it with S4 objects or S3.

It would be more natural to develop in an object-oriented paradigm, but
there are many concerns regarding S4.

1) Performance of S4 could be an issue, as a setter function actually
changes the whole object behind the scenes.

2) Documentation. It has been really hard to find examples in S4. Most
books and articles consider straightforward S3 examples.




Re: [R] C50 Node Assignment

2013-11-09 Thread Max Kuhn
There is a sub-object called 'rules' that has the output of C5.0 for this model:

 library(C50)
 mod <- C5.0(Species ~ ., data = iris, rules = TRUE)
 cat(mod$rules)
id=See5/C5.0 2.07 GPL Edition 2013-11-09
entries=1
rules=4 default=setosa
conds=1 cover=50 ok=50 lift=2.94231 class=setosa
type=2 att=Petal.Length cut=1.9 result=
conds=3 cover=48 ok=47 lift=2.88 class=versicolor
type=2 att=Petal.Length cut=1.9 result=
type=2 att=Petal.Length cut=4.901 result=
type=2 att=Petal.Width cut=1.7 result=
conds=1 cover=46 ok=45 lift=2.875 class=virginica
type=2 att=Petal.Width cut=1.7 result=
conds=1 cover=46 ok=44 lift=2.8125 class=virginica
type=2 att=Petal.Length cut=4.901 result=

You would either have to parse this or parse the summary results:

 summary(mod)

Call:
C5.0.formula(formula = Species ~ ., data = iris, rules = TRUE)

snip
Rules:

Rule 1: (50, lift 2.9)
    Petal.Length <= 1.9
    ->  class setosa  [0.981]

Rule 2: (48/1, lift 2.9)
    Petal.Length > 1.9
    Petal.Length <= 4.9
    Petal.Width <= 1.7
    ->  class versicolor  [0.960]
snip
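
As a starting point for that parsing, the per-rule headers have a regular
key=value layout. The sketch below is my own: it operates on a string
literal copied from the rules output above rather than a refit model, and
assumes the "conds=... class=..." format is stable across versions.

```r
# Rule headers copied verbatim from the mod$rules output shown above.
rules_txt <- "conds=1 cover=50 ok=50 lift=2.94231 class=setosa
type=2 att=Petal.Length cut=1.9
conds=3 cover=48 ok=47 lift=2.88 class=versicolor
type=2 att=Petal.Width cut=1.7
conds=1 cover=46 ok=45 lift=2.875 class=virginica
conds=1 cover=46 ok=44 lift=2.8125 class=virginica"

lines   <- strsplit(rules_txt, "\n")[[1]]
headers <- grep("^conds=", lines, value = TRUE)        # one line per rule
classes <- sub(".*class=", "", headers)                 # predicted class per rule
covers  <- as.integer(sub(".*cover=(\\d+).*", "\\1", headers))
classes  # "setosa" "versicolor" "virginica" "virginica"
covers   # 50 48 46 46
```

Matching an individual observation to the rule it satisfies would then
require evaluating each rule's att/cut conditions against the data row.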

Max

On Sat, Nov 9, 2013 at 1:11 PM, Carl Witthoft c...@witthoft.com wrote:

 Just to clarify:  I'm guessing the OP is referring to the CRAN package C50
 here.   A quick skim suggests the rules are a list element of a C5.0-class
 object, so maybe that's where to start?


 David Winsemius wrote
 In my role as a moderator I am attempting to bypass the automatic mail
 filters that are blocking this posting. Please reply to the list and to:
 =
 Kevin Shaney <kevin.shaney@>

 C50 Node Assignment

 I am using C50 to classify individuals into 5 groups / categories (factor
 variable).  The tree / set of rules has 10 rules for classification.  I am
 trying to extract the RULE for which each individual qualifies (a number
 between 1 and 10), and cannot figure out how to do so.  I can extract the
 predicted group and predicted group probability, but not the RULE to which
 an individual qualifies.  Please let me know if you can help!

 Kevin
 =


 --
 David Winsemius
 Alameda, CA, USA










-- 

Max



Re: [R] S4 vs S3. New Package

2013-11-09 Thread Martin Morgan

On 11/09/2013 11:59 AM, Rolf Turner wrote:



For my take on the issue see fortune("strait jacket").

 cheers,

 Rolf Turner

P. S.  I said that quite some time ago and I have seen nothing
in the intervening years to change my views.


Mileage varies; the Bioconductor project attains a level of interoperability and 
re-use (http://www.nature.com/nbt/journal/v31/n10/full/nbt.2721.html) that would 
be difficult with a less formal class system.




 R. T.


On 11/10/13 04:22, daniel schnaider wrote:

Hi,

I am working on a new credit portfolio optimization package. My question is
whether it is more advisable to develop it with S4 objects or S3.

It would be more natural to develop in an object-oriented paradigm, but
there are many concerns regarding S4.

1) Performance of S4 could be an issue, as a setter function actually
changes the whole object behind the scenes.


Depending on implementation, updating S3 objects could as easily trigger copies; 
this is a fact of life in R. Mitigate by modelling objects in a vector 
(column)-oriented approach rather than the row-oriented paradigm of Java / C++ / 
etc.


Martin Morgan


2) Documentation. It has been really hard to find examples in S4. Most
books and articles consider straightforward S3 examples.





--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



Re: [R] Tables Package Grouping Factors

2013-11-09 Thread Jeff Newmiller
The problem that prompted this question involved manufacturers and their model
numbers, so I think the "cross everything and throw away most of it" approach
will get out of hand quickly. The number of models per manufacturer definitely
varies. I think I will work on the "print segments of the table successively"
approach. Thanks for the ideas.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Duncan Murdoch murdoch.dun...@gmail.com wrote:
On 13-11-09 1:23 PM, Jeff Newmiller wrote:
 Visually, the elimination of duplicates in hierarchical tables in the
 tabular function from the tables package is very nice. I would like
to do
 the same thing with non-crossed factors, but am perhaps missing some
 conceptual element of how this package is used. The following code
 illustrates my goal (I hope):

 library(tables)
 sampledf <- data.frame( Sex=rep(c("M","F"),each=6)
  , Name=rep(c("John","Joe","Mark","Alice","Beth","Jane"),each=2)
  , When=rep(c("Before","After"),times=6)
  , Weight=c(180,190,190,180,200,200,140,145,150,140,135,135)
  )
 sampledf$SexName <- factor( paste( sampledf$Sex, sampledf$Name ) )

 # logically, this is the layout
 tabular( Name ~ Heading()* When * Weight * Heading()*identity,
 data=sampledf )

 # but I want to augment the Name with the Sex, and visually group the
 # Sex, like
 #   tabular( Sex*Name ~ Heading()*When * Weight * Heading()*identity,
 #            data=sampledf )
 # would, except that there really is no crossing between sexes.
 tabular( SexName ~ Heading()*When * Weight * Heading()*identity,
 data=sampledf )
 # this repeats the Sex category excessively.

I forgot, there's a simpler way to do this.  Build the full table with 
the junk values, then take a subset:

full <- tabular( Sex*Name ~ Heading()*When * Weight *
Heading()*identity, data=sampledf )

full[c(1:3, 10:12), ]

Figuring out which rows you want to keep can be a little tricky, but
doing something like this might be good:

counts <- tabular( Sex*Name ~ 1, data=sampledf )
full[ as.logical(counts), ]

Duncan Murdoch



Re: [R] S4 vs S3. New Package

2013-11-09 Thread daniel schnaider
Thank you!


On Sat, Nov 9, 2013 at 2:30 PM, Simon Zehnder szehn...@uni-bonn.de wrote:

 This depends very often on the developer and what he is comfortable
 with. I like S4 classes, as I come from C++ and S4 classes approximate C++
 classes at least more than S3 classes do (Reference Classes would do so
 even more and I know very good R programmers liking these most).

 1) I wrote a package for MCMC simulation with S4 classes carrying all
 simulated values - fast enough for me: in less than 1.5 secs I have my
 sample of 100.000 values together with several other 100T values like
 log-likelihoods, posterior hyper parameters, etc. I watch out for not
 copying too often an object but sometimes it is not avoidable.

 2) That is not true:

 Books:

 http://www.amazon.de/Software-Data-Analysis-Programming-Statistics/dp/0387759352/ref=sr_1_1?ie=UTF8&qid=1384014486&sr=8-1&keywords=John+chambers+data

 http://www.amazon.de/Programming-Data-Language-John-Chambers/dp/0387985034/ref=sr_1_4?ie=UTF8&qid=1384014486&sr=8-4&keywords=John+chambers+data

 Online:
 https://www.rmetrics.org/files/Meielisalp2009/Presentations/Chalabi1.pdf
 https://www.stat.auckland.ac.nz/S-Workshop/Gentleman/S4Objects.pdf

 And for a bunch of packages look into the Bioconductor packages.

 Best

 Simon

 On 09 Nov 2013, at 16:22, daniel schnaider dschnai...@gmail.com wrote:

  Hi,
 
  I am working on a new credit portfolio optimization package. My question
 is
  whether it is more advisable to develop it with S4 objects or S3.
 
  It would be more natural to develop in an object-oriented paradigm, but
  there are many concerns regarding S4.
 
  1) Performance of S4 could be an issue, as a setter function actually
  changes the whole object behind the scenes.
 
  2) Documentation. It has been really hard to find examples in S4. Most
  books and articles consider straightforward S3 examples.
 
  Thanks,
 
 




-- 
Daniel Schnaider

SP Phone:  +55-11-9.7575.0822


d...@scaigroup.com
skype dschnaider
Linked In: http://www.linkedin.com/in/danielschnaider

www.scaigroup.com

Depoimentos de clientes http://www.scaigroup.com/Projetos/depoimentos

Casos de Sucesso  Referências http://www.scaigroup.com/Projetos

SCAI Group no Facebook http://facebook.scaigroup.com/

SCAI Group no Twitter http://twitter.scaigroup.com/

SCAI Group no Google Plus http://plus.scaigroup.com/




[R] Problem while using searchTwitter function in TwitteR package

2013-11-09 Thread Ameek Singh
Hi,

I am mining data from Twitter using the twitteR package, but I am facing a
problem mining statuses. A lot of the time the search does not return the
desired number of statuses but only some smaller number. Also, this
number keeps varying for exactly the same arguments (no, it's not that
more statuses were put up during that time); it even decreases sometimes.
Is there some logic to how this works?
This is the warning I am getting: In doRppAPICall("search/tweets", n,
params = params, retryOnRateLimit = retryOnRateLimit, :
  3000 tweets were requested but the API can only return 299

Thanks for the help in advance,
Ameek Singh




[R] Using Unicode inside R's expression() command

2013-11-09 Thread Sverre Stausland
I'm using 'expression()' in R plots in order to get italicized text.
But it appears as if I cannot use Unicode symbols outside of ASCII
inside 'expression'. Is there some way I can work around this?
My goal is to get the 'fi' ligature in various labels in my R barplots
(together with italicized text). I think I could work around this if
there were another way of italicizing selected characters in a string
than using 'expression()'.

I'm using R for Windows version 3.0.2.

CairoPDF(file = "Ligature1.pdf")
plot.new()
text(x = .5, y = .5, labels = "ﬁ", family = "Times New Roman")
dev.off()

CairoPDF(file = "Ligature2.pdf")
plot.new()
text(x = .5, y = .5, labels = expression(paste(italic(m), "u", "ﬁ",
    italic(m), sep = "")), family = "Times New Roman")
dev.off()
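
One thing worth trying (a sketch; whether the glyph actually renders
depends on the device and font, and cairo-based devices generally handle
non-ASCII text better than the base pdf() device on Windows) is to embed
the ligature by its Unicode escape inside plotmath built with bquote():

```r
# Build a plotmath call mixing italic pieces with a literal string that
# contains the ligature U+FB01, written as a Unicode escape.
lab <- bquote(italic(m) * "u\ufb01" * italic(m))
is.call(lab)  # TRUE: a plotmath call usable as the 'labels' argument of text()

# e.g. inside a device/font that supports the glyph:
# CairoPDF("Ligature3.pdf"); plot.new()
# text(x = .5, y = .5, labels = lab, family = "Times New Roman")
# dev.off()
```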
attachment: lig1.png
attachment: lig2.png


[R] union of list objects if objects intersect

2013-11-09 Thread Hermann Norpois
Hello,

I have a list called ja and I wish to unify list objects if there is some
overlap.
For instance something like

if (length(intersect(ja[[1]], ja[[2]])) != 0) { union(ja[[1]], ja[[2]]) }

but of course it should work cumulatively (for larger data sets).

Could you please give me a hint.

Thanks
Hermann

 ja
$A
[1] "A" "B" "F" "G" "H"

$B
[1] "B" "F" "I"

$C
[1] "C" "F" "I" "K"

$D
[1] "D" "L" "M" "N"

$L
[1] "L" "O" "P"


dput (ja)
structure(list(A = c("A", "B", "F", "G", "H"), B = c("B", "F",
"I"), C = c("C", "F", "I", "K"), D = c("D", "L", "M", "N"), L = c("L",
"O", "P")), .Names = c("A", "B", "C", "D", "L"))
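
One cumulative approach (a sketch of my own; `merge_overlapping` is a
hypothetical name, and I assume merging should repeat until no two sets
share any element) keeps unifying overlapping pairs until a full pass
makes no change:

```r
merge_overlapping <- function(sets) {
  repeat {
    merged <- FALSE
    i <- 1
    while (i < length(sets)) {
      j <- i + 1
      while (j <= length(sets)) {
        if (length(intersect(sets[[i]], sets[[j]])) > 0) {
          # absorb set j into set i and drop it, then re-check from here
          sets[[i]] <- union(sets[[i]], sets[[j]])
          sets[[j]] <- NULL
          merged <- TRUE
        } else j <- j + 1
      }
      i <- i + 1
    }
    if (!merged) break  # a full pass with no merges: done
  }
  sets
}

ja <- list(A = c("A","B","F","G","H"), B = c("B","F","I"),
           C = c("C","F","I","K"), D = c("D","L","M","N"),
           L = c("L","O","P"))
merge_overlapping(ja)
# two groups remain: {A B F G H I C K} (linked via F and I) and {D L M N O P}
```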




Re: [R] select .txt from .txt in a directory

2013-11-09 Thread arun


Hi,
Try:
library(stringr)
# Created the selected files (98) in a separate working  folder 
(SubsetFiles1) (refer to my previous mail)
filelst - list.files()
#Sublst - filelst[1:2]
res <- lapply(filelst, function(x) {con <- file(x)
     Lines1 <- readLines(con); close(con)
     Lines2 <- Lines1[-1]
     Lines3 <- str_split(Lines2, "-.9M")
     Lines4 <- str_trim(unlist(lapply(Lines3, function(x) {x[x == ""] <- NA
     paste(x, collapse = " ")})))
     Lines5 <- gsub("(\\d+)[A-Za-z]", "\\1", Lines4)
     res1 <- read.table(text = Lines5, sep = "", header = FALSE, fill = TRUE)
     res1})

## Created another folder "Modified" to store the res files
lapply(seq_along(res), function(i)
write.table(res[[i]], paste("/home/arunksa111/Zl/Modified", paste0("Mod_", filelst[i]), sep = "/"), row.names = FALSE, quote = FALSE))

 lstf1 - list.files(path=/home/arunksa111/Zl/Modified)  

lst1 - lapply(lstf1,function(x) 
readLines(paste(/home/arunksa111/Zl/Modified,x,sep=/)))
 which(lapply(lst1,function(x) length(grep(\\d+-.9,x)))0 )
 #[1]  7 11 14 15 30 32 39 40 42 45 46 53 60 65 66 68 69 70 73 74 75 78 80 82 83
#[26] 86 87 90 91 93

lst2 <- lapply(lst1, function(x) gsub("(\\d+)(-.9)", "\\1 \\2", x))
#lapply(lst2, function(x) x[grep("\\d+-.9", x)]) ## checking for the pattern

lst3 <- lapply(lst2, function(x) {x <- gsub("(-.9)(-.9)", "\\1 \\2", x)})
#lapply(lst3, function(x) x[grep("\\d+-.9", x)])  ## checking for the pattern
#lapply(lst3, function(x) x[grep("-.9", x)]) ### second check
lst4 <- lapply(lst3, function(x) gsub("(Day) (\\d+)", "\\1_\\2", x[-1]))
# removed the additional header V1, V2, etc.

#sapply(lst4, function(x) length(strsplit(x[1], " ")[[1]])) # checking the number of columns that should be present
lst5 <- lapply(lst4, function(x) unlist(lapply(x, function(y) word(y, 1, 33))))
lst6 <- lapply(lst5, function(x)
read.table(text = x, header = TRUE, stringsAsFactors = FALSE, sep = "", fill = TRUE))
# head(lst6[[94]],3)
lst7 <- lapply(lst6, function(x) x[x$Year >= 1961 & x$Year <= 2005, ])
#head(lst7[[45]], 3)
lst8 <- lapply(lst7, function(x) x[!is.na(x$Year), ])


lst9 <- lapply(lst8, function(x) {
    if((min(x$Year) > 1961) | (max(x$Year) < 2005)){
  n1 <- (min(x$Year) - 1961) * 12
  x1 <- as.data.frame(matrix(NA, ncol = ncol(x), nrow = n1))
  n2 <- (2005 - max(x$Year)) * 12
  x2 <- as.data.frame(matrix(NA, ncol = ncol(x), nrow = n2))
   colnames(x1) <- colnames(x)
   colnames(x2) <- colnames(x)
  x3 <- rbind(x1, x, x2)
    }
   else if((min(x$Year) == 1961) & (max(x$Year) == 2005)) {
      if((min(x$Mo[x$Year == 1961]) > 1) | (max(x$Mo[x$Year == 2005]) < 12)){
       n1 <- min(x$Mo[x$Year == 1961]) - 1
       x1 <- as.data.frame(matrix(NA, ncol = ncol(x), nrow = n1))
       n2 <- 12 - max(x$Mo[x$Year == 2005])
       x2 <- as.data.frame(matrix(NA, ncol = ncol(x), nrow = n2))
       colnames(x1) <- colnames(x)
       colnames(x2) <- colnames(x)
       x3 <- rbind(x1, x, x2)
      }
        else {
        x
      }
    } })

which(sapply(lst9,nrow)!=540)
#[1] 45 46 54 64 65 66 70 75 97
lst10 <- lapply(lst9, function(x) {x1 <- x[!is.na(x$Year), ]
             hx1 <- head(x1, 1)
             tx1 <- tail(x1, 1)
             x2 <- as.data.frame(matrix(NA, ncol = ncol(x), nrow = hx1$Mo - 1))
             x3 <- as.data.frame(matrix(NA, ncol = ncol(x), nrow = 12 - tx1$Mo))
             colnames(x2) <- colnames(x)
             colnames(x3) <- colnames(x)
             if(nrow(x) < 540) rbind(x2, x, x3) else x })
which(sapply(lst10,nrow)!=540)
#integer(0)



lst11 <- lapply(lst10, function(x)
data.frame(col1 = unlist(data.frame(t(x)[-c(1:2), ]), use.names = FALSE)))
  lst12 <- lapply(seq_along(lst10), function(i){
    x <- lst11[[i]]
    colnames(x) <- lstf1[i]
    row.names(x) <- 1:nrow(x)
    x
  })
res2 <- do.call(cbind, lst11)
 dim(res2)
#[1] 16740    98
 
res2[res2 == -.9] <- NA # change missing value identifier as in your data set
which(res2 == -.9)
#integer(0)

dates1 <- seq.Date(as.Date('1Jan1961', format = "%d%b%Y"), as.Date('31Dec2005', format = "%d%b%Y"), by = "day")
dates2 <- as.character(dates1)
sldat <- split(dates2, list(gsub("-.*", "", dates2)))
lst12 <- lapply(sldat, function(x) lapply(split(x, gsub(".*-(.*)-.*", "\\1", x)),
function(y){x1 <- as.numeric(gsub(".*-.*-(.*)", "\\1", y)); if((31 - max(x1)) > 0)
{x2 <- seq(max(x1) + 1, 31, 1); x3 <- paste0(unique(gsub("(.*-.*-).*", "\\1", y)), x2); c(y, x3)}
 else y} ))
any(sapply(lst12,function(x) any(lapply(x,length)!=31)))
#[1] FALSE

lst22 <- lapply(lst12, function(x) unlist(x, use.names = FALSE))
sapply(lst22, length)
dates3 <- unlist(lst22, use.names = FALSE)
length(dates3)
res3 <- data.frame(dates = dates3, res2, stringsAsFactors = FALSE)
str(res3)
res3$dates <- as.Date(res3$dates)
res4 <- res3[!is.na(res3$dates), ]
res4[1:3,1:3]
dim(res4)
 #[1] 16436    99


A.K.




On Friday, November 8, 2013 5:54 PM, Zilefac Elvis zilefacel...@yahoo.com 
wrote:

Hi Ak,

I think I figured out how to do the subsetting. All I needed was to use column
3 in Temperature_inventory and select the matching .txt files in the .zip file. The
final result would be a subset of files whose IDs are in column 3 of
temp_inventory.

I also have this script which you developed

Re: [R] select .txt from .txt in a directory

2013-11-09 Thread arun
HI,

The code could be shortened by using ?merge or plyr's join().
library(plyr)
##Using the output from `lst6`


lst7 <- lapply(lst6, function(x) {x1 <-
data.frame(Year = rep(1961:2005, each = 12), Mo = rep(1:12, 45)); x2
<- join(x1, x, type = "left", by = c("Year", "Mo"))})
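The same left join can be done in base R with merge(); a small self-contained sketch with made-up sample data (`dat` is hypothetical, standing in for one element of lst6):

```r
# Pad a partial monthly record out to the full 1961-2005 monthly grid
# (45 years x 12 months = 540 rows) with a left join; absent months become NA.
grid <- data.frame(Year = rep(1961:2005, each = 12), Mo = rep(1:12, 45))
dat  <- data.frame(Year = c(1961, 1961), Mo = c(1, 2), Value = c(10, 20))
out  <- merge(grid, dat, by = c("Year", "Mo"), all.x = TRUE)
out  <- out[order(out$Year, out$Mo), ]  # merge() does not preserve row order
nrow(out)
# [1] 540
```

Unlike plyr::join(type="left"), merge() reorders rows, hence the explicit order() step afterwards.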

##rest are the same (only change in object names)

sapply(lst7, nrow)
lst8 <- lapply(lst7, function(x)
data.frame(col1 = unlist(data.frame(t(x)[-c(1:2), ]), use.names = FALSE)))
  lst9 <- lapply(seq_along(lst8), function(i){
    x <- lst8[[i]]
    colnames(x) <- lstf1[i]
    row.names(x) <- 1:nrow(x)
    x
  })
sapply(lst9, nrow)
res2New <- do.call(cbind, lst9)
dim(res2New)
#[1] 16740    98
res2New[res2New == -.9] <- NA # change missing value identifier as in your data set
which(res2New == -.9)
#integer(0)

dates1 <- seq.Date(as.Date('1Jan1961', format = "%d%b%Y"), as.Date('31Dec2005', format = "%d%b%Y"), by = "day")
dates2 <- as.character(dates1)
sldat <- split(dates2, list(gsub("-.*", "", dates2)))
lst12 <- lapply(sldat, function(x) lapply(split(x, gsub(".*-(.*)-.*", "\\1", x)),
function(y){x1 <- as.numeric(gsub(".*-.*-(.*)", "\\1", y)); if((31 - max(x1)) > 0)
{x2 <- seq(max(x1) + 1, 31, 1); x3 <- paste0(unique(gsub("(.*-.*-).*", "\\1", y)), x2); c(y, x3)}
 else y} ))
any(sapply(lst12,function(x) any(lapply(x,length)!=31)))
#[1] FALSE

lst22 <- lapply(lst12, function(x) unlist(x, use.names = FALSE))
sapply(lst22, length)
dates3 <- unlist(lst22, use.names = FALSE)
length(dates3)
res3New <- data.frame(dates = dates3, res2New, stringsAsFactors = FALSE)
str(res3New)
res3New$dates <- as.Date(res3New$dates)
res4New <- res3New[!is.na(res3New$dates), ]
res4New[1:3, 1:3]
dim(res4New)
colnames(res4) <- colnames(res4New)
identical(res4, res4New)
#[1] TRUE

A.K.





On Saturday, November 9, 2013 5:46 PM, arun smartpink...@yahoo.com wrote:


[quoted text of the earlier script omitted; it repeats the code above verbatim]

Re: [R] Tables Package Grouping Factors

2013-11-09 Thread Duncan Murdoch

On 13-11-09 5:56 PM, Jeff Newmiller wrote:

The problem that prompted this question involved manufacturers and their model
numbers, so I think the "cross everything and throw away most of it" approach will get out
of hand quickly. The number of models per manufacturer definitely varies. I
think I will work on the "print segments of the table successively" approach.
Thanks for the ideas.


I've just added cbind() and rbind() methods for tabular objects, so that 
approach will be a lot easier.  Just do the table of the first subset, 
then rbind on the subsets for the rest.  Will commit to R-forge after a 
bit more testing and documentation.


Duncan Murdoch


---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
   Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.

Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 13-11-09 1:23 PM, Jeff Newmiller wrote:

Visually, the elimination of duplicates in hierarchical tables in the
tabular function from the tables package is very nice. I would like

to do

the same thing with non-crossed factors, but am perhaps missing some
conceptual element of how this package is used. The following code
illustrates my goal (I hope):

library(tables)
sampledf <- data.frame( Sex = rep(c("M", "F"), each = 6)
  , Name = rep(c("John", "Joe", "Mark", "Alice", "Beth", "Jane"), each = 2)
  , When = rep(c("Before", "After"), times = 6)
  , Weight = c(180, 190, 190, 180, 200, 200, 140, 145, 150, 140, 135, 135)
  )
sampledf$SexName <- factor( paste( sampledf$Sex, sampledf$Name ) )

# logically, this is the layout
tabular( Name ~ Heading()* When * Weight * Heading()*identity,
data=sampledf )

# but I want to augment the Name with the Sex but visually group the
# Sex like
#   tabular( Sex*Name ~ Heading()*When * Weight * Heading()*identity,
data=sampledf )
# would except that there really is no crossing between sexes.
tabular( SexName ~ Heading()*When * Weight * Heading()*identity,
data=sampledf )
# this repeats the Sex category excessively.


I forgot, there's a simpler way to do this.  Build the full table with
the junk values, then take a subset:

full <- tabular( Sex*Name ~ Heading()*When * Weight *
Heading()*identity, data=sampledf )

full[c(1:3, 10:12), ]

Figuring out which rows you want to keep can be a little tricky, but
doing something like this might be good:

counts <- tabular( Sex*Name ~ 1, data=sampledf )
full[ as.logical(counts), ]

Duncan Murdoch






Re: [R] S4; Setter function is not changing slot value as expected

2013-11-09 Thread Hadley Wickham
Modelling a mutable entity, i.e. an account, is really a perfect
example of when to use reference classes.  You might find the examples
on http://adv-r.had.co.nz/OO-essentials.html give you a better feel
for the strengths and weaknesses of R's different OO systems.

Hadley
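Two things are worth spelling out here. First, with the S4 code quoted below, the replacement function has to be called as `CustomerID(ac) <- "54321"`; plain `CustomerID <- "54321"` merely creates an ordinary variable named CustomerID and never touches the object. Second, here is a minimal reference-class (R5) sketch of the same account, which has genuinely mutable state; the class and method names mirror the S4 example, and `setCustomerID` is an illustrative name:

```r
# Reference classes carry mutable fields; methods modify the object in place.
Account <- setRefClass("Account",
  fields = list(
    customer_id  = "character",
    transactions = "matrix"
  ),
  methods = list(
    setCustomerID = function(id) {
      customer_id <<- id  # <<- assigns to the field, not a local copy
    }
  )
)

ac <- Account$new(customer_id  = "12345",
                  transactions = matrix(c(1, 2, 3, 4, 5, 6), ncol = 2))
ac$setCustomerID("54321")
ac$customer_id
# [1] "54321"
```

Note the contrast with S4: an S4 replacement method returns a modified copy that the `fun(obj) <- value` syntax reassigns, whereas a reference-class method mutates the one shared object.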

On Sat, Nov 9, 2013 at 9:31 AM, daniel schnaider dschnai...@gmail.com wrote:
 It is my first time programming with S4 and I can't get the setter function
 to actually change the value of the slot created by the constructor.

 I guess it has to do with local copy, global copy, etc. of the variable -
 but, I couldn't find anything relevant in the documentation.

 Tried to copy examples from the internet, but they had the same problem.

 # The code
 setClass("Account",
representation(
customer_id = "character",
transactions = "matrix")
 )


 Account <- function(id, t) {
 new("Account", customer_id = id, transactions = t)
 }


 setGeneric("CustomerID<-", function(obj,
 id){standardGeneric("CustomerID<-")})
 setReplaceMethod("CustomerID", "Account", function(obj, id){
 obj@customer_id <- id
 obj
 })

 ac <- Account("12345", matrix(c(1,2,3,4,5,6), ncol=2))
 ac
 CustomerID <- "54321"
 ac

 #Output
 > ac
 An object of class "Account"
 Slot "customer_id":
 [1] "12345"

 Slot "transactions":
      [,1] [,2]
 [1,]    1    4
 [2,]    2    5
 [3,]    3    6

 # CustomerID's value has changed to "54321", but as you can see the slot doesn't
 > CustomerID <- "54321"
 > ac
 An object of class "Account"
 Slot "customer_id":
 [1] "12345"

 Slot "transactions":
      [,1] [,2]
 [1,]    1    4
 [2,]    2    5
 [3,]    3    6


 Help!




-- 
Chief Scientist, RStudio
http://had.co.nz/
