Re: [R] Rotation Forest Error Message

2020-08-20 Thread Abby Spurdle
Just re-read your question and realized I misread the error message.
The argument is of zero length.

But the conclusion is the same, either a bug in the package, or a
problem with your input.
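
For illustration, here is one way an "argument is of length zero" error can come out of the check quoted in the error message. This is only a sketch under an unconfirmed assumption: that x is something for which ncol() returns NULL, such as a plain vector rather than a data frame or matrix. The object x below is made up and is not John's data.

```r
# Hypothetical input: a plain vector, so ncol() has nothing to report.
x <- 1:10
ncol(x)                       # NULL

# The documented default, evaluated on that input, has length zero.
K <- round(ncol(x) / 3, 0)
length(K)                     # 0

# The comparison yields logical(0), so if() itself fails before stop() is reached.
msg <- tryCatch(
  if (K >= ncol(x)) stop("K should not be greater than or equal to the number of columns in x"),
  error = function(e) conditionMessage(e)
)
msg
```

Checking class(x), dim(x) and ncol(x) on the real input before the call would show whether this applies.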


On Fri, Aug 21, 2020 at 4:16 PM Abby Spurdle  wrote:
>
> Note that I'm not familiar with this package or the method.
> Also note that you haven't told anyone what function you're using, or
> what your call was.
>
> I'm assuming that you're using the rotationForest() function.
> According to its help page, the default is:
>
> K = round(ncol(x)/3, 0)
>
> There's no reason why the default K value should be higher than the
> number of columns, unless:
> (1) There's a bug with the package; or
> (2) There's a problem with your input.
>
> I note that the package is only version 0.1.3, so a bug is not out of
> the question.
> Also, I'm a little surprised the author didn't use integer division:
>
> K = ncol (x) %/% 3
>
> You could just set K to the above value, and see what happens...
>
>
> On Fri, Aug 21, 2020 at 1:06 PM Sparks, John  wrote:
> >
> > Hi R Helpers,
> >
> > I wanted to try the rotationForest package.
> >
> > I pointed it at my data set and got the error message "Error in if (K >= 
> > ncol(x)) stop("K should not be greater than or equal to the number of 
> > columns in x") :
> >   argument is of length zero'.
> >
> > My dataset has 3688 obs. of  111 variables.
> >
> > Would a quick adjustment to the default value of K resolve this?
> >
> > If anybody with more experience with the package than me has a general 
> > suggestion I would appreciate it.
> >
> > --John Spaarks

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rotation Forest Error Message

2020-08-20 Thread Abby Spurdle
Note that I'm not familiar with this package or the method.
Also note that you haven't told anyone what function you're using, or
what your call was.

I'm assuming that you're using the rotationForest() function.
According to its help page, the default is:

K = round(ncol(x)/3, 0)

There's no reason why the default K value should be higher than the
number of columns, unless:
(1) There's a bug with the package; or
(2) There's a problem with your input.

I note that the package is only version 0.1.3, so a bug is not out of
the question.
Also, I'm a little surprised the author didn't use integer division:

K = ncol (x) %/% 3

You could just set K to the above value, and see what happens...
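
As a small check of the two expressions above (a sketch, not taken from the package source): for the 111 columns mentioned in the question they give the same K, and they only differ when ncol(x)/3 has a fractional part of two thirds. The second value of n is made up.

```r
# For the 111 columns in the question both expressions agree.
n <- 111
round(n / 3, 0)   # 37
n %/% 3           # 37

# They differ when n leaves a remainder of 2 on division by 3 (made-up n).
n <- 5
round(n / 3, 0)   # 2
n %/% 3           # 1
```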


On Fri, Aug 21, 2020 at 1:06 PM Sparks, John  wrote:
>
> Hi R Helpers,
>
> I wanted to try the rotationForest package.
>
> I pointed it at my data set and got the error message "Error in if (K >= 
> ncol(x)) stop("K should not be greater than or equal to the number of columns 
> in x") :
>   argument is of length zero'.
>
> My dataset has 3688 obs. of  111 variables.
>
> Would a quick adjustment to the default value of K resolve this?
>
> If anybody with more experience with the package than me has a general 
> suggestion I would appreciate it.
>
> --John Spaarks


[R] Rotation Forest Error Message

2020-08-20 Thread Sparks, John
Hi R Helpers,

I wanted to try the rotationForest package.

I pointed it at my data set and got the error message:

Error in if (K >= ncol(x)) stop("K should not be greater than or equal to the number of columns in x") :
  argument is of length zero

My dataset has 3688 obs. of  111 variables.

Would a quick adjustment to the default value of K resolve this?

If anybody with more experience with the package than me has a general 
suggestion I would appreciate it.

--John Spaarks




Re: [R] & and |

2020-08-20 Thread Bert Gunter
The single grep regex solutions offered to Ivan's problem were fine, but do
not readily generalize to the conjunction of multiple (>2, say) regex
patterns that can appear anywhere in a string and in any order. However,
note that this can easily be done using the Perl zero width lookahead
construction,  "(?=...)" .
e.g.
> test <- c("xyCz", "xAyCz", "xAyBzC", "xCByAz", "xACyB", "BAyyC", "CBxBAy")

## to search for strings that contain "A", "B", & "C" in any order
> grep("(?=.*A)(?=.*B)(?=.*C)", test, perl = TRUE)
[1] 3 4 5 6 7

Note that this matches on one or multiple instances of the patterns. If one
wants only exactly one instance of each conjunct,  then something like this
should do:

> lookfor <- c("A","B","C")
> notme <- paste0("[^",lookfor,"]*")
> z <- paste0("(?=", notme, lookfor, notme, "$)",collapse = "")
> grep(z, test, perl = TRUE)
[1] 3 4 5 6
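
For comparison, the same conjunction can be written without lookaheads by AND-ing one grepl() result per pattern, which generalizes the two-pattern answer given earlier in the thread. This is only a sketch; contains_all() is a hypothetical helper name, not a base R function.

```r
test <- c("xyCz", "xAyCz", "xAyBzC", "xCByAz", "xACyB", "BAyyC", "CBxBAy")
lookfor <- c("A", "B", "C")

# contains_all() is a hypothetical helper name, not part of base R.
# It ANDs together one grepl() result per pattern.
contains_all <- function(x, patterns) {
  Reduce(`&`, lapply(patterns, grepl, x = x))
}

# The same conjunction built as a single lookahead regex.
lookahead <- paste0("(?=.*", lookfor, ")", collapse = "")

by_and <- which(contains_all(test, lookfor))
by_regex <- grep(lookahead, test, perl = TRUE)
identical(by_and, by_regex)
```

The grepl() version needs no perl = TRUE and each pattern can be any regex; the lookahead version keeps everything in one pattern string, which helps when only a single regex can be passed along.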

Cheers,
Bert




On Wed, Aug 19, 2020 at 11:38 PM Ivan Calandra  wrote:

> Thank you all for all the very helpful answers!
>
> Best,
> Ivan
>
> --
> Dr. Ivan Calandra
> TraCEr, laboratory for Traceology and Controlled Experiments
> MONREPOS Archaeological Research Centre and
> Museum for Human Behavioural Evolution
> Schloss Monrepos
> 56567 Neuwied, Germany
> +49 (0) 2631 9772-243
> https://www.researchgate.net/profile/Ivan_Calandra
>
> On 20/08/2020 3:28, Richard O'Keefe wrote:
> > There are & and | operators in the R language.
> > There is an | operator in regular expressions.
> > There is NOT any & operator in regular expressions.
> > grep("ConfoMap&GuineaPigs", mydata, value=TRUE)
> > looks for elements of mydata containing the literal
> > string 'ConfoMap&GuineaPigs'.
> >
> > > foo <- c("a","b","cab","back")
> > > foo[grepl("a",foo) & grepl("b",foo)]
> > [1] "cab"  "back"
> >
> > grepl returns a TRUE/FALSE vector.
> >
> > On Thu, 20 Aug 2020 at 02:53, Ivan Calandra wrote:
> >
> > Dear useRs,
> >
> > I feel really stupid, but I cannot understand why "&" doesn't work
> > as I
> > expect, while "|" does.
> >
> > I have the following vector:
> > mydata <- c("SSFA-ConfoMap_GuineaPigs_NMPfilled.csv",
> > "SSFA-ConfoMap_Lithics_NMPfilled.csv",
> > "SSFA-ConfoMap_Sheeps_NMPfilled.csv",
> > "SSFA-Toothfrax_GuineaPigs.xlsx",
> > "SSFA-Toothfrax_Lithics.xlsx", "SSFA-Toothfrax_Sheeps.xlsx")
> > and I want to find the values that include both "ConfoMap" and
> > "GuineaPigs".
> >
> > If I do:
> > grep("ConfoMap&GuineaPigs", mydata, value=TRUE)
> > it returns an empty vector, character(0).
> >
> > But if I do:
> > grep("ConfoMap|GuineaPigs", mydata, value=TRUE)
> > it returns all the elements that include either "ConfoMap" or
> > "GuineaPigs", as I would expect.
> >
> > So what is wrong with my "&" construct? How can I return the elements
> > that include both parts?
> >
> > Thank you for your help!
> > Ivan
> >


Re: [R] combine filter() and select()

2020-08-20 Thread Hadley Wickham
On Wed, Aug 19, 2020 at 10:03 AM Ivan Calandra  wrote:
>
> Dear useRs,
>
> I'm new to the tidyverse world and I need some help on basic things.
>
> I have the following tibble:
> mytbl <- structure(list(files = c("a", "b", "c", "d", "e", "f"), prop =
> 1:6), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
>
> I want to subset the rows with "a" in the column "files", and keep only
> that column.
>
> So I did:
> myfile <- mytbl %>%
>   filter(grepl("a", files)) %>%
>   select(files)
>
> It works, but I believe there must be an easier way to combine filter()
> and select(), right?

Not in the tidyverse. As others have mentioned, both [ and subset() in
base R allow you to simultaneously subset rows and columns, but
there's no single verb in the tidyverse that does both. This is
somewhat informed by the observation that in data frames, unlike
matrices, rows and columns are not exchangeable, and you typically
want to express subsetting in rather different ways.
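
A minimal sketch of the base R "[" form mentioned above, using the data from the question rebuilt as a plain data frame so that only base R is needed (that substitution is an assumption made for the example):

```r
# Same data as in the question, built as a plain data frame.
mytbl <- data.frame(files = c("a", "b", "c", "d", "e", "f"), prop = 1:6)

# Rows and columns in one call; drop = FALSE keeps the result a data frame
# instead of collapsing the single column to a vector.
myfile <- mytbl[grepl("a", mytbl$files), "files", drop = FALSE]
myfile
```

With a tibble rather than a plain data frame, "[" does not drop a single column to a vector, so drop = FALSE would not be needed there.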

Hadley

-- 
http://hadley.nz



[R-es] Crossing points of two density curves

2020-08-20 Thread Manuel Mendoza
Good afternoon. I have a bimodal variable (*var*) of presences and
absences (1s and 0s) and another variable, *prob*, with the probabilities
(between 0 and 1) that a model assigns to it.
With: *ggplot(Preds, aes(x=prob, fill= var )) + geom_density(alpha=.3)*
I get the distributions of the presences and of the absences, separately,
as a function of the assigned probability value. The two curves cross
twice. Is there a way to determine the values of *prob* that correspond to
those crossing points?
Thanks,
Manuel
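
One possible approach, sketched with base R only: estimate both densities on a common grid with density() and look for sign changes in their difference. The data below are simulated stand-ins (prob0 and prob1 are made-up names, not columns of Preds), and the crossings found this way may differ slightly from what geom_density() draws, since bandwidth and grid settings can differ.

```r
set.seed(1)
# Simulated stand-in for Preds: model probabilities for absences (0) and presences (1).
prob0 <- rbeta(500, 2, 5)
prob1 <- rbeta(500, 5, 2)

# Evaluate both kernel density estimates on one common grid.
rng <- range(c(prob0, prob1))
d0 <- density(prob0, from = rng[1], to = rng[2], n = 512)
d1 <- density(prob1, from = rng[1], to = rng[2], n = 512)

# A crossing lies between grid points where the sign of the difference changes.
delta <- d1$y - d0$y
idx <- which(diff(sign(delta)) != 0)
crossings <- (d0$x[idx] + d0$x[idx + 1]) / 2
crossings
```

A finer grid (larger n) narrows the interval around each crossing; uniroot() on interpolated curves would be an alternative if more precision is needed.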


___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


[R] [R-pkgs] skedastic: Heteroskedasticity Diagnostics for Linear Regression Models

2020-08-20 Thread Thomas Farrar
Dear All,

Allow me to re-introduce the skedastic package (version 1.0.0) which now
implements more than 20 different heteroskedasticity tests for the linear
regression model, as well as a graphical diagnostic tool and some helper
functions with broader applications (e.g., computing probability
distributions of certain nonparametric statistics, computing two-sided
p-values from asymmetric distributions using three different methods, and
computing cumulative probabilities for ratios of quadratic forms in normal
random vectors).

All of the included heteroskedasticity tests are taken from
statistics/econometrics literature but many of them have never (to my
knowledge) been made available in statistical software until now.

The new version of the package also incorporates unit tests via testthat
and is thus more robust against code breaks.

CRAN page: https://cran.r-project.org/package=skedastic
Github page: https://github.com/tjfarrar/skedastic

Sincerely,
Thomas


___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] select() columns using their positions

2020-08-20 Thread Rui Barradas

Hello,

It is also possible to select by several vectors of indices (as opposed
to a single vector).

top_n() is used below only to keep the display short.


library(dplyr)

data(iris)

iris %>% select(1, 3, 4) %>% top_n(5)
iris %>% select(c(1, 3), 4) %>% top_n(5)


Hope this helps,

Rui Barradas


At 10:05 on 20/08/20, Ivan Calandra wrote:

OK, my bad... I'm sure I had tried it and it didn't work, but I guess
the error was somewhere else...

Thank you!
Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 20/08/2020 11:03, Jeff Newmiller wrote:

Did you try it?

mydata %>%
   select( c( 1, 2, 4 ) )

On August 20, 2020 1:41:13 AM PDT, Ivan Calandra  wrote:

Dear useRs,

I'm still trying to learn tidyverse syntax.

I would like to select() columns based on their positions/indices, but
I
cannot find a way to do that (I've seen a lot about doing that for
rows,
but I could not find anything for columns). I thought it would be
obvious, but I cannot find it.

Basically, I am looking for something like:
mydata %>%
   select( vector_of_indices )
I know that the pipe is useless here, but there are more steps in my
real code.

The helper num_range() works only when headers contains the positions
(e.g. "x1, x2...").

Of course, it's easy using "[", but I expected it would be possible
with
select() as well; it would make the code more readable than:
mydata %>%
   .[ vector_of_indices ]

Thank you for your help.
Ivan




Re: [R] how to prove sum of probabilities to be one in R?

2020-08-20 Thread Duncan Murdoch

On 20/08/2020 3:42 a.m., Shaami wrote:
> Hi Dear
>
> I am facing a floating-point problem related to the sum of probabilities.
> It is really difficult to prove that sum of probabilities is 1 because of
> some minor differences. The MWE is as follows.
>
>> p1=0.
>> p2=0.003
>> p1+p2==1
> [1] FALSE
>
> The sum of probabilities is approximately 1. The difference from 1 is
> 1-p1-p2 =  9.97e-09, that is very small. I need to apply the sum of
> probabilities conditions in my many functions. But execution is halted
> because of floating point.
>
> Could anyone please guide about that?
Use the approximate test

isTRUE(all.equal(p1+p2, 1))

If the default tolerance of sqrt(.Machine$double.eps) (about 1.5e-8) is 
wrong, change it:


isTRUE(all.equal(p1+p2, 1, tolerance = 0.1))

This says anything between 0.9 and 1.1 is equal to 1.
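
A minimal sketch of how that test could be wrapped for reuse across many functions. The helper name sums_to_one() and the example probabilities are made up for illustration.

```r
# sums_to_one() is a hypothetical helper name used only for this sketch.
sums_to_one <- function(p, tol = sqrt(.Machine$double.eps)) {
  isTRUE(all.equal(sum(p), 1, tolerance = tol))
}

# Made-up probabilities whose sum misses 1 by about 1e-9.
p <- c(0.5, 0.5 - 1e-9)
sum(p) == 1        # exact comparison: FALSE
sums_to_one(p)     # tolerant comparison: TRUE

# A guard that halts only when the sum is genuinely off.
stopifnot(sums_to_one(p))
```

Keeping the tolerance as an argument makes it easy to tighten or loosen the check in one place.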

Duncan Murdoch



Re: [R] combine filter() and select()

2020-08-20 Thread Martin Morgan
A kind of hybrid answer is to use base::subset(), which supports non-standard 
evaluation (it searches for unquoted symbols like 'files' in the code line 
below in the object that is its first argument; %>% puts 'mytbl' in that first 
position) and row (filter) and column (select) subsets

> mytbl %>% subset(files %in% "a", files)
# A tibble: 1 x 1
  files
  <chr>
1 a

Or subset(grepl("a", files), files) if that was what you meant.
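
The same call also works without the pipe. A small sketch, assuming a plain data frame in place of the tibble so that only base R is required:

```r
mytbl <- data.frame(files = c("a", "b", "c", "d", "e", "f"), prop = 1:6)

# Row condition and column choice in a single call, without the pipe.
res <- subset(mytbl, grepl("a", files), select = files)

# The result is still a data frame, not a bare vector.
class(res)
dim(res)
```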

One important idea that the tidyverse implements is, in my opinion, 
'endomorphism' -- you get back the same type of object as you put in -- so I 
wouldn't use a base R idiom that returned a vector unless that were somehow 
essential for the next step in the analysis. 

There is value in having separate functions for filter() and select(), and 
probably there are edge cases where filter(), select(), and subset() behave 
differently, but for what it's worth subset() can be used to perform these 
operations individually

> mytbl %>% subset(, files)
# A tibble: 6 x 1
  files
  <chr>
1 a
2 b
3 c
4 d
5 e
6 f
> mytbl %>% subset(grepl("a", files), )
# A tibble: 1 x 2
  files  prop
  <chr> <int>
1 a         1

Martin Morgan

On 8/20/20, 2:48 AM, "R-help on behalf of Ivan Calandra" wrote:

Hi Jeff,

The code you show is exactly what I usually do, in base R; but I wanted
to play with tidyverse to learn it (and also understand when it makes
sense and when it doesn't).

And yes, of course, in the example I gave, I end up with a 1-cell
tibble, which could be better extracted as a length-1 vector. But my
real goal is not to end up with a single value or even a single column.
I just thought that simplifying my example was the best approach to ask
for advice.

But thank you for letting me know that what I'm doing is pointless!

Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 19/08/2020 19:27, Jeff Newmiller wrote:
> The whole point of dplyr primitives is to support data frames... that is,
> lists of columns. When you pare your data frame down to one column you are
> almost certainly using the wrong tool for the job.
>
> So, sure, your code works... and it even does what you wanted in the
> dplyr style, but what a pointless exercise.
>
> grep( "a", mytbl$files, value=TRUE )
>
> On August 19, 2020 7:56:32 AM PDT, Ivan Calandra  wrote:
>> Dear useRs,
>>
>> I'm new to the tidyverse world and I need some help on basic things.
>>
>> I have the following tibble:
>> mytbl <- structure(list(files = c("a", "b", "c", "d", "e", "f"), prop =
>> 1:6), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
>>
>> I want to subset the rows with "a" in the column "files", and keep only
>> that column.
>>
>> So I did:
>> myfile <- mytbl %>%
>>   filter(grepl("a", files)) %>%
>>   select(files)
>>
>> It works, but I believe there must be an easier way to combine filter()
>> and select(), right?
>>
>> Thank you!
>> Ivan



[R] how to prove sum of probabilities to be one in R?

2020-08-20 Thread Shaami
Hi Dear

I am facing a floating-point problem related to the sum of probabilities.
It is really difficult to prove that sum of probabilities is 1 because of
some minor differences. The MWE is as follows.

> p1=0.
> p2=0.003
> p1+p2==1
[1] FALSE

The sum of probabilities is approximately 1. The difference from 1 is
1-p1-p2 =  9.97e-09, that is very small. I need to apply the sum of
probabilities conditions in my many functions. But execution is halted
because of floating point.

Could anyone please guide about that?

Thank you

Regards

Shaami



Re: [R] select() columns using their positions

2020-08-20 Thread Ivan Calandra
OK, my bad... I'm sure I had tried it and it didn't work, but I guess
the error was somewhere else...

Thank you!
Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 20/08/2020 11:03, Jeff Newmiller wrote:
> Did you try it?
>
> mydata %>%
>   select( c( 1, 2, 4 ) )
>
> On August 20, 2020 1:41:13 AM PDT, Ivan Calandra  wrote:
>> Dear useRs,
>>
>> I'm still trying to learn tidyverse syntax.
>>
>> I would like to select() columns based on their positions/indices, but
>> I
>> cannot find a way to do that (I've seen a lot about doing that for
>> rows,
>> but I could not find anything for columns). I thought it would be
>> obvious, but I cannot find it.
>>
>> Basically, I am looking for something like:
>> mydata %>%
>>   select( vector_of_indices )
>> I know that the pipe is useless here, but there are more steps in my
>> real code.
>>
>> The helper num_range() works only when headers contains the positions
>> (e.g. "x1, x2...").
>>
>> Of course, it's easy using "[", but I expected it would be possible
>> with
>> select() as well; it would make the code more readable than:
>> mydata %>%
>>   .[ vector_of_indices ]
>>
>> Thank you for your help.
>> Ivan



Re: [R] select() columns using their positions

2020-08-20 Thread Jeff Newmiller
Did you try it?

mydata %>%
  select( c( 1, 2, 4 ) )

On August 20, 2020 1:41:13 AM PDT, Ivan Calandra  wrote:
>Dear useRs,
>
>I'm still trying to learn tidyverse syntax.
>
>I would like to select() columns based on their positions/indices, but
>I
>cannot find a way to do that (I've seen a lot about doing that for
>rows,
>but I could not find anything for columns). I thought it would be
>obvious, but I cannot find it.
>
>Basically, I am looking for something like:
>mydata %>%
>  select( vector_of_indices )
>I know that the pipe is useless here, but there are more steps in my
>real code.
>
>The helper num_range() works only when headers contains the positions
>(e.g. "x1, x2...").
>
>Of course, it's easy using "[", but I expected it would be possible
>with
>select() as well; it would make the code more readable than:
>mydata %>%
>  .[ vector_of_indices ]
>
>Thank you for your help.
>Ivan

-- 
Sent from my phone. Please excuse my brevity.



[R] select() columns using their positions

2020-08-20 Thread Ivan Calandra
Dear useRs,

I'm still trying to learn tidyverse syntax.

I would like to select() columns based on their positions/indices, but I
cannot find a way to do that (I've seen a lot about doing that for rows,
but I could not find anything for columns). I thought it would be
obvious, but I cannot find it.

Basically, I am looking for something like:
mydata %>%
  select( vector_of_indices )
I know that the pipe is useless here, but there are more steps in my
real code.

The helper num_range() works only when headers contains the positions
(e.g. "x1, x2...").

Of course, it's easy using "[", but I expected it would be possible with
select() as well; it would make the code more readable than:
mydata %>%
  .[ vector_of_indices ]

Thank you for your help.
Ivan

-- 
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra



Re: [R] combine filter() and select()

2020-08-20 Thread Ivan Calandra
Hi Jeff,

The code you show is exactly what I usually do, in base R; but I wanted
to play with tidyverse to learn it (and also understand when it makes
sense and when it doesn't).

And yes, of course, in the example I gave, I end up with a 1-cell
tibble, which could be better extracted as a length-1 vector. But my
real goal is not to end up with a single value or even a single column.
I just thought that simplifying my example was the best approach to ask
for advice.

But thank you for letting me know that what I'm doing is pointless!

Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 19/08/2020 19:27, Jeff Newmiller wrote:
> The whole point of dplyr primitives is to support data frames... that is, 
> lists of columns. When you pare your data frame down to one column you are 
> almost certainly using the wrong tool for the job.
>
> So, sure, your code works... and it even does what you wanted in the dplyr 
> style, but what a pointless exercise.
>
> grep( "a", mytbl$files, value=TRUE )
>
> On August 19, 2020 7:56:32 AM PDT, Ivan Calandra  wrote:
>> Dear useRs,
>>
>> I'm new to the tidyverse world and I need some help on basic things.
>>
>> I have the following tibble:
>> mytbl <- structure(list(files = c("a", "b", "c", "d", "e", "f"), prop =
>> 1:6), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
>>
>> I want to subset the rows with "a" in the column "files", and keep only
>> that column.
>>
>> So I did:
>> myfile <- mytbl %>%
>>   filter(grepl("a", files)) %>%
>>   select(files)
>>
>> It works, but I believe there must be an easier way to combine filter()
>> and select(), right?
>>
>> Thank you!
>> Ivan



Re: [R] combine filter() and select()

2020-08-20 Thread Ivan Calandra
Dear Chris,

I didn't think about having the assignment at the end as you showed; it
indeed fits the pipe workflow better.

By "easy", I actually meant shorter. As you said, in base R, I usually
do that in 1 line, so I was hoping to do the same in tidyverse. But I'm
glad to hear that I'm using tidyverse the proper way :)

Best regards,
Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 19/08/2020 19:21, Chris Evans wrote:
> Inline
>
> - Original Message -
>> From: "Ivan Calandra" 
>> To: "R-help" 
>> Sent: Wednesday, 19 August, 2020 16:56:32
>> Subject: [R] combine filter() and select()
>> Dear useRs,
>>
>> I'm new to the tidyverse world and I need some help on basic things.
>>
>> I have the following tibble:
>> mytbl <- structure(list(files = c("a", "b", "c", "d", "e", "f"), prop =
>> 1:6), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
>>
>> I want to subset the rows with "a" in the column "files", and keep only
>> that column.
>>
>> So I did:
>> myfile <- mytbl %>%
>>   filter(grepl("a", files)) %>%
>>   select(files)
>>
>> It works, but I believe there must be an easier way to combine filter()
>> and select(), right?
> I would write 
>
> mytbl %>%
>   filter(grepl("a", files)) %>%
>   select(files) -> myfile
>
> as I like to keep a sort of "top to bottom and left to right" flow when 
> writing in the tidyverse dialect of R but that's really not important.
>
> Apart from that I think what you've done is "proper tidyverse". To me another
> difference between the dialects is that classical R often seems to value, and
> make easy, doing things with incredibly few characters.  I think that, for
> the people who are brilliant at that sort of coding, and there are many on
> this list, it is also easy to read.  I know that Chinese is easy to read if
> you grew up on it but to a bear of little brain like me, the much more
> verbose style of tidyverse repays typing time with readability when I come
> back to my code and, though I have little experience of this yet, when I
> read other people's code.
>
> What did you think wasn't "easy" about what you wrote?
>
> Very best (all),
>
> Chris
>
>> Thank you!
>> Ivan
>>


Re: [R] & and |

2020-08-20 Thread Ivan Calandra
Thank you all for all the very helpful answers!

Best,
Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 20/08/2020 3:28, Richard O'Keefe wrote:
> There are & and | operators in the R language.
> There is an | operator in regular expressions.
> There is NOT any & operator in regular expressions.
> grep("ConfoMap&GuineaPigs", mydata, value=TRUE)
> looks for elements of mydata containing the literal
> string 'ConfoMap&GuineaPigs'.
>
> > foo <- c("a","b","cab","back")
> > foo[grepl("a",foo) & grepl("b",foo)]
> [1] "cab"  "back"
>
> grepl returns a TRUE/FALSE vector.
>
> On Thu, 20 Aug 2020 at 02:53, Ivan Calandra wrote:
>
> Dear useRs,
>
> I feel really stupid, but I cannot understand why "&" doesn't work
> as I
> expect, while "|" does.
>
> I have the following vector:
> mydata <- c("SSFA-ConfoMap_GuineaPigs_NMPfilled.csv",
> "SSFA-ConfoMap_Lithics_NMPfilled.csv", 
> "SSFA-ConfoMap_Sheeps_NMPfilled.csv",
> "SSFA-Toothfrax_GuineaPigs.xlsx",
> "SSFA-Toothfrax_Lithics.xlsx", "SSFA-Toothfrax_Sheeps.xlsx")
> and I want to find the values that include both "ConfoMap" and
> "GuineaPigs".
>
> If I do:
> grep("ConfoMap&GuineaPigs", mydata, value=TRUE)
> it returns an empty vector, character(0).
>
> But if I do:
> grep("ConfoMap|GuineaPigs", mydata, value=TRUE)
> it returns all the elements that include either "ConfoMap" or
> "GuineaPigs", as I would expect.
>
> So what is wrong with my "&" construct? How can I return the elements
> that include both parts?
>
> Thank you for your help!
> Ivan
>