from:"Bogdan Tanasa"

Re: [R] identify the distribution of the data

2023-09-27 Thread Bogdan Tanasa

Dear all,

Thank you for your insights, suggestions and for sharing your knowledge. I
have found the package fitdistrplus to meet our needs.

Warm regards,

Bogdan
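
A minimal sketch of how fitdistrplus can be used for this kind of question is shown below. The data are simulated (an assumption, since the thread does not include the real values), and the three candidate families are only examples; a low AIC suggests a better fit but does not prove the data follow that distribution.

```r
# Sketch only: simulated data stand in for one numeric column of a data frame.
library(fitdistrplus)

set.seed(1)
x <- rlnorm(200, meanlog = 3, sdlog = 0.5)

# Skewness-kurtosis summary of the sample; graph = FALSE skips the plot.
descdist(x, discrete = FALSE, graph = FALSE)

# Fit a few candidate families by maximum likelihood.
fits <- list(
  norm  = fitdist(x, "norm"),
  lnorm = fitdist(x, "lnorm"),
  exp   = fitdist(x, "exp")
)

# Compare the candidates by AIC (lower is better).
aic <- sapply(fits, function(f) f$aic)
best <- names(which.min(aic))
print(sort(aic))
```

For count data, descdist(x, discrete = TRUE) and fitdist(x, "pois") would be the analogous calls.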

On Wed, Feb 8, 2023 at 11:10 PM PIKAL Petr  wrote:

> Hi
>
> Others gave you more fundamental answers. To check the possible
> distribution, you could use the package
>
> https://cran.r-project.org/web/packages/fitdistrplus/index.html
>
> Cheers
> Petr
>
> > -----Original Message-----
> > From: R-help  On Behalf Of Bogdan Tanasa
> > Sent: Wednesday, February 8, 2023 5:35 PM
> > To: r-help 
> > Subject: [R] identify the distribution of the data
> >
> > Dear all,
> >
> > I have data frames with numerical values such as 1, 9, 20, 51, 100, etc.
> >
> > Which approach do you recommend for identifying the type of
> > distribution of the data (normal, Poisson, Bernoulli, exponential,
> > log-normal, etc.)?
> >
> > Thanks so much,
> >
> > Bogdan
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>


[R] identify the distribution of the data

2023-02-08 Thread Bogdan Tanasa

Dear all,

I have data frames with numerical values such as 1, 9, 20, 51, 100, etc.

Which approach do you recommend for identifying the type of
distribution of the data (normal, Poisson, Bernoulli, exponential,
log-normal, etc.)?

Thanks so much,

Bogdan


Re: [R] overlaying two graphs / plots /lines

2023-02-07 Thread Bogdan Tanasa

Thanks a lot Rui and Jim. Works great !

On Tue, Feb 7, 2023, 1:34 PM Rui Barradas  wrote:

> At 21:18 on 07/02/2023, Jim Lemon wrote:
> > Hi Bogdan,
> > Try this:
> >
> > A<-data.frame(x=c(1,7,9,20),
> >   y=c(39,91,100,3))
> > B<-data.frame(x=c(10,21,67,99,200),
> >   y=c(9,89,1000,90,1001)) # one value omitted to equalize the rows
> > xrange<-range(c(unlist(A$x),unlist(B$x)))
> > yrange<-range(c(unlist(A$y),unlist(B$y)))
> > plot(A,type="l",xlim=xrange,ylim=yrange,col="red")
> > lines(B,lty=2,col="blue")
> > legend(150,400,c("A","B"),lty=1:2,col=c("red","blue"))
> >
> > There are other tricks to deal with the differences in range between A and B.
> >
> > Jim
> >
> > On Wed, Feb 8, 2023 at 7:57 AM Bogdan Tanasa  wrote:
> >>
> >>   Dear all,
> >>
> >> Any suggestions on how I could overlay two or more graphs / plots / lines
> >> that have different sizes and whose x axes have different breakpoints?
> >>
> >> One dataframe is : A :
> >>
> >> on x axis : 1 , 7, 9, 20, etc ... (100 elements)
> >> on y axis : 39, 91, 100, 3, etc ... (100 elements)
> >>
> >>
> >> The other dataframe is : B :
> >>
> >> on x axis : 10, 21, 67, 99, 200 etc .. (200 elements).
> >> on y axis :  9, 0, 89, 1000, 90, 1001. ... (200 elements).
> >>
> >> Thanks a lot,
> >>
> >> Bogdan
> >>
> >
> Hello,
>
> Here is a ggplot way.
> I'll use the same data.
>
> On each data.frame, create an id column, saying which df it is.
>
>
> A<-data.frame(x=c(1,7,9,20),
>               y=c(39,91,100,3))
> B<-data.frame(x=c(10,21,67,99,200),
>               y=c(9,89,1000,90,1001)) # one value omitted to equalize the rows
>
> suppressPackageStartupMessages({
>   library(dplyr)
>   library(ggplot2)
> })
>
> bind_rows(
>   A %>% mutate(id = "A"),
>   B %>% mutate(id = "B")
> )
> #>     x    y id
> #> 1   1   39  A
> #> 2   7   91  A
> #> 3   9  100  A
> #> 4  20    3  A
> #> 5  10    9  B
> #> 6  21   89  B
> #> 7  67 1000  B
> #> 8  99   90  B
> #> 9 200 1001  B
>
>
> Doing this in a pipe doesn't change the original data.
> Then pipe the result to ggplot, separating the lines by mapping id to
> color. ggplot will automatically take care of the axis ranges.
>
>
> bind_rows(
>   A %>% mutate(id = "A"),
>   B %>% mutate(id = "B")
> ) %>%
>   ggplot(aes(x, y, colour = id)) +
>   geom_line() +
>   theme_bw()
>
>
> Hope this helps,
>
> Rui Barradas
>
>


[R] overlaying two graphs / plots /lines

2023-02-07 Thread Bogdan Tanasa

Dear all,

Any suggestions on how I could overlay two or more graphs / plots / lines
that have different sizes and whose x axes have different breakpoints?

One dataframe is : A :

on x axis : 1 , 7, 9, 20, etc ... (100 elements)
on y axis : 39, 91, 100, 3, etc ... (100 elements)


The other dataframe is : B :

on x axis : 10, 21, 67, 99, 200 etc .. (200 elements).
on y axis :  9, 0, 89, 1000, 90, 1001. ... (200 elements).

Thanks a lot,

Bogdan


Re: [R] confidence intervals

2022-09-26 Thread Bogdan Tanasa

Hi everyone,

The conversation below is already old :) I just wanted to add that I received
the explanations regarding the confidence intervals that are computed in
the article(s) that we briefly discussed.
The answers have been posted on Cross Validated (stats.stackexchange.com):
https://stats.stackexchange.com/questions/587641/confidence-intervals-of-a-biological-assay/587715#587715

Thanks again for your time and help,

Bogdan



On Fri, Sep 9, 2022 at 8:44 AM Ebert,Timothy Aaron  wrote:

> Not to worry, the second article did not have equations either. It had a
> couple bits of code and a github link to icechip (page 3295).
>
> I got through the paywall using a link through the University of Florida
> library system.
>
>
>
> Maybe the linked papers include citations that describe the analysis in
> enough detail, but that effort exceeds my time limit for this activity.
>
>
>
> Tim
>
>
>
> *From:* David Winsemius 
> *Sent:* Thursday, September 8, 2022 8:51 PM
> *To:* Bogdan Tanasa 
> *Cc:* Ebert,Timothy Aaron ; r-help 
> *Subject:* Re: [R] confidence intervals
>
>
>
> *[External Email]*
>
> The first article had no code and did not describe a formula that I could
> find which matched your code. The second article is behind a paywall.
>
>
>
> —
>
> David.
>
> Sent from my iPhone
>
>
>
> On Sep 3, 2022, at 3:39 PM, Bogdan Tanasa  wrote:
>
> 
>
> Dear Aaron, David, and everyone,
>
>
>
> Thank you again for your comments on my question related to the confidence
> intervals. I am sorry for the late reply.
>
>
>
> The definition of the 95% confidence intervals from which our discussion
> originates was proposed by the authors of these two articles (I
> am including the links to the articles just to show that the formula was
> published in a methods article a while ago; the articles are in the
> field of biology, though, which I guess not many of you work in).
> These authors wrote the scripts and made them available
> on GitHub. A while ago I asked the authors why they chose this
> formula; however, I have not received any reply. In any case, for the
> moment I will use the mathematical formulas described in the articles:
>
>
>
> https://www.cell.com/molecular-cell/fulltext/S1097-2765(15)00304-4
>
>
>
> https://www.nature.com/articles/s41596-019-0218-7
>
>
>
> Wishing everyone a good weekend,
>
>
>
> Bogdan
>
>
>
>
>
>
>
> On Sun, Aug 28, 2022 at 6:53 PM Ebert,Timothy Aaron 
> wrote:
>
> I have a general dislike of "analysis emergencies." I would like to see a
> data emergency wherein someone must cram 3 years of data collection into 18
> months so that they have time to work out the correct analysis. I am sure
> others would suggest working out how to analyze the data before starting the
> experiment.
>
> Our business office gives this advice to faculty members: An emergency on
> your part is not an emergency on our part.
>
> How about starting by answering the questions posted by the people you are
> hoping will help. Focus on David's middle paragraph. However, if you can
> re-code everything to work, then it would seem that you already know the
> answer and it might be simpler/faster to write the correct code.
>
> You might spend some time looking for a scientific paper that uses that
> equation for the confidence interval and thereby get some context to
> explain why the equation is correct.
>
> Tim
>
> -----Original Message-----
> From: R-help  On Behalf Of Bogdan Tanasa
> Sent: Sunday, August 28, 2022 8:55 PM
> To: David Winsemius 
> Cc: r-help 
> Subject: Re: [R] confidence intervals
>
> [External Email]
>
> Hi David,
>
> Thank you for your comments, and feed-back message. I am very happy to
> learn from the experience of the people on R mailing list, and without any
> doubt, I am very

Re: [R] confidence intervals

2022-09-03 Thread Bogdan Tanasa

Dear Aaron, David, and everyone,

Thank you again for your comments on my question related to the confidence
intervals. I am sorry for the late reply.

The definition of the 95% confidence intervals from which our discussion
originates was proposed by the authors of these two articles (I
am including the links to the articles just to show that the formula was
published in a methods article a while ago; the articles are in the
field of biology, though, which I guess not many of you work in).
These authors wrote the scripts and made them available
on GitHub. A while ago I asked the authors why they chose this
formula; however, I have not received any reply. In any case, for the
moment I will use the mathematical formulas described in the articles:

https://www.cell.com/molecular-cell/fulltext/S1097-2765(15)00304-4

https://www.nature.com/articles/s41596-019-0218-7

Wishing everyone a good weekend,

Bogdan



On Sun, Aug 28, 2022 at 6:53 PM Ebert,Timothy Aaron  wrote:

> I have a general dislike of "analysis emergencies." I would like to see a
> data emergency wherein someone must cram 3 years of data collection into 18
> months so that they have time to work out the correct analysis. I am sure
> others would suggest working out how to analyze the data before starting the
> experiment.
>
> Our business office gives this advice to faculty members: An emergency on
> your part is not an emergency on our part.
>
> How about starting by answering the questions posted by the people you are
> hoping will help. Focus on David's middle paragraph. However, if you can
> re-code everything to work, then it would seem that you already know the
> answer and it might be simpler/faster to write the correct code.
>
> You might spend some time looking for a scientific paper that uses that
> equation for the confidence interval and thereby get some context to
> explain why the equation is correct.
>
> Tim
>
> -----Original Message-----
> From: R-help  On Behalf Of Bogdan Tanasa
> Sent: Sunday, August 28, 2022 8:55 PM
> To: David Winsemius 
> Cc: r-help 
> Subject: Re: [R] confidence intervals
>
> [External Email]
>
> Hi David,
>
> Thank you for your comments, and feed-back message. I am very happy to
> learn from the experience of the people on R mailing list, and without any
> doubt, I am very thankful to you and to everyone for sharing their
> knowledge. I do apologize for any confusion that I have created unwillingly
> with my previous email.
>
> About my previous email related to the confidence intervals: indeed I have
> posted the question with a detailed description on stackoverflow, and the
> link is listed below.
>
> I have to admit that I was in a rush, hoping to have the suggestions
> of R-help members by Monday (if that had been possible), as I have
> to make a decision at the beginning of this week on whether I need to
> re-code the shell script in R. I have a deadline on Wed. The script itself
> is less important per se; I included it just to point out the origin
> of my question.
>
> I certainly respect the principles of the online R-help community, and I
> would very much appreciate your advice on the following:
> should an "R code related emergency" arise, would it be acceptable to post
> the question on stackoverflow with the corresponding data tables and
> detailed code, and to refer to the posting on the R-help mailing list?
>
> If it is acceptable at least for a single email, and if you do not mind, I
> could mention the link to stackoverflow, inviting our members to read it,
> should they be comfortable with this topic.
>
>
> https://stackoverflow.com/questions/73507697/confidence-intervals-of-a-biological-assay?noredirect=1#comment129816241_73507697
>
> Thanks a lot, have a good week!
>
> ~ Bogdan
>
> https://stackoverflow.com/questions/73507697/confidence-intervals-of-a-biological-assay?noredirect=1#comment129816241_73507697
>
> On Sat, Aug 27, 2022, 6:52 PM David Winsemius 
>

Re: [R] confidence intervals

2022-08-28 Thread Bogdan Tanasa

Hi David,

Thank you for your comments, and feed-back message. I am very happy to
learn from the experience of the people on R mailing list, and without any
doubt, I am very thankful to you and to everyone for sharing their
knowledge. I do apologize for any confusion that I have created unwillingly
with my previous email.

About my previous email related to the confidence intervals: indeed I have
posted the question with a detailed description on stackoverflow, and the
link is listed below.

I have to admit that I was in a rush, hoping to have the suggestions of
R-help members by Monday (if that had been possible), as I have to
make a decision at the beginning of this week on whether I need to re-code
the shell script in R. I have a deadline on Wed. The script itself is less
important per se; I included it just to point out the origin of my
question.

I certainly respect the principles of the online R-help community, and I
would very much appreciate your advice on the following:
should an "R code related emergency" arise, would it be acceptable to post
the question on stackoverflow with the corresponding data tables and
detailed code, and to refer to the posting on the R-help mailing list?

If it is acceptable at least for a single email, and if you do not mind, I
could mention the link to stackoverflow, inviting our members to read it,
should they be comfortable with this topic.

https://stackoverflow.com/questions/73507697/confidence-intervals-of-a-biological-assay?noredirect=1#comment129816241_73507697

Thanks a lot, have  a good week !

~ Bogdan

https://stackoverflow.com/questions/73507697/confidence-intervals-of-a-biological-assay?noredirect=1#comment129816241_73507697

On Sat, Aug 27, 2022, 6:52 PM David Winsemius 
wrote:

> You cross-posted this to StackOverflow and did not say so ... and you
> posted in HTML. Bad dog squared. I cast one of the close votes on SO, but
> here I can only say ... READ the Posting Guide.
>
> You also give no citation other than someone's GitHub files with minimal
> comments in that material. You should indicate whether this code has any
> solid support. Why do you think this code is something to depend upon?
>
> After all, you have been posting questions on R-help for several months.
> Don't you think you should make a good faith effort to understand the
> principles underlying this resource?
>
>
> --
>
> David.
>
> On 8/26/22 17:55, Bogdan Tanasa wrote:
> > Dear all,
> >
> > Although I know that this is not a statistics mailing list, given my work on
> > ICeChIP
> >
> > https://github.com/shah-rohan/icechip/blob/master/Scripts/computeHMDandError
> >
> > I would appreciate an answer to a question:
> >
> > given two variables a and b (a and b can have 1000 paired values) and a
> > calibration number "cal",
> >
> > why has the 95% confidence interval been calculated as follows for each value
> > a(i) and b(i):
> >
> > 100 / cal * sqrt( a / (b^2) + (a^2) / (b^3) ) * 1.96
> >
> > Thank you,
> >
> > Bogdan
> >
>
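
The expression asked about in this thread, 100 / cal * sqrt(a/b^2 + a^2/b^3) * 1.96, can be written as a vectorized R function, as sketched below. The function name, the example counts and the value of cal are hypothetical. One possible reading, which the thread does not confirm, is a first-order (delta method) approximation: if a and b are independent Poisson counts, Var(a/b) is approximately a/b^2 + a^2/b^3, and 1.96 is the normal quantile for a 95% interval.

```r
# Half-width of the interval for 100 * (a / b) / cal, as written in the thread.
# a, b: paired numeric vectors of counts; cal: calibration number.
ci_halfwidth <- function(a, b, cal, z = 1.96) {
  100 / cal * sqrt(a / b^2 + a^2 / b^3) * z
}

# Hypothetical example values, for illustration only.
a   <- c(10, 50, 200)
b   <- c(100, 100, 400)
cal <- 2

estimate  <- 100 / cal * a / b
halfwidth <- ci_halfwidth(a, b, cal)

print(data.frame(estimate,
                 lower = estimate - halfwidth,
                 upper = estimate + halfwidth))
```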


[R] hdf5 files

2022-04-30 Thread Bogdan Tanasa

Dear all,

Is there a way to read hdf5 files in R without using the hdf5r library?

Thanks,

Bogdan


Re: [R] installing an R package

2022-04-28 Thread Bogdan Tanasa

Thank you all for your time and suggestions.

The issue was that install.packages() did not work because of the large file
size,

but in the end I solved it by using /; options(timeout=1
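
As a sketch of the approach described above: R's download timeout defaults to 60 seconds, which can be too short for a package of several hundred megabytes. The value 600 below is a hypothetical choice (the value used in this message is cut off in the archive), and the installation call is shown but not run.

```r
# Raise the download timeout for this session; options() returns the old value.
old <- options(timeout = 600)   # 600 seconds is an assumed, illustrative value
current_timeout <- getOption("timeout")
print(current_timeout)

# Not run here because it downloads a large file:
if (FALSE) {
  BiocManager::install("BSgenome.Hsapiens.UCSC.hg38")
}

# Restore the previous setting afterwards.
options(old)
```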


On Thu, Apr 28, 2022 at 4:01 PM Uwe Ligges 
wrote:

>
>
> On 28.04.2022 23:55, David Winsemius wrote:
> > Pretty sure the right way to install that package is with the Bioc installer.
>
> or simply install.packages() after setting the repository.
>
>
> Best,
> Uwe Ligges
>
> >
> > Sent from my iPhone
> >
> >> On Apr 28, 2022, at 3:35 PM, Bogdan Tanasa  wrote:
> >>
> >> Hi everyone,
> >>
> >> I must transfer a package from one platform (AWS), where I was able to
> >> install the package,
> >>
> >> to another platform (local PC), where I am not able to install the package.
> >>
> >> The package is called: BSgenome.Hsapiens.UCSC.hg38
> >>
> >> Is there a way to transfer the files from the BSgenome.Hsapiens.UCSC.hg38
> >> folder (below) from AWS to the local PC and get it to run? Thanks!
> >>
> >> 4.0K    DESCRIPTION
> >> 4.0K    INDEX
> >> 28K     Meta
> >> 4.0K    NAMESPACE
> >> 20K     R
> >> 784M    extdata
> >> 24K     help
> >> 12K     html
> >>
>


[R] installing an R package

2022-04-28 Thread Bogdan Tanasa

Hi everyone,

I must transfer a package from one platform (AWS), where I was able to
install the package,

to another platform (local PC), where I am not able to install the package.

The package is called: BSgenome.Hsapiens.UCSC.hg38

Is there a way to transfer the files from the BSgenome.Hsapiens.UCSC.hg38
folder (below) from AWS to the local PC and get it to run? Thanks!

4.0K    DESCRIPTION
4.0K    INDEX
28K     Meta
4.0K    NAMESPACE
20K     R
784M    extdata
24K     help
12K     html


Re: [R] help with installing R packages on Mac : these packages are downloaded but not compiled

2022-03-23 Thread Bogdan Tanasa

Thank you Jeff.

Well, I receive the same messages not only when I install "tidyverse" but
also with any other packages from Bioconductor;

specifically, the packages are downloaded but not compiled and not
installed.

I believe that it is a more global R issue with macOS Monterey, although I do
not know how to solve it. Thanks,

Bogdan
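
Jeff's suggestion below (install the failing dependency explicitly) can be sketched as follows. The helper name missing_packages is hypothetical, and the package list is taken from the error message quoted below; the install step is commented out because it needs network access.

```r
# Return the subset of `pkgs` that cannot be loaded from any library
# on .libPaths(), for example because they or their dependencies are absent.
missing_packages <- function(pkgs) {
  found <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!found]
}

wanted <- c("backports", "broom", "tidyverse")
to_install <- missing_packages(wanted)
print(to_install)

# Not run here because it needs network access:
# if (length(to_install) > 0) install.packages(to_install)
```

Installing the reported packages one at a time and reading each error message usually shows which dependency is the real obstacle.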



On Wed, Mar 23, 2022 at 6:42 PM Jeff Newmiller 
wrote:

> Tidyverse has dozens of dependencies... and when a dependency fails to
> install then you often need to install it explicitly... the automatic
> dependency algorithm doesn't seem to work robustly.
>
> Carefully read your error messages... it looks like you should start by
> installing backports.
>
> On March 23, 2022 5:52:29 PM PDT, Bert Gunter 
> wrote:
> >Mac specific issues generally belong on the R-sig-mac list, not here (I
> >of course don't know whether this is Mac specific or not. Folks on the
> >Mac list presumably would).
> >
> >
> >Bert Gunter
> >
> >"The trouble with having an open mind is that people keep coming along
> >and sticking things into it."
> >-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >On Wed, Mar 23, 2022 at 5:41 PM Bogdan Tanasa  wrote:
> >>
> >> Dear all,
> >>
> >> I would appreciate your prompt help with the following issue:
> >>
> >> I am in the process of installing R and R packages on macOS Monterey.
> >>
> >> The packages are downloaded but not compiled and are not installed, as
> >> shown below.
> >>
> >> I would appreciate any help that you can offer. Thank you.
> >>
> >> > install.packages("tidyverse", dependencies=T)
> >> Installing package into ‘/Users/btanasa/Library/R/x86_64/4.1/library’
> >> (as ‘lib’ is unspecified)
> >> trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.1/tidyverse_1.3.1.tgz'
> >> Content type 'application/x-gzip' length 421072 bytes (411 KB)
> >> ==
> >> downloaded 411 KB
> >>
> >> The downloaded binary packages are in
> >>  /var/folders/j1/vfxkcdz51l945jpfb2vplcsw47dvp9/T//RtmpYudPdW/downloaded_packages
> >> > library("tidyverse")
> >> Error: package or namespace load failed for ‘tidyverse’:
> >>  .onLoad failed in loadNamespace() for 'broom', details:
> >>   call: loadNamespace(x)
> >>   error: there is no package called ‘backports’
> >>
>
> --
> Sent from my phone. Please excuse my brevity.
>


[R] arrow keys in R

2022-02-26 Thread Bogdan Tanasa

In R, when I press the arrow keys, two things happen:

On one hand, the symbols ^[[A^[[A^[[A appear;

On the other hand, if I start typing a command such as "library", I
begin by typing the first 2 letters "li", press the left arrow, and the
result is "li " (i.e. lots of spaces) instead of having the command
"library" written on the screen.

Is there any way to fix it, please? Thanks,

Bogdan


[R] Error in loadNamespace(i, c(lib.loc, .libPaths())

2021-11-05 Thread Bogdan Tanasa

Dear all,

I am using Monocle3 to study disease development with single-cell
technologies:

https://cole-trapnell-lab.github.io/monocle3/

When I install additional packages like "spData" from

https://nowosad.github.io/spData/, I get the message:

Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) :
  there is no package called ‘spData’

How should I correct it?

The installation is R/4.0.3, which runs on a SLURM cluster managed by the system
administrator.

Occasionally, additional packages in R are installed in my home folder:
/home/tanasa/

Thank you !
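
This error generally means that the package is not present in any of the libraries R searches. A rough sketch of checking the search path and adding a personal library is shown below; the temporary directory is a hypothetical stand-in for a folder under /home/tanasa/, and the install step is commented out because it needs network access.

```r
# Hypothetical stand-in for a personal library in the user's home folder.
user_lib <- file.path(tempdir(), "R-user-library")
dir.create(user_lib, recursive = TRUE, showWarnings = FALSE)

# Put the personal library first on the search path for this session.
.libPaths(c(user_lib, .libPaths()))
print(.libPaths())

# Check whether the package can be loaded from any of those libraries.
has_spdata <- requireNamespace("spData", quietly = TRUE)
print(has_spdata)

# Not run here because it needs network access:
# if (!has_spdata) install.packages("spData", lib = user_lib)
```

On a cluster, the same .libPaths() call has to be made in the batch job as well, or the personal library will not be searched there.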


[R] extracting a R object from an R image

2021-11-05 Thread Bogdan Tanasa

Dear all,

I saved my work in an R image that contains multiple objects;

the objects were generated with Monocle3:

https://cole-trapnell-lab.github.io/monocle3/docs/starting/

One object is called CDS.

How should I extract this object CDS (which has a complex structure) from the
R image?

thank you,

Bogdan
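
One common base R approach, sketched here with a toy image, is to load the image into a separate environment and pull out the single object by name. The file and the toy CDS list are hypothetical; for a real Monocle3 object, the monocle3 package would presumably need to be attached before working with the extracted object.

```r
# Build a small stand-in image containing two objects.
img <- tempfile(fileext = ".RData")
local({
  CDS   <- list(cells = 1:5, label = "toy")
  other <- letters
  save(CDS, other, file = img)
})

# Load into a fresh environment so the global workspace is not overwritten.
img_env <- new.env()
loaded_names <- load(img, envir = img_env)
print(loaded_names)

# Extract just the object of interest and, optionally, save it on its own.
cds <- img_env[["CDS"]]
rds_path <- tempfile(fileext = ".rds")
saveRDS(cds, rds_path)
```

readRDS(rds_path) then restores that one object under any name you choose.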


[R] docker containers and R

2021-10-27 Thread Bogdan Tanasa

Dear all, would you please advise:

if I have a container that runs R (below) and install a specific
package called UMI4Cats, a lot of other libraries are obviously
installed as well. How can I save the docker container that contains the additional
libraries that I have installed and that are required by UMI4Cats?

https://www.bioconductor.org/help/docker/#running


-> % docker run -it --entrypoint=Rscript bioconductor/bioconductor_docker:RELEASE_3_13 -e 'capabilities()'
       jpeg         png        tiff       tcltk         X11        aqua
       TRUE        TRUE        TRUE        TRUE       FALSE       FALSE
   http/ftp     sockets      libxml        fifo      cledit       iconv
       TRUE        TRUE        TRUE        TRUE       FALSE        TRUE
        NLS       Rprof     profmem       cairo         ICU long.double
      FALSE        TRUE        TRUE        TRUE        TRUE        TRUE
    libcurl
       TRUE


[R] R code in RData

2021-10-27 Thread Bogdan Tanasa

Dear all, would you please advise:

I have an RData file; is there a way to print the R code that was
used to create it?

thank you,

Bogdan


Re: [R] about a p-value < 2.2e-16

2021-03-19 Thread Bogdan Tanasa

thanks a lot, Jiefei ! and thanks to all for your time and comments !

have a good weekend !
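
For reference, the "< 2.2e-16" in printed test output is a display convention: p-values below machine epsilon are shown that way, while the stored number can still be retrieved. A small sketch with simulated data (the seed and sample sizes are assumptions that mirror the example quoted below):

```r
set.seed(42)
x <- rnorm(1000)
y <- rnorm(1000, mean = 2)

# exact = FALSE uses the normal approximation, as in the thread.
res <- wilcox.test(x, y, exact = FALSE)
p <- res$p.value

# How the value is displayed by default versus the number actually stored.
shown  <- format.pval(p)
actual <- format(p, digits = 3, scientific = TRUE)
cat("displayed:", shown, "\n")
cat("stored:   ", actual, "\n")
```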




On Fri, Mar 19, 2021 at 10:01 PM Jiefei Wang  wrote:

> Hi Bogdan,
>
> I think the journal is asking about the exact value of the p-value; it
> doesn't matter if it is from the exact distribution or the normal
> approximation. However, it does not make any sense to report such a small
> p-value. If I were you, I would show the reviewers the exact p-value they want
> and gently explain why you did not put it into your paper. If they insist
> that the number must be in the paper, then go ahead and do it.
>
> Best,
> Jiefei
>
>
>
> On Sat, Mar 20, 2021 at 2:39 AM, Bogdan Tanasa wrote:
>
>> Thank you Kevin, their wording is "Please note that the exact p value
>> should be provided, when possible, etc"
>>
>> By "exact p-value" I believe that they do indeed mean the actual number,
>> and not specifying "exact=TRUE";
>>
>> as we are working with 1000 genes, if I specify "exact=TRUE" on my PC,
>> it runs out of memory ...
>>
>> wilcox.test(rnorm(1000), rnorm(1000, 2), exact=TRUE)$p.value
>>
>> On Fri, Mar 19, 2021 at 11:10 AM Kevin Thorpe 
>> wrote:
>>
>> > I have to ask: are you sure the journal simply means by exact
>> > p-value that they don’t want to see a p-value given as < 0.0001, for
>> > example, and simply want the actual number?
>> >
>> > I cannot imagine they really meant exact as in the p-value from some exact
>> > distribution.
>> >
>> > --
>> > Kevin E. Thorpe
>> > Head of Biostatistics,  Applied Health Research Centre (AHRC)
>> > Li Ka Shing Knowledge Institute of St. Michael's
>> > Assistant Professor, Dalla Lana School of Public Health
>> > University of Toronto
>> > email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016
>> >
>> > > On Mar 19, 2021, at 1:22 PM, Bogdan Tanasa  wrote:
>> > >
>> > > Dear all, thank you all for comments and help.
>> > >
>> > > as far as i can see, shall we have samples of 1000 records, only
>> > > "exact=FALSE" allows the code to run:
>> > >
>> > > wilcox.test(rnorm(1000), rnorm(1000, 2), exact=FALSE)$p.value
>> > > [1] 7.304863e-231
>> > >
>> > > shall i use "exact=TRUE", it runs out of memory on my 64GB RAM PC :
>> > >
>> > > wilcox.test(rnorm(1000), rnorm(1000, 2), exact=TRUE)$p.value
>> > > (the job is terminated by OS)
>> > >
>> > > shall you have any other suggestions, please let me know. thanks a
>> lot !
>> > >
>> > > On Fri, Mar 19, 2021 at 9:05 AM Bert Gunter 
>> > wrote:
>> > >
>> > >> I **believe** -- if my old memory still serves-- that the "exact"
>> > >> specification uses a home grown version of the algorithm to calculate
>> > >> exact,  or close approximations to the exact, permutation
>> distribution
>> > >> originally developed by Cyrus Mehta, founder of StatXact software.
>> Of
>> > >> course, examining the C code source would determine this, but I don't
>> > care
>> > >> to attempt this.
>> > >>
>> > >> If this is (no longer?) correct, please point this out.
>> > >>
>> > >> Best,
>> > >>
>> > >> Bert Gunter
>> > >>
>> > >> "The trouble with having an open mind is that people keep coming
>> along
>> > and
>> > >> sticking things into it."
>> > >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>> > >>
>> > >>
>> > >> On Fri, Mar 19, 2021 at 8:42 AM Jiefei Wang 
>> wrote:
>> > >>
>> > >>> Hi Spencer,
>> > >>>
>> > >>> Thanks for your test results, I do not know the answer as I haven't
>> > >>> used wilcox.test for many years. I do not know if it is possible to
>> > >>> compute
>> > >>> the exact distribution of the Wilcoxon rank sum statistic, but I
>> think
>> > it
>> > >>> is very likely, as the document of `Wilcoxon` says:
>> > >>>
>> > >>> This distribution is obtained as follows. Let x and y be two random,
>> > >>> independent samples

Re: [R] about a p-value < 2.2e-16

2021-03-19 Thread Bogdan Tanasa

Thank you Kevin, their wording is "Please note that the exact p value
should be provided, when possible, etc"

by "exact p-value" i believe that they do mean indeed the actual number,
and not to specify "exact=TRUE" ;

as we are working with 1000 genes, shall i specify "exact=TRUE" on my PC,
it runs out of memory ...

wilcox.test(rnorm(1000), rnorm(1000, 2), exact=TRUE)$p.value

On Fri, Mar 19, 2021 at 11:10 AM Kevin Thorpe 
wrote:

> I have to ask since. Are you sure the journal simply means by exact
> p-value that they don’t want to see a p-value given as < 0.0001, for
> example, and simply want the actual number?
>
> I cannot imagine they really meant exact as in the p-value from some exact
> distribution.
>
> --
> Kevin E. Thorpe
> Head of Biostatistics,  Applied Health Research Centre (AHRC)
> Li Ka Shing Knowledge Institute of St. Michael's
> Assistant Professor, Dalla Lana School of Public Health
> University of Toronto
> email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016
>
> > On Mar 19, 2021, at 1:22 PM, Bogdan Tanasa  wrote:
> >
> > Dear all, thank you all for comments and help.
> >
> > as far as i can see, shall we have samples of 1000 records, only
> > "exact=FALSE" allows the code to run:
> >
> > wilcox.test(rnorm(1000), rnorm(1000, 2), exact=FALSE)$p.value
> > [1] 7.304863e-231
> >
> > shall i use "exact=TRUE", it runs out of memory on my 64GB RAM PC :
> >
> > wilcox.test(rnorm(1000), rnorm(1000, 2), exact=TRUE)$p.value
> > (the job is terminated by OS)
> >
> > shall you have any other suggestions, please let me know. thanks a lot !
> >
> > On Fri, Mar 19, 2021 at 9:05 AM Bert Gunter 
> wrote:
> >
> >> I **believe** -- if my old memory still serves-- that the "exact"
> >> specification uses a home grown version of the algorithm to calculate
> >> exact,  or close approximations to the exact, permutation distribution
> >> originally developed by Cyrus Mehta, founder of StatXact software.  Of
> >> course, examining the C code source would determine this, but I don't
> care
> >> to attempt this.
> >>
> >> If this is (no longer?) correct, please point this out.
> >>
> >> Best,
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> and
> >> sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >>
> >> On Fri, Mar 19, 2021 at 8:42 AM Jiefei Wang  wrote:
> >>
> >>> Hi Spencer,
> >>>
> >>> Thanks for your test results, I do not know the answer as I haven't
> >>> used wilcox.test for many years. I do not know if it is possible to
> >>> compute
> >>> the exact distribution of the Wilcoxon rank sum statistic, but I think
> it
> >>> is very likely, as the document of `Wilcoxon` says:
> >>>
> >>> This distribution is obtained as follows. Let x and y be two random,
> >>> independent samples of size m and n. Then the Wilcoxon rank sum
> statistic
> >>> is the number of all pairs (x[i], y[j]) for which y[j] is not greater
> than
> >>> x[i]. This statistic takes values between 0 and m * n, and its mean and
> >>> variance are m * n / 2 and m * n * (m + n + 1) / 12, respectively.
> >>>
> >>> As a nice feature of the non-parametric statistic, it is usually
> >>> distribution-free so you can pick any distribution you like to compute
> the
> >>> same statistic. I wonder if this is the case, but I might be wrong.
> >>>
> >>> Cheers,
> >>> Jiefei
> >>>
> >>>
> >>> On Fri, Mar 19, 2021 at 10:57 PM Spencer Graves <
> >>> spencer.gra...@effectivedefense.org> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 2021-3-19 9:52 AM, Jiefei Wang wrote:
> >>>>> After digging into the R source, it turns out that the argument
> >>> `exact`
> >>>> has
> >>>>> nothing to do with the numeric precision. It only affects the
> >>> statistic
> >>>>> model used to compute the p-value. When `exact=TRUE` the true
> >>>> distribution
> >>>>> of the statistic will be used. Otherwise, a normal approximation will
> >>> be
> >>>>> used.
> >>>>>

Re: [R] about a p-value < 2.2e-16

2021-03-19 Thread Bogdan Tanasa

Dear all, thank you all for comments and help.

as far as i can see, shall we have samples of 1000 records, only
"exact=FALSE" allows the code to run:

wilcox.test(rnorm(1000), rnorm(1000, 2), exact=FALSE)$p.value
[1] 7.304863e-231

shall i use "exact=TRUE", it runs out of memory on my 64GB RAM PC :

wilcox.test(rnorm(1000), rnorm(1000, 2), exact=TRUE)$p.value
(the job is terminated by OS)

shall you have any other suggestions, please let me know. thanks a lot !
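
On the memory problem above: a minimal sketch, assuming tie-free data and the default continuity correction, of the normal-approximation p-value computed by hand and kept on the log scale. The variable names (W, z, log_p) are illustrative only. Using log.p = TRUE keeps the result finite even when the p-value itself would underflow to 0 in double precision, so a value can still be reported as a power of ten.

```r
set.seed(1)                      # arbitrary seed, only for repeatability
x <- rnorm(1000)
y <- rnorm(1000, 2)
m <- length(x)
n <- length(y)

# Rank-sum statistic in the form wilcox.test() reports as W
W <- sum(rank(c(x, y))[seq_len(m)]) - m * (m + 1) / 2

# Normal approximation with continuity correction (no ties assumed)
z <- W - m * n / 2
z <- (z - sign(z) * 0.5) / sqrt(m * n * (m + n + 1) / 12)

# Two-sided p-value on the natural-log scale, then as a power of ten
log_p   <- log(2) + pnorm(-abs(z), log.p = TRUE)
log10_p <- log_p / log(10)
log10_p
```

For samples of this size exp(log_p) should agree with wilcox.test(x, y, exact=FALSE)$p.value; the log form only matters once the p-value drops below roughly 1e-308.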

On Fri, Mar 19, 2021 at 9:05 AM Bert Gunter  wrote:

> I **believe** -- if my old memory still serves-- that the "exact"
> specification uses a home grown version of the algorithm to calculate
> exact,  or close approximations to the exact, permutation distribution
> originally developed by Cyrus Mehta, founder of StatXact software.  Of
> course, examining the C code source would determine this, but I don't care
> to attempt this.
>
> If this is (no longer?) correct, please point this out.
>
> Best,
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Fri, Mar 19, 2021 at 8:42 AM Jiefei Wang  wrote:
>
>> Hi Spencer,
>>
>> Thanks for your test results, I do not know the answer as I haven't
>> used wilcox.test for many years. I do not know if it is possible to
>> compute
>> the exact distribution of the Wilcoxon rank sum statistic, but I think it
>> is very likely, as the document of `Wilcoxon` says:
>>
>> This distribution is obtained as follows. Let x and y be two random,
>> independent samples of size m and n. Then the Wilcoxon rank sum statistic
>> is the number of all pairs (x[i], y[j]) for which y[j] is not greater than
>> x[i]. This statistic takes values between 0 and m * n, and its mean and
>> variance are m * n / 2 and m * n * (m + n + 1) / 12, respectively.
>>
>> As a nice feature of the non-parametric statistic, it is usually
>> distribution-free so you can pick any distribution you like to compute the
>> same statistic. I wonder if this is the case, but I might be wrong.
>>
>> Cheers,
>> Jiefei
>>
>>
>> On Fri, Mar 19, 2021 at 10:57 PM Spencer Graves <
>> spencer.gra...@effectivedefense.org> wrote:
>>
>> >
>> >
>> > On 2021-3-19 9:52 AM, Jiefei Wang wrote:
>> > > After digging into the R source, it turns out that the argument
>> `exact`
>> > has
>> > > nothing to do with the numeric precision. It only affects the
>> statistic
>> > > model used to compute the p-value. When `exact=TRUE` the true
>> > distribution
>> > > of the statistic will be used. Otherwise, a normal approximation will
>> be
>> > > used.
>> > >
>> > > I think the documentation needs to be improved here, you can compute
>> the
>> > > exact p-value *only* when you do not have any ties in your data. If
>> you
>> > > have ties in your data you will get the p-value from the normal
>> > > approximation no matter what value you put in `exact`. This behavior
>> > should
>> > > be documented or a warning should be given when `exact=TRUE` and ties
>> > > present.
>> > >
>> > > FYI, if the exact p-value is required, `pwilcox` function will be
>> used to
>> > > compute the p-value. There are no details on how it computes the
>> pvalue
>> > but
>> > > its C code seems to compute the probability table, so I assume it
>> > computes
>> > > the exact p-value from the true distribution of the statistic, not a
>> > > permutation or MC p-value.
>> >
>> >
>> >My example shows that it does NOT use Monte Carlo, because
>> > otherwise it uses some distribution.  I believe the term "exact" means
>> > that it uses the permutation distribution, though I could be mistaken.
>> > If it's NOT a permutation distribution, I don't know what it is.
>> >
>> >
>> >Spencer
>> > >
>> > > Best,
>> > > Jiefei
>> > >
>> > >
>> > >
>> > > On Fri, Mar 19, 2021 at 10:01 PM Jiefei Wang 
>> wrote:
>> > >
>> > >> Hey,
>> > >>
>> > >> I just want to point out that the word "exact" has two meanings. It
>> can
>> > >> mean the numerically accurate p-value as Bogdan asked in his first
>> > email

Re: [R] about a p-value < 2.2e-16

2021-03-19 Thread Bogdan Tanasa

Dear Jiefei, and all,

many thanks for your time and comments, suggestions, insights.

-- bogdan

On Fri, Mar 19, 2021 at 7:52 AM Jiefei Wang  wrote:

> After digging into the R source, it turns out that the argument `exact`
> has nothing to do with the numeric precision. It only affects the statistic
> model used to compute the p-value. When `exact=TRUE` the true distribution
> of the statistic will be used. Otherwise, a normal approximation will be
> used.
>
> I think the documentation needs to be improved here, you can compute the
> exact p-value *only* when you do not have any ties in your data. If you
> have ties in your data you will get the p-value from the normal
> approximation no matter what value you put in `exact`. This behavior should
> be documented or a warning should be given when `exact=TRUE` and ties
> present.
>
> FYI, if the exact p-value is required, `pwilcox` function will be used to
> compute the p-value. There are no details on how it computes the pvalue but
> its C code seems to compute the probability table, so I assume it computes
> the exact p-value from the true distribution of the statistic, not a
> permutation or MC p-value.
>
> Best,
> Jiefei
>
>
>
> On Fri, Mar 19, 2021 at 10:01 PM Jiefei Wang  wrote:
>
>> Hey,
>>
>> I just want to point out that the word "exact" has two meanings. It can
>> mean the numerically accurate p-value as Bogdan asked in his first email,
>> or it could mean the p-value calculated from the exact distribution of the
>> statistic(In this case, U stat). These two are actually not related, even
>> though they all called "exact".
>>
>> Best,
>> Jiefei
>>
>> On Fri, Mar 19, 2021 at 9:31 PM Spencer Graves <
>> spencer.gra...@effectivedefense.org> wrote:
>>
>>>
>>>
>>> On 2021-3-19 12:54 AM, Bogdan Tanasa wrote:
>>> > thanks a lot, Vivek ! in other words, assuming that we work with 1000
>>> data
>>> > points,
>>> >
>>> > shall we use EXACT = TRUE, it uses the normal approximation,
>>> >
>>> > while if EXACT=FALSE (for these large samples), it does not ?
>>>
>>>
>>>As David Winsemius noted, the documentation is not clear.
>>> Consider the following:
>>>
>>>   > set.seed(1)
>>>   > x <- rnorm(100)
>>>   > y <- rnorm(100, 2)
>>>   >
>>>   > wilcox.test(x, y)$p.value
>>>   [1] 1.172189e-25
>>>   > wilcox.test(x, y)$p.value
>>>   [1] 1.172189e-25
>>>   >
>>>   > wilcox.test(x, y, EXACT=TRUE)$p.value
>>>   [1] 1.172189e-25
>>>   > wilcox.test(x, y, EXACT=TRUE)$p.value
>>>   [1] 1.172189e-25
>>>   > wilcox.test(x, y, exact=TRUE)$p.value
>>>   [1] 4.123875e-32
>>>   > wilcox.test(x, y, exact=TRUE)$p.value
>>>   [1] 4.123875e-32
>>>   >
>>>   > wilcox.test(x, y, EXACT=FALSE)$p.value
>>>   [1] 1.172189e-25
>>>   > wilcox.test(x, y, EXACT=FALSE)$p.value
>>>   [1] 1.172189e-25
>>>   > wilcox.test(x, y, exact=FALSE)$p.value
>>>   [1] 1.172189e-25
>>>   > wilcox.test(x, y, exact=FALSE)$p.value
>>>   [1] 1.172189e-25
>>>
>>> We get two values here: 1.172189e-25 and 4.123875e-32. The first one, I
>>> think, is the normal approximation, which is the same as exact=FALSE. I
>>> think that with exact=TRUE, you get a permutation distribution, though
>>> I'm not sure.
>>>
>>> You might try looking at "wilcox_test in package coin for exact,
>>> asymptotic and Monte Carlo conditional p-values, including in the
>>> presence of ties" to see if it is clearer.
>>>
>>> NOTE: R is case sensitive, so "EXACT" is a different variable from
>>> "exact". It is interpreted as an optional argument, which is not
>>> recognized and therefore ignored in this context.
>>>
>>>   Hope this helps.
>>>   Spencer
>>>
>>>
>>> > On Thu, Mar 18, 2021 at 10:47 PM Vivek Das  wrote:
>>> >
>>> >> Hi Bogdan,
>>> >>
>>> >> You can also get the information from the link of the Wilcox.test
>>> function
>>> >> page.
>>> >>
>>> >> “By default (if exact is not specified), an exact p-value is computed
>>> if
>>> >> the samples contain less than 50 finite values and there are no ties.
>>> >> Otherwise, a normal approximation is used.”
>>> >>
>>> >> For more:
>>> >>
>>> >>
>>> https://stat.ethz.ch/R-manual/R-devel/library/stats/html/wilcox.test.html
>>> >>
>>> >> Hope this helps!
>>> >>
>>> >> Best,
>
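
The distinction Jiefei draws above (exact null distribution via pwilcox versus normal approximation) can be checked by hand on a small sample. This is a sketch under stated assumptions: no ties, a two-sided test, and the default continuity correction; the sample size of 10 and the names p_exact and p_normal are chosen only for illustration.

```r
set.seed(42)                     # arbitrary seed
x <- rnorm(10)
y <- rnorm(10, 1)
m <- length(x)
n <- length(y)

# W as reported by wilcox.test(): number of pairs with x[i] > y[j]
W <- sum(outer(x, y, ">"))

# exact = TRUE: tail probability from the exact distribution of W, doubled
p_tail <- if (W > m * n / 2) {
  pwilcox(W - 1, m, n, lower.tail = FALSE)
} else {
  pwilcox(W, m, n)
}
p_exact <- min(2 * p_tail, 1)

# exact = FALSE: normal approximation with continuity correction
z <- W - m * n / 2
z <- (z - sign(z) * 0.5) / sqrt(m * n * (m + n + 1) / 12)
p_normal <- 2 * min(pnorm(z), pnorm(z, lower.tail = FALSE))

c(exact = p_exact, normal = p_normal)
```

Both numbers are computed to full double precision; they differ because they come from different reference distributions, not because one is more "numerically exact" than the other.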

Re: [R] about a p-value < 2.2e-16

2021-03-18 Thread Bogdan Tanasa

thanks a lot, Vivek ! in other words, assuming that we work with 1000 data
points,

shall we use EXACT = TRUE, it uses the normal approximation,

while if EXACT=FALSE (for these large samples), it does not ?
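
On the question above, a small sketch of what the quoted help text implies, assuming tie-free simulated data. It checks three things: with fewer than 50 values per sample, leaving exact unspecified gives the exact p-value; with 50 or more it gives the normal approximation; and an upper-case EXACT is not the same argument as exact and appears to be silently ignored. The object names are illustrative.

```r
set.seed(1)                      # arbitrary seed
small_x <- rnorm(10)
small_y <- rnorm(10, 2)
large_x <- rnorm(100)
large_y <- rnorm(100, 2)

# Fewer than 50 values, no ties: default behaves like exact = TRUE
small_default_is_exact <- identical(
  wilcox.test(small_x, small_y)$p.value,
  wilcox.test(small_x, small_y, exact = TRUE)$p.value
)

# 50 or more values: default behaves like exact = FALSE
large_default_is_approx <- identical(
  wilcox.test(large_x, large_y)$p.value,
  wilcox.test(large_x, large_y, exact = FALSE)$p.value
)

# R is case sensitive: EXACT is swallowed by ... and changes nothing
upper_case_ignored <- identical(
  wilcox.test(large_x, large_y, EXACT = TRUE)$p.value,
  wilcox.test(large_x, large_y)$p.value
)

c(small_default_is_exact, large_default_is_approx, upper_case_ignored)
```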

On Thu, Mar 18, 2021 at 10:47 PM Vivek Das  wrote:

> Hi Bogdan,
>
> You can also get the information from the link of the Wilcox.test function
> page.
>
> “By default (if exact is not specified), an exact p-value is computed if
> the samples contain less than 50 finite values and there are no ties.
> Otherwise, a normal approximation is used.”
>
> For more:
>
> https://stat.ethz.ch/R-manual/R-devel/library/stats/html/wilcox.test.html
>
> Hope this helps!
>
> Best,
>
> VD
>
>
> On Thu, Mar 18, 2021 at 10:36 PM Bogdan Tanasa  wrote:
>
>> Dear Peter, thanks a lot. yes, we can see a very precise p-value, and that
>> was the request from the journal.
>>
>> if I may ask another question please : what is the meaning of "exact=TRUE"
>> or "exact=FALSE" in wilcox.test ?
>>
>> i can see that the "numerically precise" p-values are different. thanks a
>> lot !
>>
>> tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>> tst$p.value
>> [1] 8.535524e-25
>>
>> tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=FALSE)
>> tst$p.value
>> [1] 3.448211e-25
>>
>> On Thu, Mar 18, 2021 at 10:15 PM Peter Langfelder <
>> peter.langfel...@gmail.com> wrote:
>>
>> > I think the answer is much simpler. The print method for hypothesis
>> > tests (class htest) truncates the p-values. In the above example,
>> > instead of using
>> >
>> > wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>> >
>> > and copying the output, just print the p-value:
>> >
>> > tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>> > tst$p.value
>> >
>> > [1] 2.988368e-32
>> >
>> >
>> > I think this value is what the journal asks for.
>> >
>> > HTH,
>> >
>> > Peter
>> >
>> > On Thu, Mar 18, 2021 at 10:05 PM Spencer Graves
>> >  wrote:
>> > >
>> > >I would push back on that from two perspectives:
>> > >
>> > >
>> > >  1.  I would study exactly what the journal said very
>> > > carefully.  If they mandated "wilcox.test", that function has an
>> > > argument called "exact".  If that's what they are asking, then using
>> > > that argument gives the exact p-value, e.g.:
>> > >
>> > >
>> > >  > wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>> > >
>> > >  Wilcoxon rank sum exact test
>> > >
>> > > data:  rnorm(100) and rnorm(100, 2)
>> > > W = 691, p-value < 2.2e-16
>> > >
>> > >
>> > >  2.  If that's NOT what they are asking, then I'm not
>> > > convinced what they are asking makes sense:  There is no such thing
>> > > as an "exact p value" except to the extent that certain assumptions
>> > > hold, and all models are wrong (but some are useful), as George Box
>> > > famously said years ago.[1]  Truth only exists in mathematics, and
>> > > that's because it's a fiction to start with ;-)
>> > >
>> > >
>> > >Hope this helps.
>> > >Spencer Graves
>> > >
>> > >
>> > > [1]
>> > > https://en.wikipedia.org/wiki/All_models_are_wrong
>> > >
>> > >
>> > > On 2021-3-18 11:12 PM, Bogdan Tanasa wrote:
>> > > >   <
>> > https://meta.stackexchange.com/questions/362285/about-a-p-value-2-2e-16
>> >
>> > > > Dear all,
>> > > >
>> > > > i would appreciate having your advice on the following please :
>> > > >
>> > > > in R, the wilcox.test() provides "a p-value < 2.2e-16", when we
>> compare
>> > > > sets of 1000 genes expression (in the genomics field).
>> > > >
>> > > > however, the journal asks us to provide the exact p value ...
>> > > >
>> > > > would it be legitimate to write : "p-value = 0" ? thanks a lot,
>> > > >
>> > > > -- bogdan
>> > > >

Re: [R] about a p-value < 2.2e-16

2021-03-18 Thread Bogdan Tanasa

Dear Peter, thanks a lot. yes, we can see a very precise p-value, and that
was the request from the journal.

if I may ask another question please : what is the meaning of "exact=TRUE"
or "exact=FALSE" in wilcox.test ?

i can see that the "numerically precise" p-values are different. thanks a
lot !

tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
tst$p.value
[1] 8.535524e-25

tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=FALSE)
tst$p.value
[1] 3.448211e-25
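
One caveat about the comparison above: each call draws fresh random samples, so the two p-values differ partly because the data differ. A minimal sketch that holds the data fixed, so that only the exact setting changes; the seed and the sample size of 30 are arbitrary choices for illustration.

```r
set.seed(123)                    # arbitrary seed
x <- rnorm(30)
y <- rnorm(30, 2)

# Same data, two reference distributions
p_exact  <- wilcox.test(x, y, exact = TRUE)$p.value   # exact distribution of W
p_approx <- wilcox.test(x, y, exact = FALSE)$p.value  # normal approximation

c(exact = p_exact, approx = p_approx)
```

With the data fixed, any remaining gap between the two values reflects how well the normal approximation tracks the exact distribution far out in the tail.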

On Thu, Mar 18, 2021 at 10:15 PM Peter Langfelder <
peter.langfel...@gmail.com> wrote:

> I think the answer is much simpler. The print method for hypothesis
> tests (class htest) truncates the p-values. In the above example,
> instead of using
>
> wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>
> and copying the output, just print the p-value:
>
> tst = wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
> tst$p.value
>
> [1] 2.988368e-32
>
>
> I think this value is what the journal asks for.
>
> HTH,
>
> Peter
>
> On Thu, Mar 18, 2021 at 10:05 PM Spencer Graves
>  wrote:
> >
> >I would push back on that from two perspectives:
> >
> >
> >  1.  I would study exactly what the journal said very
> > carefully.  If they mandated "wilcox.test", that function has an
> > argument called "exact".  If that's what they are asking, then using
> > that argument gives the exact p-value, e.g.:
> >
> >
> >  > wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
> >
> >  Wilcoxon rank sum exact test
> >
> > data:  rnorm(100) and rnorm(100, 2)
> > W = 691, p-value < 2.2e-16
> >
> >
> >  2.  If that's NOT what they are asking, then I'm not
> > convinced what they are asking makes sense:  There is no such thing
> > as an "exact p value" except to the extent that certain assumptions
> > hold, and all models are wrong (but some are useful), as George Box
> > famously said years ago.[1]  Truth only exists in mathematics, and
> > that's because it's a fiction to start with ;-)
> >
> >
> >Hope this helps.
> >Spencer Graves
> >
> >
> > [1]
> > https://en.wikipedia.org/wiki/All_models_are_wrong
> >
> >
> > On 2021-3-18 11:12 PM, Bogdan Tanasa wrote:
> > >   <
> https://meta.stackexchange.com/questions/362285/about-a-p-value-2-2e-16>
> > > Dear all,
> > >
> > > i would appreciate having your advice on the following please :
> > >
> > > in R, the wilcox.test() provides "a p-value < 2.2e-16", when we compare
> > > sets of 1000 genes expression (in the genomics field).
> > >
> > > however, the journal asks us to provide the exact p value ...
> > >
> > > would it be legitimate to write : "p-value = 0" ? thanks a lot,
> > >
> > > -- bogdan
> > >
>


Re: [R] about a p-value < 2.2e-16

2021-03-18 Thread Bogdan Tanasa

Dear Spencer, thank you very much for your prompt email and help. When
using :

> wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
W = 698, p-value < 2.2e-16

> wilcox.test(rnorm(100), rnorm(100, 2), exact=FALSE)
W = 1443, p-value < 2.2e-16

and in both cases p-value < 2.2e-16. By "exact" p-value, i have meant the
"precise" p-value ;

If I may ask please, could we write p-value = 0 ?

i have noted a similar conversation on stackexchange, although the answer
is not very clear (to me).

https://stats.stackexchange.com/questions/78839/how-should-tiny-p-values-be-reported-and-why-does-r-put-a-minimum-on-2-22e-1

thanks again,

bogdan
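
On whether "p-value = 0" is legitimate: a short sketch, assuming simulated data like that in the thread, showing that the "< 2.2e-16" text comes from how the result is printed, while the stored value is an ordinary non-zero double. As far as I can tell the cutoff is the default eps argument of format.pval(), which is .Machine$double.eps; the formatC() call at the end is just one possible way to write the number.

```r
set.seed(1)                      # arbitrary seed
x <- rnorm(100)
y <- rnorm(100, 2)

tst <- wilcox.test(x, y, exact = FALSE)

# The printed summary abbreviates anything below machine epsilon
print(tst)

# The stored value is not abbreviated and is not zero
p <- tst$p.value
p

# The cutoff used when formatting, and what happens when it is lowered
.Machine$double.eps
format.pval(p)
format.pval(p, eps = 1e-300)

# One way to write the number for a manuscript
formatC(p, format = "e", digits = 2)
```

So the number to report is tst$p.value, suitably rounded, rather than 0.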

On Thu, Mar 18, 2021 at 10:05 PM Spencer Graves <
spencer.gra...@effectivedefense.org> wrote:

>I would push back on that from two perspectives:
>
>
>  1.  I would study exactly what the journal said very
> carefully.  If they mandated "wilcox.test", that function has an
> argument called "exact".  If that's what they are asking, then using
> that argument gives the exact p-value, e.g.:
>
>
>  > wilcox.test(rnorm(100), rnorm(100, 2), exact=TRUE)
>
>  Wilcoxon rank sum exact test
>
> data:  rnorm(100) and rnorm(100, 2)
> W = 691, p-value < 2.2e-16
>
>
>  2.  If that's NOT what they are asking, then I'm not
> convinced what they are asking makes sense:  There is no such thing
> as an "exact p value" except to the extent that certain assumptions
> hold, and all models are wrong (but some are useful), as George Box
> famously said years ago.[1]  Truth only exists in mathematics, and
> that's because it's a fiction to start with ;-)
>
>
>Hope this helps.
>Spencer Graves
>
>
> [1]
> https://en.wikipedia.org/wiki/All_models_are_wrong
>
>
> On 2021-3-18 11:12 PM, Bogdan Tanasa wrote:
> >   <
> https://meta.stackexchange.com/questions/362285/about-a-p-value-2-2e-16>
> > Dear all,
> >
> > i would appreciate having your advice on the following please :
> >
> > in R, the wilcox.test() provides "a p-value < 2.2e-16", when we compare
> > sets of 1000 genes expression (in the genomics field).
> >
> > however, the journal asks us to provide the exact p value ...
> >
> > would it be legitimate to write : "p-value = 0" ? thanks a lot,
> >
> > -- bogdan
> >
>
>


[R] about a p-value < 2.2e-16

2021-03-18 Thread Bogdan Tanasa

 
Dear all,

i would appreciate having your advice on the following please :

in R, the wilcox.test() provides "a p-value < 2.2e-16", when we compare
sets of 1000 genes expression (in the genomics field).

however, the journal asks us to provide the exact p value ...

would it be legitimate to write : "p-value = 0" ? thanks a lot,

-- bogdan


Re: [R] overlaying frequency histograms or density plots in R

2021-02-25 Thread Bogdan Tanasa

Dear Rui, and Petr,

many many thanks for your time and advice ! I'm still exploring the R code
that you have suggested !

On Thu, Feb 25, 2021 at 2:59 AM Rui Barradas  wrote:

> Hello,
>
> First of all, I believe you want argument fill, not colour. In ggplot2
> colour is about the border and fill about the interior.
>
> As for the question,
>
> 1. Create a basic plot with the common aesthetics.
>
>
> library(ggplot2)
>
> pp_ALL <- iris[c(1, 5)]
> names(pp_ALL) <- c("VALUE", "EXP")
>
> p <- ggplot(data = pp_ALL, mapping = aes(x = VALUE, fill = EXP))
>
>
>
> 2. geom_density should use alpha transparency, since the densities
> overlap. colour = NA removes the densities black border.
>
>
> p + geom_density(alpha = 0.5, colour = NA)
>
>
> 3. y = ..density.. plots relative frequencies histograms, for the
> default absolute frequencies or counts, comment the mapping out.
>
> position = position_dodge() allows for extra goodies, such as to change
> the space between bars, their width or to keep empty spaces when some
> factor levels are missing (preserve = "single").
>
> For the test data, with 50 elements per factor level, use a much smaller
> number of bins.
>
> Package scales has functions to display labels in percent format, there
> is no need to multiply by 100.
>
>
> p + geom_histogram(
>    mapping = aes(y = ..density..),
>    position = position_dodge(),
>    bins = 10)
>
> p + geom_histogram(
>    mapping = aes(y = ..density..),
>    position = position_dodge(),
>    bins = 10) +
>    scale_y_continuous(labels = scales::label_percent())
>
>
> Hope this helps,
>
> Rui Barradas
>
>
> At 07:43 on 25/02/21, Bogdan Tanasa wrote:
> > Thanks a lot Petr !
> >
> > shall i use "dodge" also for the RELATIVE FREQUENCY HISTOGRAMS :
> >
> > p <- ggplot(iris, aes(x=Sepal.Length, y=..count../sum(..count..)*100,
> > colour=Species))
> > p+geom_histogram(position="dodge")
> >
> > or is there any other way to display the RELATIVE FREQUENCY HISTOGRAMS ?
> >
> > thanks again !
> >
> > On Wed, Feb 24, 2021 at 11:00 PM PIKAL Petr 
> wrote:
> >
> >> Hi
> >>
> >> You should use position dodge.
> >>
> >> p <- ggplot(iris, aes(x=Sepal.Length, colour=Species))
> >> p+geom_density()
> >> p <- ggplot(iris, aes(x=Sepal.Length, y=..density.., colour=Species))
> >> p+geom_histogram(position="dodge")
> >>
> >> Cheers
> >> Petr
> >>> -----Original Message-----
> >>> From: R-help  On Behalf Of Bogdan Tanasa
> >>> Sent: Wednesday, February 24, 2021 11:07 PM
> >>> To: r-help 
> >>> Subject: [R] overlaying frequency histograms or density plots in R
> >>>
> >>> Dear all, we do have a dataframe with a FACTOR called EXP that has 3
> >> LEVELS ;
> >>>
> >>>   head(pp_ALL)
> >>>  VALUE  EXP
> >>> 1 1639742 DMSO
> >>> 2 1636822 DMSO
> >>> 3 1634202 DMSO
> >>>
> >>> shall i aim to overlay the relative frequency histograms, or the
> density
> >>> histograms for the FACTOR LEVELS,
> >>>
> >>> please would you let me know why the following 2 pieces of R code show
> >>> very different results :
> >>>
> >>> ggplot(pp_ALL, aes(x=VALUE, colour=EXP)) + geom_density()
> >>>
> >>> versus
> >>>
> >>> ggplot(data=pp_ALL) +
> >>> geom_histogram(mapping=aes(x=VALUE, y=..density.., colour=EXP),
> >>>   bins=1000)
> >>>
> >>> thanks,
> >>>
> >>> bogdan
> >>>
> >>> ps : perhaps i shall email to the folks on ggplot2 mailing list too ...
> >>>
> >
>


Re: [R] overlaying frequency histograms or density plots in R

2021-02-24 Thread Bogdan Tanasa

Thanks a lot Petr !

shall i use "dodge" also for the RELATIVE FREQUENCY HISTOGRAMS :

p <- ggplot(iris, aes(x=Sepal.Length, y=..count../sum(..count..)*100,
colour=Species))
p+geom_histogram(position="dodge")

or is there any other way to display the RELATIVE FREQUENCY HISTOGRAMS ?

thanks again !
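
On relative frequency versus density: a base-R sketch (deliberately not ggplot2, and using a simulated vector as a stand-in for one level of the factor) of how the two scales relate. Relative frequencies are counts divided by the total, so the bar heights sum to 1; densities divide those again by the bin width, so the bar areas sum to 1.

```r
set.seed(1)                               # arbitrary seed
value <- rnorm(200, mean = 10, sd = 2)    # hypothetical stand-in data

h <- hist(value, breaks = 20, plot = FALSE)
bin_width <- diff(h$breaks)

rel_freq <- h$counts / sum(h$counts)      # heights sum to 1
dens     <- rel_freq / bin_width          # areas sum to 1

sum(rel_freq)
sum(dens * bin_width)
all.equal(dens, h$density)                # matches what hist() calls density
```

Multiplying rel_freq by 100 gives percentages; the shape of the histogram is the same on all three scales when the bins have equal width.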

On Wed, Feb 24, 2021 at 11:00 PM PIKAL Petr  wrote:

> Hi
>
> You should use position dodge.
>
> p <- ggplot(iris, aes(x=Sepal.Length, colour=Species))
> p+geom_density()
> p <- ggplot(iris, aes(x=Sepal.Length, y=..density.., colour=Species))
> p+geom_histogram(position="dodge")
>
> Cheers
> Petr
> > -----Original Message-----
> > From: R-help  On Behalf Of Bogdan Tanasa
> > Sent: Wednesday, February 24, 2021 11:07 PM
> > To: r-help 
> > Subject: [R] overlaying frequency histograms or density plots in R
> >
> > Dear all, we do have a dataframe with a FACTOR called EXP that has 3
> LEVELS ;
> >
> >  head(pp_ALL)
> > VALUE  EXP
> > 1 1639742 DMSO
> > 2 1636822 DMSO
> > 3 1634202 DMSO
> >
> > shall i aim to overlay the relative frequency histograms, or the density
> > histograms for the FACTOR LEVELS,
> >
> > please would you let me know why the following 2 pieces of R code show
> > very different results :
> >
> > ggplot(pp_ALL, aes(x=VALUE, colour=EXP)) + geom_density()
> >
> > versus
> >
> > ggplot(data=pp_ALL) +
> >geom_histogram(mapping=aes(x=VALUE, y=..density.., colour=EXP),
> >  bins=1000)
> >
> > thanks,
> >
> > bogdan
> >
> > ps : perhaps i shall email to the folks on ggplot2 mailing list too ...
> >
>


[R] overlaying frequency histograms or density plots in R

2021-02-24 Thread Bogdan Tanasa

Dear all, we have a dataframe with a FACTOR called EXP that has 3 LEVELS :

 head(pp_ALL)
VALUE  EXP
1 1639742 DMSO
2 1636822 DMSO
3 1634202 DMSO

I aim to overlay the relative frequency histograms, or the density
histograms, for the FACTOR LEVELS.

Please would you let me know why the following 2 pieces of R code show very
different results :

ggplot(pp_ALL, aes(x=VALUE, colour=EXP)) + geom_density()

versus

ggplot(data=pp_ALL) +
   geom_histogram(mapping=aes(x=VALUE, y=..density.., colour=EXP),
 bins=1000)
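
One way to see why the two plots can look so different: `geom_density()` draws a smoothed kernel density estimate, whereas a histogram with very many bins is close to a spike at each observed value. Both are scaled so that the total area is 1, so narrow bins force tall bars. A minimal base R sketch on the built-in iris data (the pp_ALL data is not available here, so this is only an illustration, and the bin edges are an arbitrary choice):

```r
v <- iris$Sepal.Length

# Smoothed kernel density estimate, the kind of curve that geom_density() displays.
d <- density(v)

# Histogram on the density scale with 1000 narrow bins.
h <- hist(v, breaks = seq(4, 8, length.out = 1001), plot = FALSE)

# The histogram bars still enclose a total area of 1 ...
sum(h$density * diff(h$breaks))

# ... but each occupied bin is very narrow, so its bar is far taller than the smooth curve.
max(h$density)
max(d$y)
```

With fewer bins (for example the ggplot2 default of 30) the histogram heights come much closer to the kernel density curve.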

thanks,

bogdan

ps : perhaps i shall email to the folks on ggplot2 mailing list too ...


Re: [R] algorithms that cluster time series data

2020-07-06 Thread Bogdan Tanasa

Dear Sarah,

thank you very much for pointing to the list of available packages and
algorithms.

On Mon, Jul 6, 2020 at 10:20 AM Sarah Goslee  wrote:

> Hi,
>
> Unsupervised classification (clustering) is a huge field. There's an
> entire task view devoted to it, where you can see many of the large
> array of R packages that perform some sort of clustering.
>
> https://cran.r-project.org/web/views/Cluster.html
>
> Since that is an overwhelming list, you may be best served by looking
> at how others in your field have approached similar problems, and then
> look for R packages that perform the relevant analyses.
>
> Sarah
>
> On Mon, Jul 6, 2020 at 1:11 PM Bogdan Tanasa  wrote:
> >
> > Dear all,
> >
> > please may I ask for a suggestion regarding the algorithms to cluster the
> > expression data in single cells (scRNA-seq) at multiple time points :
> >
> > we do have expression data for 30 000 genes  in 10 datasets that have
> been
> > collected at multiple time points,
> >
> > and i was wondering if you could please recommend *any algorithms/R
> > packages that could help with the clustering of the gene expression at
> > different time points.* thanks a lot, and all the best,
> >
> > -- bogdan
> >
>
>
>
> --
> Sarah Goslee (she/her)
> http://www.numberwright.com
>


[R] algorithms that cluster time series data

2020-07-06 Thread Bogdan Tanasa

Dear all,

please may I ask for a suggestion regarding the algorithms to cluster the
expression data in single cells (scRNA-seq) at multiple time points :

we do have expression data for 30 000 genes  in 10 datasets that have been
collected at multiple time points,

and i was wondering if you could please recommend *any algorithms/R
packages that could help with the clustering of the gene expression at
different time points.* thanks a lot, and all the best,

-- bogdan


Re: [R] 2 docker containers with R ?

2020-04-11 Thread Bogdan Tanasa

thanks a lot, Ashim, and Ivan !

On Sat, Apr 11, 2020 at 4:25 AM Ashim Kapoor  wrote:

> Dear Bogdan,
>
> Perhaps
>
> https://rstudio.github.io/packrat/
>
> can be of help?
>
> Best,
> Ashim
>
> On Sat, Apr 11, 2020 at 4:47 PM Ivan Krylov  wrote:
>
>> On Sat, 11 Apr 2020 03:44:51 -0700
>> Bogdan Tanasa  wrote:
>>
>> > how could I have Seurat2 and Seurat3 on the same machine
>>
>> What I would try first is to install Seurat2 and Seurat3 in separate
>> library directories: add the `lib` argument to install.packages when
>> installing a given version and `lib.loc` when loading it using
>> library().
>>
>> While Docker is definitely able to solve this problem, in my opinion,
>> virtualising a whole operating system is overkill in this case.
>>
>> --
>> Best regards,
>> Ivan
>>
>>
>


[R] 2 docker containers with R ?

2020-04-11 Thread Bogdan Tanasa

Dear all,

we wish everyone a safe and healthy time !

I'm looking forward to your suggestions, please : I am using an R package
called Seurat for scRNA-seq analysis (https://satijalab.org/seurat/)
that has two versions (version 2 and version 3, with distinct functions);

I'd appreciate it if you could please let me know how I could have Seurat2 and
Seurat3 on the same machine, and run either Seurat2 or Seurat3 as preferred ?
I believe that I could use 2 docker containers (how could I install R in 2
different containers ?), or is there another solution ? Thanks a lot !
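
As an alternative to containers, the replies in this thread suggest keeping each version in its own library directory. A minimal sketch of that idea follows; the directory names are hypothetical, a temporary directory is used only so that the snippet runs anywhere, and the install and load calls are left as comments because they need network access and a specific package version.

```r
# Hypothetical library root; in practice choose a permanent location instead of tempdir().
lib_root <- file.path(tempdir(), "R-libs")
lib_v2 <- file.path(lib_root, "seurat2")
lib_v3 <- file.path(lib_root, "seurat3")

# Create one library directory per version.
for (d in c(lib_v2, lib_v3)) dir.create(d, recursive = TRUE, showWarnings = FALSE)

# Install each version into its own directory via the `lib` argument (not run here).
# install.packages() only offers the current CRAN release, so an older version
# has to come from an archived source package or a similar route.
# install.packages("Seurat", lib = lib_v3)

# In a fresh R session, load one version by pointing `lib.loc` at its directory.
# library(Seurat, lib.loc = lib_v2)
```

Only one version can be loaded per R session, so switching versions means restarting R and calling `library()` with the other directory.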

-- bogdan


[R] a package with randomization/permutation tests for genomic analyses

2020-03-04 Thread Bogdan Tanasa

Dear all,

please would you let me know, is there a package in R that has implemented
randomization/permutation tests for joint genomic analyses

(an example of joint genomic analyses -- when jointly considering both GENE
EXPRESSION and PROTEIN BINDING along the DNA).

The context of my question is the following :

let's consider 1000 UP_REGULATED genes with increased PROTEIN X and with a
HISTONE MARK Y :

in order to show that PROTEIN X is related to HISTONE MARK Y for 1000
UP-regulated genes, what "controls" would you use for the comparison :

-- 1000 RANDOM GENES (and multiple randomization tests)

-- 1000 UP-REG GENES with NO PROTEIN X, NO HISTONE MARK Y

-- 1000 UP-REG GENES with PROTEIN X, and NO HISTONE MARK Y

-- 1000 UP-REG GENES with NO PROTEIN X, and with HISTONE MARK Y

-- 1000 NOT-UP-REG GENES with NO PROTEIN X, NO HISTONE MARK Y

-- 1000 NOT-UP-REG GENES with PROTEIN X, and NO HISTONE MARK Y

-- 1000 NOT-UP-REG GENES with NO PROTEIN X, and with HISTONE MARK Y

-- anything else ?
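
The first control in the list above (random gene sets, repeated many times) can be sketched in base R without any extra package. Everything here is simulated and every name (`has_x`, `has_y`, `up_genes`, `both_marks`) is hypothetical; with real data the logical vectors would come from the binding and histone mark annotations.

```r
set.seed(1)

# Simulated annotation for a hypothetical genome of 20000 genes.
n_genes  <- 20000
has_x    <- runif(n_genes) < 0.10   # gene bound by PROTEIN X
has_y    <- runif(n_genes) < 0.15   # gene carries HISTONE MARK Y
up_genes <- sample(n_genes, 1000)   # indices of the UP-REGULATED genes

# Test statistic: number of genes in a set that have both PROTEIN X and HISTONE MARK Y.
both_marks <- function(idx) sum(has_x[idx] & has_y[idx])
observed <- both_marks(up_genes)

# Null distribution: the same statistic for many random sets of 1000 genes.
n_perm <- 2000
null_stat <- replicate(n_perm, both_marks(sample(n_genes, 1000)))

# One-sided empirical p-value, with the +1 correction so that it is never exactly 0.
p_value <- (1 + sum(null_stat >= observed)) / (1 + n_perm)
p_value
```

The other controls in the list would change only how the comparison sets are drawn, for example sampling among NOT-UP-REG genes instead of among all genes.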

thanks a lot,

bogdan


Re: [R] about .function

2020-01-30 Thread Bogdan Tanasa

appreciate it ! thank you Duncan !

On Thu, Jan 30, 2020 at 11:18 AM Duncan Murdoch 
wrote:

> On 30/01/2020 1:38 p.m., Bogdan Tanasa wrote:
> > Dear all,
> >
> > if I may ask please a very simple question :
> >
> > what does "." mean in front of  function name : an example below . thank
> > you very much !
> >
> > .set_pbmc_color_11<-function() {
> >myColors <- c( "dodgerblue2",
> >   "green4",
> >   "#6A3D9A", # purple
> >   "grey",
> >   "tan4",
> >   "yellow",
> >   "#FF7F00", # orange
> >   "black",
> >   "#FB9A99", # pink
> >   "orchid",
> >   "red")
> >
>
> It means that the default ls() won't list the function, you'd need
> ls(all.names = TRUE).  By convention such functions are usually meant
> for internal use, but there are lots of exceptions to that convention.
> The same convention is used in Unix-alike file systems.
>
> Duncan Murdoch
>
>
>
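
The effect on ls() described above can be checked in a scratch environment. A minimal sketch; the two function names are made up for the illustration:

```r
# Scratch environment so that the global workspace is not touched.
e <- new.env()
assign(".hidden_helper", function() "internal", envir = e)
assign("visible_helper", function() "public", envir = e)

ls(e)                    # lists only "visible_helper"
ls(e, all.names = TRUE)  # lists ".hidden_helper" as well

# The leading dot only affects listing; the function can be called as usual.
e$.hidden_helper()
```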


[R] about .function

2020-01-30 Thread Bogdan Tanasa

Dear all,

if I may please ask a very simple question :

what does "." mean in front of a function name ? An example is below. Thank
you very much !

.set_pbmc_color_11 <- function() {
  myColors <- c("dodgerblue2",
                "green4",
                "#6A3D9A", # purple
                "grey",
                "tan4",
                "yellow",
                "#FF7F00", # orange
                "black",
                "#FB9A99", # pink
                "orchid",
                "red")


[R] about paired or unpaired t.test or wilcox.test in R

2019-09-11 Thread Bogdan Tanasa

Dear all,

it would be great if you could please advise on the use of PAIRED or
UNPAIRED T.TEST or WILCOX.TEST in R :

let's say we have 2 samples :

-- CONTROL : where we measure the expression of 100 genes G1 ... G100 in
one million cells C1 ...C1mil

-- TREATMENT : where we measure the expression of 100 genes G1 ... G100 in
one million cells D1 ...D1mil

when we compare the expression of these 100 genes G1 ...G100, in CONTROL vs
TREATMENT, we shall use UNPAIRED TESTS, correct ?

as the cells in CONTROL C1..C1mil are different from the cells in TREATMENT
D1..D1mil ? thanks a lot !
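
A paired test needs a one-to-one match between the observations of the two samples (the same unit measured twice). With different cells in each sample there is no such match, which corresponds to `paired = FALSE`, the default of both functions. A minimal sketch for one gene with simulated values; the sample sizes and distributions are arbitrary assumptions:

```r
set.seed(42)

# Simulated expression of one gene in two independent sets of cells.
control   <- rnorm(200, mean = 5.0, sd = 1)
treatment <- rnorm(250, mean = 5.4, sd = 1)   # sizes may differ in an unpaired design

# Unpaired (two-sample) tests.
tt <- t.test(control, treatment, paired = FALSE)        # Welch t-test by default
wt <- wilcox.test(control, treatment, paired = FALSE)   # Wilcoxon rank-sum test

tt$p.value
wt$p.value
```

With 100 genes this would be repeated per gene, followed by a multiple-testing adjustment such as `p.adjust(p, method = "BH")`.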

-- bogdan


Re: [R] computing standard deviation in R and in Python

2019-05-24 Thread Bogdan Tanasa

Dear Rui, thank you very much !

On Fri, May 24, 2019 at 4:35 AM Rui Barradas  wrote:

> Hello,
>
> This has to do with what kind of variance estimator is being used.
> R uses the unbiased estimator and Python the MLE one.
>
>
>
> var1 <- function(x){
>n <- length(x)
>(sum(x^2) - sum(x)^2/n)/(n - 1)
> }
> var2 <- function(x){
>n <- length(x)
>(sum(x^2) - sum(x)^2/n)/n
> }
>
> sd1 <- function(x) sqrt(var1(x))
> sd2 <- function(x) sqrt(var2(x))
>
> z <- matrix(c(1,2,3,4,5,6,7,8,9), nrow=3, ncol=3, byrow=T)
>
> apply(z, 1, sd1)  # R
> apply(z, 1, sd2)  # Python
>
> apply(z, 2, sd1)  # R
> apply(z, 2, sd2)  # Python
>
>
> Hope this helps,
>
> Rui Barradas
>
> Às 11:27 de 24/05/19, Bogdan Tanasa escreveu:
> > Dear all, please would you advise :
> >
> > do python and R have different ways to compute the standard deviation
> (sd) ?
> >
> > for example, in python, starting with :
> >
> > a = np.array([[1,2,3],  [4,5,6], [7,8,9]])
> > print(a.std(axis=1)) ### per row : [0.81649658 0.81649658 0.81649658]
> > print(a.std(axis=0)) ### per column : [2.44948974 2.44948974 2.44948974]
> >
> > # and in R :
> >
> >
> >
> > z <- matrix(c(1,2,3,4,5,6,7,8,9), nrow=3, ncol=3, byrow=T)
> > # z# [,1] [,2] [,3]#[1,] 1 2 3#[2,] 4 5 6#[3,] 7 8 9
> > # apply(z, 1, sd)
> > sd(z[1,]) #1
> > sd(z[2,]) #1
> > sd(z[3,]) #1
> > # apply(z, 2, sd)
> > sd(z[,1]) #3
> > sd(z[,2]) #3
> > sd(z[,3]) #3
> >
> >
>


[R] computing standard deviation in R and in Python

2019-05-24 Thread Bogdan Tanasa

Dear all, please would you advise :

do python and R have different ways to compute the standard deviation (sd) ?

for example, in python, starting with :

a = np.array([[1,2,3],  [4,5,6], [7,8,9]])
print(a.std(axis=1)) ### per row : [0.81649658 0.81649658 0.81649658]
print(a.std(axis=0)) ### per column : [2.44948974 2.44948974 2.44948974]

# and in R :



z <- matrix(c(1,2,3,4,5,6,7,8,9), nrow=3, ncol=3, byrow=TRUE)
# z
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
# [3,]    7    8    9
# apply(z, 1, sd)
sd(z[1,]) #1
sd(z[2,]) #1
sd(z[3,]) #1
# apply(z, 2, sd)
sd(z[,1]) #3
sd(z[,2]) #3
sd(z[,3]) #3
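
The numbers differ by a constant factor: R's sd() divides the sum of squared deviations by n - 1, while the NumPy output above matches division by n. A minimal sketch; `sd_pop` is a helper name introduced here, not a base R function:

```r
z <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)

# Standard deviation with denominator n instead of n - 1.
sd_pop <- function(x) {
  n <- length(x)
  sd(x) * sqrt((n - 1) / n)
}

apply(z, 1, sd)      # 1 1 1
apply(z, 1, sd_pop)  # 0.8164966 for every row
apply(z, 2, sd)      # 3 3 3
apply(z, 2, sd_pop)  # 2.44949 for every column
```

On the Python side, `a.std(axis=1, ddof=1)` gives the n - 1 version, which agrees with R's sd().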


Re: [R] to run an older version of R on my machine

2019-05-24 Thread Bogdan Tanasa

thanks a lot, Alfredo ! did not know about Rswitch - very helpful !

On Thu, May 23, 2019 at 11:39 PM Alfredo Cortell <
alfredo.cortell.nico...@gmail.com> wrote:

> Hi Bogdan,
>
> The way I do this is the following: I have different R versions installed,
> and then I downloaded and use Rswitch to change between versions. You just
> open it, select the version you want, and R will open in that version
> directly. It works with R and Rstudio in a MacOS HighSierra, although I
> have read that it doesn't work in every platform. I don't know about
> ubuntu. I would advise you to try it anyhow. Good luck with that!
>
> All the best,
>
> Alfredo
>
> El jue., 23 may. 2019 a las 21:38, Bogdan Tanasa ()
> escribió:
>
>> Dear all,
>>
>> if you could help me please with a solution to a simple question :
>>
>> i believe that my ubuntu machine automatically installed R 3.6.0 : when i
>> type : > R. it says :
>>
>> R version 3.6.0 (2019-04-26) -- "Planting of a Tree"
>> Copyright (C) 2019 The R Foundation for Statistical Computing
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> However, I need to use a previous version of R, namely R 3.5, that was
>> installed and did run on my Ubuntu machine, and I can see lots of folders
>> in the directory (a long list follows below) :
>>
>> /home/bogdan/R/x86_64-pc-linux-gnu-library/3.5
>>
>> Please would you advise how can I revert to R 3.5 instead of using R 3.6 .
>> Thanks a lot,
>>
>> bogdan
>>

[R] to run an older version of R on my machine

2019-05-23 Thread Bogdan Tanasa

Dear all,

could you please help me with a solution to a simple question :

I believe that my Ubuntu machine automatically installed R 3.6.0 : when I
type : > R. it says :

R version 3.6.0 (2019-04-26) -- "Planting of a Tree"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

However, I need to use a previous version of R, namely R 3.5, that was
installed and did run on my Ubuntu machine, and I can see lots of folders
in the directory (a long list follows below) :

/home/bogdan/R/x86_64-pc-linux-gnu-library/3.5

Please would you advise how I can revert to R 3.5 instead of using R 3.6.
Thanks a lot,

bogdan

ps : the list of folders in ~/R/x86_64-pc-linux-gnu-library/3.5

abind/
acepack/
ALL/
alphahull/
amap/
annotate/
AnnotationDbi/
AnnotationFilter/
AnnotationForge/
apcluster/
ape/
aroma.light/
askpass/
assertthat/
backports/
base64enc/
bbmle/
beachmat/
beeswarm/
BH/
bibtex/
bindr/
bindrcpp/
Biobase/
BiocFileCache/
BiocGenerics/
biocGraph/
BiocInstaller/
BiocManager/
BiocNeighbors/
BiocParallel/
BiocStyle/
BiocVersion/
biocViews/
biomaRt/
Biostrings/
biovizBase/
bit/
bit64/
bitops/
bladderbatch/
blob/
bookdown/
brew/
broom/
BSgenome/
Cairo/
callr/
car/
carData/
Category/
caTools/
CCA/
CCP/
cellranger/
cellrangerRkit/
checkmate/
circlize/
cli/
clipr/
clisymbols/
coda/
colorspace/
combinat/
ComplexHeatmap/
contfrac/
corpcor/
corrplot/
cowplot/
crayon/
crosstalk/
cubature/
curl/
cvTools/
data.table/
DBI/
dbplyr/
DDRTree/
DelayedArray/
DelayedMatrixStats/
deldir/
densityClust/
DEoptimR/
desc/
DESeq/
DESeq2/
deSolve/
destiny/
devtools/
dichromat/
digest/
diptest/
distillery/
doBy/
docopt/
doParallel/
doRNG/
doSNOW/
dotCall64/
dplyr/
DropletUtils/
dtw/
dynamicTreeCut/
e1071/
EDASeq/
edgeR/
ellipse/
ellipsis/
elliptic/
EnrichmentBrowser/
enrichR/
EnsDb.Hsapiens.v86/
EnsDb.Mmusculus.v79/
ensembldb/
evaluate/
extRemes/
fansi/
fastcluster/
fastICA/
fda/
fields/
fitdistrplus/
fit.models/
flexmix/
FNN/
forcats/
foreach/
formatR/
Formula/
fpc/
fs/
futile.logger/
futile.options/
gage/
gbRd/
gdata/
genefilter/
geneplotter/
generics/
GenomeInfoDb/
GenomeInfoDbData/
GenomicAlignments/
GenomicFeatures/
GenomicRanges/
GEOquery/
GetoptLong/
GGally/
ggbeeswarm/
ggbio/
ggdendro/
ggfortify/
ggplot2/
ggrepel/
ggridges/
ggthemes/
gh/
git2r/
githubinstall/
Glimma/
GlobalOptions/
glue/
gmodels/
GO.db/
goftest/
googleVis/
GOplot/
GOstats/
gplots/
graph/
graphite/
gridExtra/
GSA/
GSEABase/
gtable/
gtools/
hash/
haven/
HDF5Array/
hdf5r/
hexbin/
highr/
Hmisc/
hms/
HSMMSingleCell/
htmlTable/
htmltools/
htmlwidgets/
httpuv/
httr/
hwriter/
hypergeo/
ica/
igraph/
impute/
ini/
inline/
IRanges/
irlba/
iterators/
jsonlite/
kBET/
KEGGgraph/
KEGGREST/
kernlab/
knitr/
labeling/
laeken/
lambda.r/
lars/
later/
latticeExtra/
lazyeval/
limma/
Linnorm/
lle/
lme4/
Lmoments/
lmtest/
locfit/
loo/
lsei/
lubridate/
M3Drop/
magrittr/
maps/
maptools/
markdown/
MAST/
Matrix/
MatrixModels/
matrixStats/
mclust/
MCMCglmm/
memoise/
metap/
mime/
minqa/
mixtools/
mnormt/
mockery/
modelr/
modeltools/
moments/
monocle/
munsell/
Mus.musculus/
mvoutlier/
mvtnorm/
NADA/
nloptr/
npsurv/
numDeriv/
openssl/
openxlsx/
OrganismDbi/
org.Hs.eg.db/
org.Mm.eg.db/
orthopolynom/
ouija/
packrat/
pathview/
pbapply/
pbkrtest/
pcaMethods/
pcaPP/
pcaReduce/
penalized/
permute/
PFAM.db/
pheatmap/
pillar/
pkgbuild/
pkgconfig/
pkgload/
pkgmaker/
PKI/
plogr/
plotly/
pls/
plyr/
plyranges/
png/
polyclip/
polynom/
prabclus/
praise/
preprocessCore/
prettyunits/
processx/
progress/
promises/
ProtGenerics/
proxy/
pryr/
ps/
purrr/
qlcMatrix/
quantreg/
R6/
randomForest/
ranger/
RANN/
rappdirs/
RBGL/
rcmdcheck/
RColorBrewer/
Rcpp/
RcppAnnoy/
RcppArmadillo/
RcppEigen/
RcppProgress/
RCurl/
Rdpack/
readr/
readxl/
refGenome/
registry/
reldist/
rematch/
remotes/
ReportingTools/
reprex/
reshape/
reshape2/
reticulate/
Rgraphviz/
rhdf5/
Rhdf5lib/
rio/
rJava/
rjson/
RJSONIO/
rlang/
RMariaDB/
rmarkdown/
R.methodsS3/
Rmisc/
RMTstat/
rngtools/
robCompositions/
robust/
robustbase/
ROCR/
R.oo/
Rook/
rprojroot/
rrcov/
Rsamtools/
rsconnect/
RSQLite/
rstan/
rstudioapi/
rsvd/
rtracklayer/
Rtsne/
RUnit/
R.utils/
RUVSeq/
rvest/
S4Vectors/
safe/
SC3/
scales/
scater/
scatterplot3d/
scde/
scfind/
scImpute/
scmap/
SCnorm/
scran/
scRNAseq/
scRNA.seq.funcs/
SDMTools/
segmented/
selectr/
sessioninfo/
Seurat/
sgeostat/
shape/
shiny/
ShortRead/
SingleCellExperiment/
slam/
SLICER/
smoother/
snow/
snowfall/
sourcetools/
sp/
spam/
SparseM/
sparsesvd/
spatstat/
spatstat.data/
spatstat.utils/
SPIA/
splancs/
sROC/
StanHeaders/
statmod/
stringi/
stringr/
SummarizedExperiment/
sva/
sys/
tensor/
tensorA/
testthat/
tibble/
tidyr/
tidyselect/
tidyverse/
tinytex/
topGO/
trimcluster/
tripack/
truncnorm/
TSCAN/
tsne/
TTR/
TxDb.Mmusculus.UCSC.mm10.ensGene/
TxDb.Mmusculus.UCSC.mm10.knownGene/
UpSetR/
usethis/
utf8/
VariantAnnotation/
vcd/
vegan/
VennDiagram/
Vennerable/
venneuler/
VGAM/
VIM/
vipor/
viridis/
viridisLite/
WGCNA/
whisker/
withr/
WriteXLS/
xfun/
XML/
xml2/
xopen/
xtable/

[R] display of ECDF

2018-12-17 Thread Bogdan Tanasa

Dear all,

please could you advise me on the following : I would like to display a few
CDF data (the R code is below), using a set of numerical BREAKS on the X
axis that are shown at EQUAL DISTANCE from each other (although numerically
the BREAKS are on a log10 axis and are not equally spaced):

df <- data.frame(
  x = c(rnorm(100, 0, 3), rnorm(100, 0, 10)),
  g = gl(2, 100)
)

breaks=c(0.001, 0.01, 0.1, 1, 5, 10, 20, 30, 100)

ggplot(df, aes(x, colour = g)) + stat_ecdf() +
  scale_x_log10(breaks=breaks)

how shall I do it ? thanks a lot !
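
One possible approach, sketched below under stated assumptions: map each value to a position on an axis where the BREAKS sit at 1, 2, ..., 9, interpolating linearly in log10 between neighbouring breaks, then plot the ECDF of the positions and label the axis with the original break values. The helper name `to_equal_spacing` is introduced here, the data are simulated positive values (a log axis cannot show zero or negative numbers), and values outside the range of the breaks are clamped to the ends.

```r
breaks <- c(0.001, 0.01, 0.1, 1, 5, 10, 20, 30, 100)

# Piecewise-linear map (in log10) that sends breaks[i] to position i.
to_equal_spacing <- function(x, breaks) {
  approx(x = log10(breaks), y = seq_along(breaks), xout = log10(x), rule = 2)$y
}

set.seed(1)
df <- data.frame(
  x = c(rlnorm(100, 0, 1), rlnorm(100, 1, 2)),
  g = gl(2, 100)
)
df$pos <- to_equal_spacing(df$x, breaks)

# The breaks themselves land on 1, 2, ..., 9.
to_equal_spacing(breaks, breaks)

# Base graphics version: ECDF per group on the transformed axis, labelled with the breaks.
if (interactive()) {
  plot(ecdf(df$pos[df$g == 1]), xlim = c(1, length(breaks)), xaxt = "n",
       main = "ECDF with equally spaced breaks", xlab = "x")
  lines(ecdf(df$pos[df$g == 2]), col = "red")
  axis(1, at = seq_along(breaks), labels = breaks)
}
```

With ggplot2 the same idea would be `aes(pos, colour = g)` with `stat_ecdf()` plus `scale_x_continuous(breaks = seq_along(breaks), labels = breaks)`.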

-- bogdan


Re: [R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW

2018-11-01 Thread Bogdan Tanasa

very helpful, thanks a lot !

On Thu, Nov 1, 2018 at 9:59 PM William Michels 
wrote:

> Perhaps one of the following two methods:
>
> > zgene = data.frame(  TTT=c(0,1,0,0),
> +TTA=c(0,1,1,0),
> + ATA=c(1,0,0,0),
> +  ATT=c(0,0,0,0),
> + row.names=c("gene1", "gene2", "gene3", "gene4"))
> > zgene
>   TTT TTA ATA ATT
> gene1   0   0   1   0
> gene2   1   1   0   0
> gene3   0   1   0   0
> gene4   0   0   0   0
> >
> > zgene[ , zgene[2,1:4] > 0]
>   TTT TTA
> gene1   0   0
> gene2   1   1
> gene3   0   1
> gene4   0   0
> >
> > zgene[ , zgene[rownames(zgene) == "gene2",1:4] > 0]
>   TTT TTA
> gene1   0   0
> gene2   1   1
> gene3   0   1
> gene4   0   0
> >
>
> Best Regards,
>
> Bill.
>
> William Michels, Ph.D.
>
>
>
> On Thu, Nov 1, 2018 at 9:07 PM, Bogdan Tanasa  wrote:
> > Dear Bill, and Bill,
> >
> > many thanks for taking the time to advice, and for your suggestions. I
> > believe that I shall rephrase a bit my question, with a better example :
> > thank you again in advance for your help.
> >
> > Let's assume that we start from a data frame :
> >
> > x = data.frame(  TTT=c(0,1,0,0),
> >TTA=c(0,1,1,0),
> > ATA=c(1,0,0,0),
> >  ATT=c(0,0,0,0),
> > row.names=c("gene1", "gene2", "gene3", "gene4"))
> >
> > Shall we select "gene2", at the end, we would like to have ONLY the
> COLUMNS,
> > where "gene2" is NOT-ZERO. In other words, the output contains only the
> > first 2 columns :
> >
> > output = data.frame(  TTT=c(0,1,0,0),
> >TTA=c(0,1,1,0),
> >row.names=c("gene1", "gene2", "gene3",
> > "gene4"))
> >
> >  with much appreciation,
> >
> > -- bogdan
> >
> > On Thu, Nov 1, 2018 at 6:34 PM William Michels 
> > wrote:
> >>
> >> Hi Bogdan,
> >>
> >> Are you saying you want to drop columns that sum to zero? If so, I'm
> >> not sure you've given us a good example dataframe, since all your
> >> numeric columns give non-zero sums.
> >>
> >> Otherwise, what you're asking for is trivial. Below is an example
> >> dataframe ("ygene") with an example "AGA" column that gets dropped:
> >>
> >> > xgene <- data.frame(TTT=c(0,1,0,0),
> >> +TTA=c(0,1,1,0),
> >> +ATA=c(1,0,0,0),
> >> +gene=c("gene1", "gene2", "gene3", "gene4"))
> >> >
> >> > xgene[ , colSums(xgene[,1:3]) > 0 ]
> >>   TTT TTA ATA  gene
> >> 1   0   0   1 gene1
> >> 2   1   1   0 gene2
> >> 3   0   1   0 gene3
> >> 4   0   0   0 gene4
> >> >
> >> > ygene <- data.frame(TTT=c(0,1,0,0),
> >> + TTA=c(0,1,1,0),
> >> + AGA=c(0,0,0,0),
> >> + gene=c("gene1", "gene2", "gene3", "gene4"))
> >> >
> >> > ygene[ , colSums(ygene[,1:3]) > 0 ]
> >>   TTT TTA  gene
> >> 1   0   0 gene1
> >> 2   1   1 gene2
> >> 3   0   1 gene3
> >> 4   0   0 gene4
> >>
> >>
> >> HTH,
> >>
> >> Bill.
> >>
> >> William Michels, Ph.D.
> >>
> >>
> >> On Thu, Nov 1, 2018 at 5:45 PM, Bogdan Tanasa  wrote:
> >> > Dear all, please may I ask for a suggestion :
> >> >
> >> > considering a dataframe  that contains the numerical values for gene
> >> > expression, for example :
> >> >
> >> >  x = data.frame(TTT=c(0,1,0,0),
> >> >TTA=c(0,1,1,0),
> >> >ATA=c(1,0,0,0),
> >> >gene=c("gene1", "gene2", "gene3", "gene4"))
> >> >
> >> > how could I select only the COLUMNS where the value of a GENE (a ROW)
> is
> >> > non-zero ?
> >> >
> >> > thank you !
> >> >
> >> > -- bogdan
> >> >
>


Re: [R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW

2018-11-01 Thread Bogdan Tanasa

Dear Bill, and Bill,

many thanks for taking the time to advise, and for your suggestions. I
believe that I should rephrase my question a bit, with a better example :
thank you again in advance for your help.

Let's assume that we start from a data frame :

x = data.frame(TTT=c(0,1,0,0),
               TTA=c(0,1,1,0),
               ATA=c(1,0,0,0),
               ATT=c(0,0,0,0),
               row.names=c("gene1", "gene2", "gene3", "gene4"))

If we select "gene2", at the end we would like to have ONLY the
COLUMNS where "gene2" is NON-ZERO. In other words, the output contains
only the first 2 columns :

output = data.frame(TTT=c(0,1,0,0),
                    TTA=c(0,1,1,0),
                    row.names=c("gene1", "gene2", "gene3", "gene4"))

 with much appreciation,

-- bogdan
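
The selection described above can be sketched with logical indexing on the columns; `drop = FALSE` is added here so that the result stays a data frame even when only one column matches:

```r
x <- data.frame(TTT = c(0, 1, 0, 0),
                TTA = c(0, 1, 1, 0),
                ATA = c(1, 0, 0, 0),
                ATT = c(0, 0, 0, 0),
                row.names = c("gene1", "gene2", "gene3", "gene4"))

# Logical vector over the columns: TRUE where the chosen row is non-zero.
keep <- unlist(x["gene2", ]) != 0

# Keep those columns for all rows.
output <- x[, keep, drop = FALSE]
output
```

Selecting the row by name rather than by position keeps the code valid if the rows are reordered.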

On Thu, Nov 1, 2018 at 6:34 PM William Michels 
wrote:

> Hi Bogdan,
>
> Are you saying you want to drop columns that sum to zero? If so, I'm
> not sure you've given us a good example dataframe, since all your
> numeric columns give non-zero sums.
>
> Otherwise, what you're asking for is trivial. Below is an example
> dataframe ("ygene") with an example "AGA" column that gets dropped:
>
> > xgene <- data.frame(TTT=c(0,1,0,0),
> +TTA=c(0,1,1,0),
> +ATA=c(1,0,0,0),
> +gene=c("gene1", "gene2", "gene3", "gene4"))
> >
> > xgene[ , colSums(xgene[,1:3]) > 0 ]
>   TTT TTA ATA  gene
> 1   0   0   1 gene1
> 2   1   1   0 gene2
> 3   0   1   0 gene3
> 4   0   0   0 gene4
> >
> > ygene <- data.frame(TTT=c(0,1,0,0),
> + TTA=c(0,1,1,0),
> + AGA=c(0,0,0,0),
> + gene=c("gene1", "gene2", "gene3", "gene4"))
> >
> > ygene[ , colSums(ygene[,1:3]) > 0 ]
>   TTT TTA  gene
> 1   0   0 gene1
> 2   1   1 gene2
> 3   0   1 gene3
> 4   0   0 gene4
>
>
> HTH,
>
> Bill.
>
> William Michels, Ph.D.
>
>
> On Thu, Nov 1, 2018 at 5:45 PM, Bogdan Tanasa  wrote:
> > Dear all, please may I ask for a suggestion :
> >
> > considering a dataframe  that contains the numerical values for gene
> > expression, for example :
> >
> >  x = data.frame(TTT=c(0,1,0,0),
> >TTA=c(0,1,1,0),
> >ATA=c(1,0,0,0),
> >gene=c("gene1", "gene2", "gene3", "gene4"))
> >
> > how could I select only the COLUMNS where the value of a GENE (a ROW) is
> > non-zero ?
> >
> > thank you !
> >
> > -- bogdan
> >
>


[R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW

2018-11-01 Thread Bogdan Tanasa

Dear all, please may I ask for a suggestion :

considering a dataframe  that contains the numerical values for gene
expression, for example :

 x = data.frame(TTT=c(0,1,0,0),
   TTA=c(0,1,1,0),
   ATA=c(1,0,0,0),
   gene=c("gene1", "gene2", "gene3", "gene4"))

how could I select only the COLUMNS where the value of a GENE (a ROW) is
non-zero ?

thank you !

-- bogdan


[R] about the series of numbers

2018-09-20 Thread Bogdan Tanasa

Dear all,

if I may please ask a question that is likely very naive :

if I write in R > "1:9", it will generate "1 2 3 4 5 6 7 8 9"

if I write > "0.1:0.9", why does it generate only 0.1 ?

thank you !
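
The `:` operator starts at the left-hand value and goes up in steps of 1 while staying within the right-hand value, so `0.1:0.9` stops after 0.1 because the next value, 1.1, would exceed 0.9. For a different step, seq() takes `by` or `length.out`:

```r
1:9                             # 1 2 3 4 5 6 7 8 9
0.1:0.9                         # 0.1 only, since 1.1 would exceed 0.9
0.1:3                           # 0.1 1.1 2.1

seq(0.1, 0.9, by = 0.1)         # 0.1 0.2 ... 0.9
seq(0.1, 0.9, length.out = 9)   # same values, given the number of points
(1:9) / 10                      # same values from an integer sequence
```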

-- bogdan


Re: [R] installing R 3.5.1

2018-08-25 Thread Bogdan Tanasa

Dear Berwin, thank you very much. I guess that I shall update my Ubuntu OS;

after "sudo apt-get -f install", I am getting the same message :

The following packages have unmet dependencies:
 r-recommended : Depends: r-cran-kernsmooth (>= 2.2.14) but it is not going to be installed
                 Depends: r-cran-mass but it is not going to be installed
                 Depends: r-cran-class but it is not going to be installed
                 Depends: r-cran-nnet but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

On Sat, Aug 25, 2018 at 8:02 AM, Berwin A Turlach 
wrote:

> Dear Bogdan,
>
> On Sat, 25 Aug 2018 07:24:40 -0700
> Bogdan Tanasa  wrote:
>
> > installed E: Unable to correct problems, you have held broken
> > packages.
>
> Perhaps this is the problem, did you try "apt-get -f install"?
>
> Cheers,
>
> Berwin
>


[R] installing R 3.5.1

2018-08-25 Thread Bogdan Tanasa

Dear Benoit, many thanks for your suggestions. Have a good weekend !

Message: 26
Date: Sat, 25 Aug 2018 11:13:49 +0200
From: Benoit Vaillant 
To: Bogdan Tanasa 
Cc: r-help 
Subject: Re: [R] installing R 3.5.1
Message-ID: <20180825091348.7tidm7fvhiudr...@auroras.fr>
Content-Type: text/plain; charset="iso-8859-15"

Hello Bogdan,

This reply is off topic for the list, apologies. This problem is more
r-sig-debian related (see below).

Though Berwin already mentioned a possible solution, here is another.

On Fri, Aug 24, 2018 at 06:28:59PM -0700, Bogdan Tanasa wrote:
> I am trying to install R 3.5.1 on my Ubuntu 14.04 system;

You are trying to install R (latest version) on a system that is
outdated by the latest LTS (16.04) and more than four years old
now. ;-)

If you go to: https://cloud.r-project.org/bin/linux/ubuntu/

You'll get some hints, like:
"R 3.5 packages for Ubuntu on i386 and amd64 are available for most
stable Desktop releases of Ubuntu until their official end of life
date. However, only the latest Long Term Support (LTS) release is
fully supported."

Note the *only the latest LTS* ;-)

You'll also get the r-sig-debian list link to report issues.

> Would you please advise, what shall I do next ? Thanks a lot !

If you have the time, upgrade your LTS by migrating your system from
14.04 to 16.04 and then 18.04 (Bionic Beaver).

Best regards,

-- 
Benoît


Re: [R] installing R 3.5.1

2018-08-25 Thread Bogdan Tanasa

Dear Berwin, thank you for your help.

On my system, after "sudo apt-get install r-base r-recommended", it says :
[..]
The following packages have unmet dependencies:
 r-recommended : Depends: r-cran-kernsmooth (>= 2.2.14) but it is not going to be installed
                 Depends: r-cran-mass but it is not going to be installed
                 Depends: r-cran-class but it is not going to be installed
                 Depends: r-cran-nnet but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

Although these packages seem to be well installed ..

On Sat, Aug 25, 2018 at 12:51 AM Berwin A Turlach 
wrote:

> G'day Bogdan,
>
> On Fri, 24 Aug 2018 18:28:59 -0700
> Bogdan Tanasa  wrote:
>
> > I am trying to install R 3.5.1 on my Ubuntu 14.04 system; however, I
> > am getting the following message :
> >
> > sudo apt-get install r-base
> > [...]
> > The following packages have unmet dependencies:
> >  r-base : Depends: r-recommended (= 3.5.1-1trusty) but it is not going to be installed
> > E: Unable to correct problems, you have held broken packages.
>
> For me such problems are usually fixed by specifying the package that
> "is not going to be installed" but on which the package I want to
> install depends also to apt-get install.
>
> What does
> sudo apt-get install r-base r-recommended
> do on your system?
>
> Cheers,
>
> Berwin
>


[R] installing R 3.5.1

2018-08-24 Thread Bogdan Tanasa

Dear all,

I am trying to install R 3.5.1 on my Ubuntu 14.04 system; however, I am
getting the following message :

sudo apt-get install r-base
[...]
The following packages have unmet dependencies:
 r-base : Depends: r-recommended (= 3.5.1-1trusty) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

In the file /etc/apt/sources.list , I have set up :

deb https://cloud.r-project.org/bin/linux/ubuntu trusty/
deb https://cloud.r-project.org/bin/linux/ubuntu trusty-cran35/

Would you please advise, what shall I do next ? Thanks a lot !

-- bogdan


Re: [R] a suggestion about the display of structural variants in R

2018-07-28 Thread Bogdan Tanasa

Thank you Jeff. Yes, certainly, I posted a message on BioC too, although I
have not received any suggestions by now.

On Sat, Jul 28, 2018 at 8:05 AM, Jeff Newmiller 
wrote:

> My suggestion is to pay attention to Boris and ask people who do this kind
> of plotting frequently... and they are typically found on the Bioconductor
> mailing list, not this list.
>
> On Sat, 28 Jul 2018, Bogdan Tanasa wrote:
>
> Dear Boris,
>>
>> good morning, and thank you for your message.  After thinking a bit more
>> yesterday, I believe that I could adapt the functionality of some R
>> packages that display the synteny regions across multiple species (here
>> please see an example Figure 1 from
>> http://www.g3journal.org/content/7/6/1775.figures-only),
>>
>> although I have not found yet a R package that does this display (in my
>> case, instead of distinct species, I will just show distinct chromosomes
>> connected by translocations). If you have any suggestions, please let me
>> know.
>>
>> thanks a lot,
>>
>> -- bogdan
>>
>>
>> On Sat, Jul 28, 2018 at 6:42 AM, Boris Steipe 
>> wrote:
>>
>> Maybe the Bioconductor package "intansv" can help you. You asked for
>>> linear chromosomes, but such data is commonly plotted in Circos plots as
>>> e.g. with the Bioconductor OmicCircos package (cf.
>>> https://bioconductor.org/packages/devel/bioc/vignettes/OmicCircos/inst/doc/OmicCircos_vignette.pdf)
>>>
>>> However the Bioconductor Project has its own support mailing list, R-Help
>>> is for programming help.
>>>
>>>
>>> B.
>>>
>>>
>>>
>>> On 2018-07-28, at 02:24, Bogdan Tanasa  wrote:
>>>>
>>>> Dear all,
>>>>
>>>> we wish you a fruitful and refreshing weekend ! Thought that I may also
>>>> write to ask you for a suggestion, specifically if you could please
>>>>
>>> advise
>>>
>>>> on whether there is any package already built (in R) that could help
>>>> with
>>>> the following data visualization :
>>>>
>>>>
>>>>we have a set of mutations from many cancer samples
>>>>
>>>>we would like to display the POINT MUTATIONS along the chromosome
>>>> coordinates (on the linear scale, ie. HORIZONTALLY)
>>>>
>>>>we would like to display the TRANSLOCATIONS (and GENE FUSIONS), as
>>>> VERTICAL LINES connecting the breakpoints that are located on the
>>>> chromosomes that are represented HORIZONTALLY
>>>>
>>>> Thanks a lot,
>>>>
>>>> -- bogdan
>>>>
> 
> ---
> Jeff NewmillerThe .   .  Go Live...
> DCN:Basics: ##.#.   ##.#.  Live
> Go...
>   Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
> 
> ---
>


Re: [R] a suggestion about the display of structural variants in R

2018-07-28 Thread Bogdan Tanasa

Dear Boris,

good morning, and thank you for your message.  After thinking a bit more
yesterday, I believe that I could adapt the functionality of some R
packages that display the synteny regions across multiple species (here
please see an example Figure 1 from
http://www.g3journal.org/content/7/6/1775.figures-only),

although I have not found yet a R package that does this display (in my
case, instead of distinct species, I will just show distinct chromosomes
connected by translocations). If you have any suggestions, please let me
know.

thanks a lot,

-- bogdan

On Sat, Jul 28, 2018 at 6:42 AM, Boris Steipe 
wrote:

> Maybe the Bioconductor package "intansv" can help you. You asked for
> linear chromosomes, but such data is commonly plotted in Circos plots as
> e.g. with the Bioconductor OmicCircos package (cf.
> https://bioconductor.org/packages/devel/bioc/vignettes/OmicCircos/inst/doc/OmicCircos_vignette.pdf)
>
> However the Bioconductor Project has its own support mailing list, R-Help
> is for programming help.
>
>
> B.
>
>
>
> > On 2018-07-28, at 02:24, Bogdan Tanasa  wrote:
> >
> > Dear all,
> >
> > we wish you a fruitful and refreshing weekend ! Thought that I may also
> > write to ask you for a suggestion, specifically if you could please
> advise
> > on whether there is any package already built (in R) that could help with
> > the following data visualization :
> >
> >
> >we have a set of mutations from many cancer samples
> >
> >we would like to display the POINT MUTATIONS along the chromosome
> > coordinates (on the linear scale, ie. HORIZONTALLY)
> >
> >we would like to display the TRANSLOCATIONS (and GENE FUSIONS), as
> > VERTICAL LINES connecting the breakpoints that are located on the
> > chromosomes that are represented HORIZONTALLY
> >
> > Thanks a lot,
> >
> > -- bogdan
> >
>
>


[R] a suggestion about the display of structural variants in R

2018-07-28 Thread Bogdan Tanasa

Dear all,

we wish you a fruitful and refreshing weekend ! Thought that I may also
write to ask you for a suggestion, specifically if you could please advise
on whether there is any package already built (in R) that could help with
the following data visualization :


we have a set of mutations from many cancer samples

we would like to display the POINT MUTATIONS along the chromosome
coordinates (on the linear scale, ie. HORIZONTALLY)

we would like to display the TRANSLOCATIONS (and GENE FUSIONS), as
VERTICAL LINES connecting the breakpoints that are located on the
chromosomes that are represented HORIZONTALLY

Thanks a lot,

-- bogdan
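
In case no ready-made package fits, the layout described above can be roughed out with base graphics. This is only a minimal sketch: the chromosome lengths, mutation positions and translocation breakpoints below are hypothetical values made up for illustration, and all object names (chrom_len, y_pos, point_mut, transloc) are assumptions rather than anything from an existing package.

```r
# Hypothetical chromosome lengths and events, for illustration only
chrom_len <- c(chr1 = 1000, chr2 = 800, chr3 = 600)
y_pos <- setNames(rev(seq_along(chrom_len)), names(chrom_len))  # chr1 on top

point_mut <- data.frame(chr = c("chr1", "chr1", "chr2", "chr3"),
                        pos = c(120, 640, 300, 450),
                        stringsAsFactors = FALSE)
transloc <- data.frame(chrA = c("chr1", "chr2"), posA = c(340, 200),
                       chrB = c("chr2", "chr3"), posB = c(20, 500),
                       stringsAsFactors = FALSE)

# Empty canvas with one horizontal track per chromosome
plot(NULL, xlim = c(0, max(chrom_len)), ylim = c(0.5, length(chrom_len) + 0.5),
     xlab = "position", ylab = "", yaxt = "n")
axis(2, at = y_pos, labels = names(y_pos), las = 1)
segments(0, y_pos, chrom_len, y_pos, lwd = 4, col = "grey60")

# Point mutations as dots on their chromosome
points(point_mut$pos, y_pos[point_mut$chr], pch = 19, col = "red")

# Translocations as lines joining the two breakpoints
segments(transloc$posA, y_pos[transloc$chrA],
         transloc$posB, y_pos[transloc$chrB], col = "blue")
```

The connecting lines are vertical only when both breakpoints share the same coordinate; otherwise they run diagonally between the two chromosome tracks.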


Re: [R] initiate elements in a dataframe with lists

2018-07-25 Thread Bogdan Tanasa

Dear Jeff, it is a precious help and a fabulous suggestion. I will slowly
go over the R code that you have sent. Thanks a lot !

On Wed, Jul 25, 2018 at 10:43 AM, Jeff Newmiller 
wrote:

> The code below reeks of a misconception that lists are efficient to add
> items to, which is a confusion with the computer science term "linked
> list".  In R, a list is NOT a linked list... it is a vector, which means
> the memory used by the list is allocated at the time it is created, and
> REALLOCATED when a new item is added. The only reason you should use a list
> is because you expect to put values of different types or shapes into it,
> which does not appear to apply in this use case.
>
> In R, you should make a valiant effort to create things right the first
> time, and if that doesn't work then preallocate the space you will need in
> the vectors you are working with. Since you have a need to store a variable
> number of elements in each intersectX element, the column needs to be a
> list but the elements of that list can perfectly well be character vectors.
>
> x <- data.frame( TYPE=c("DEL", "DEL", "DUP", "TRA", "INV", "TRA")
>                , CHRA=c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2")
>                , POSA=c(10, 15, 120, 340, 100, 220)
>                , CHRB=c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1")
>                , POSB=c(30, 100, 300, 20, 200, 320)
>                , stringsAsFactors = FALSE
>                )
> compareRng <- function( chr1, pos1, chr2, pos2, delta ) {
>   ( chr1 == chr2
>   & ( pos2 - delta ) < pos1
>   & pos1 < ( pos2 + delta )
>   )
> }
> makeIntersectX <- function( n, chrlabel, poslabel, delta ) {
>   lgclidx <- rep( TRUE, nrow( x ) )
>   lgclidx[ n ] <- FALSE
>   x[[ chrlabel ]][ compareRng( x[[ chrlabel ]][ n ]
>                              , x[[ poslabel ]][ n ]
>                              , x[[ chrlabel ]]
>                              , x[[ poslabel ]]
>                              , delta
>                              )
>                  & lgclidx
>                  ]
> }
>
> x$intersectA <- lapply( seq.int( nrow( x ) )
>                       , makeIntersectX
>                       , chrlabel = "CHRA"
>                       , poslabel = "POSA"
>                       , delta = 10L
>                       )
> x$intersectB <- lapply( seq.int( nrow( x ) )
>                       , makeIntersectX
>                       , chrlabel = "CHRB"
>                       , poslabel = "POSB"
>                       , delta = 21L
>                       )
>
> > x
>   TYPE CHRA POSA CHRB POSB intersectA intersectB
> 1  DEL chr1   10 chr1   30       chr1
> 2  DEL chr1   15 chr1  100       chr1
> 3  DUP chr1  120 chr1  300                  chr1
> 4  TRA chr1  340 chr2   20
> 5  INV chr2  100 chr2  200
> 6  TRA chr2  220 chr1  320                  chr1
>
> Note that depending on what you plan to do beyond this point, it might
> actually be more performant to use a data frame with repeated rows instead
> of list columns... but I cannot tell from what you have provided.
>
>
> On Wed, 25 Jul 2018, Bogdan Tanasa wrote:
>
> Dear Thierry and Juan, thank you for your help. Thank you all.
>>
>> Now, if I would like to add an element to the empty list, how shall I do :
>> for example, shall i = 2, and j = 1, in a bit of more complex R code :
>>
>> x <- data.frame(TYPE=c("DEL", "DEL", "DUP", "TRA", "INV", "TRA"),
>> CHRA=c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
>> POSA=c(10, 15, 120, 340, 100, 220),
>> CHRB=c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
>> POSB=c(30, 100, 300, 20, 200, 320))
>>
>> x$labA <- paste(x$CHRA, x$POSA, sep="_")
>> x$labB <- paste(x$CHRB, x$POSB, sep="_")
>>
>> x$POSA_left <- x$POSA - 10
>> x$POSA_right <- x$POSA + 10
>>
>> x$POSB_left <- x$POSB - 10
>> x$POSB_right <- x$POSB + 10
>>
>> x$intersectA <- rep(list(list()), nrow(x))
>> x$intersectB <- rep(list(list()), nrow(x))
>>
>> And we know that for i = 2, and j = 1, the condition is TRUE :
>>
>> i <- 2
>>
>> j <- 1
>>
>> if ( (x$CHRA[i] == x$CHRA[j] ) &a

Re: [R] initiate elements in a dataframe with lists

2018-07-25 Thread Bogdan Tanasa

Dear Thierry and Juan, thank you for your help. Thank you all.

Now, if I would like to add an element to the empty list, how shall I do it?
For example, let i = 2 and j = 1, in a slightly more complex piece of R code :

x <- data.frame(TYPE=c("DEL", "DEL", "DUP", "TRA", "INV", "TRA"),
CHRA=c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
POSA=c(10, 15, 120, 340, 100, 220),
CHRB=c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
POSB=c(30, 100, 300, 20, 200, 320))

x$labA <- paste(x$CHRA, x$POSA, sep="_")
x$labB <- paste(x$CHRB, x$POSB, sep="_")

x$POSA_left <- x$POSA - 10
x$POSA_right <- x$POSA + 10

x$POSB_left <- x$POSB - 10
x$POSB_right <- x$POSB + 10

x$intersectA <- rep(list(list()), nrow(x))
x$intersectB <- rep(list(list()), nrow(x))

And we know that for i = 2, and j = 1, the condition is TRUE :

i <- 2

j <- 1

if ( (x$CHRA[i] == x$CHRA[j] ) &&
 (x$POSA[i] > x$POSA_left[j] ) &&
 (x$POSA[i] < x$POSA_right[j] ) ){
   x$intersectA[i] <- c(x$intersectA[i], x$labA[j])}

the R code does not work. Thank you for your kind help !
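
A likely cause is the single bracket: x$intersectA[i] returns a one-element list (the slot itself), whereas x$intersectA[[i]] returns what is stored in the slot. The sketch below is one way to append to a slot, assuming the list column is initialised with vector("list", nrow(x)) as suggested by Thierry, and assuming character vectors are acceptable as slot contents.

```r
x <- data.frame(TYPE = c("DEL", "DEL", "DUP", "TRA", "INV", "TRA"),
                CHRA = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
                POSA = c(10, 15, 120, 340, 100, 220),
                CHRB = c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
                POSB = c(30, 100, 300, 20, 200, 320),
                stringsAsFactors = FALSE)

x$labA <- paste(x$CHRA, x$POSA, sep = "_")
x$POSA_left  <- x$POSA - 10
x$POSA_right <- x$POSA + 10

# One slot per row; every slot starts out as NULL
x$intersectA <- vector("list", nrow(x))

i <- 2
j <- 1

if (x$CHRA[i] == x$CHRA[j] &&
    x$POSA[i] > x$POSA_left[j] &&
    x$POSA[i] < x$POSA_right[j]) {
  # Double brackets read and write the contents of slot i
  x$intersectA[[i]] <- c(x$intersectA[[i]], x$labA[j])
}

x$intersectA[[i]]
```

As Jeff notes in his reply, growing slots one element at a time inside a loop is slow for large tables; a vectorised construction with lapply avoids the repeated reallocation.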

On Wed, Jul 25, 2018 at 12:26 AM, Thierry Onkelinx  wrote:

> Dear Bogdan,
>
> You are looking for x$intersectA <- vector("list", nrow(x))
>
> Best regards,
>
>
> ir. Thierry Onkelinx
> Statisticus / Statistician
>
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
> FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkel...@inbo.be
> Havenlaan 88 bus 73,
> 1000 Brussel
> www.inbo.be
>
> 
> ///
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
> 
> ///
>
> <https://www.inbo.be>
>
> 2018-07-25 8:55 GMT+02:00 Bogdan Tanasa :
>
>> Dear all,
>>
>> assuming that I do have a dataframe like :
>>
>> x <- data.frame(TYPE=c("DEL", "DEL", "DUP", "TRA", "INV", "TRA"),
>> CHRA=c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
>> POSA=c(10, 15, 120, 340, 100, 220),
>> CHRB=c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
>> POSB=c(30, 100, 300, 20, 200, 320)) ,
>>
>> how could I initiate another 2 columns in x, where each element in these 2
>> columns is going to be a list (the list could be updated later). Thank
>> you !
>>
>> Shall I do,
>>
>> for (i in 1:dim(x)[1]) { x$intersectA[i] <- list()}
>>
>> for (i in 1:dim(x)[1]) { x$intersectB[i] <- list()}
>>
>> nothing is happening. Thank you very much !
>>
>>
>
>


Re: [R] initiate elements in a dataframe with lists

2018-07-25 Thread Bogdan Tanasa

Dear Thierry and Juan, thank you for your help. Thank you very much.

Now, if I would like to add an element to the empty list, how shall I do it?
For example, let i = 2 and j = 1, in a slightly more complex piece of R code :

x <- data.frame(TYPE=c("DEL", "DEL", "DUP", "TRA", "INV", "TRA"),
CHRA=c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
POSA=c(10, 15, 120, 340, 100, 220),
CHRB=c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
POSB=c(30, 100, 300, 20, 200, 320))

x$labA <- paste(x$CHRA, x$POSA, sep="_")
x$labB <- paste(x$CHRB, x$POSB, sep="_")

x$POSA_left <- x$POSA - 10
x$POSA_right <- x$POSA + 10

x$POSB_left <- x$POSB - 10
x$POSB_right <- x$POSB + 10

x$intersectA <- rep(list(list()), nrow(x))
x$intersectB <- rep(list(list()), nrow(x))

And we know that for i = 2, and j = 1, the condition is TRUE :

i <- 2
j <- 1

if ( (x$CHRA[i] == x$CHRA[j] ) &&
 (x$POSA[i] > x$POSA_left[j] ) &&
 (x$POSA[i] < x$POSA_right[j] ) )
{
   x$intersectA[i] <- c(x$intersectA[i], x$labA[j])
}

the R code does not work. Thank you for your kind help !


>
> On Wed, Jul 25, 2018 at 12:26 AM, Thierry Onkelinx <
> thierry.onkel...@inbo.be> wrote:
>
>> Dear Bogdan,
>>
>> You are looking for x$intersectA <- vector("list", nrow(x))
>>
>> Best regards,
>>
>>
>> ir. Thierry Onkelinx
>> Statisticus / Statistician
>>
>> Vlaamse Overheid / Government of Flanders
>> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
>> AND FOREST
>> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
>> thierry.onkel...@inbo.be
>> Havenlaan 88 bus 73,
>> 1000 Brussel
>> www.inbo.be
>>
>> 
>> ///
>> To call in the statistician after the experiment is done may be no more
>> than asking him to perform a post-mortem examination: he may be able to say
>> what the experiment died of. ~ Sir Ronald Aylmer Fisher
>> The plural of anecdote is not data. ~ Roger Brinner
>> The combination of some data and an aching desire for an answer does not
>> ensure that a reasonable answer can be extracted from a given body of data.
>> ~ John Tukey
>> 
>> ///
>>
>> <https://www.inbo.be>
>>
>> 2018-07-25 8:55 GMT+02:00 Bogdan Tanasa :
>>
>>> Dear all,
>>>
>>> assuming that I do have a dataframe like :
>>>
>>> x <- data.frame(TYPE=c("DEL", "DEL", "DUP", "TRA", "INV", "TRA"),
>>> CHRA=c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
>>> POSA=c(10, 15, 120, 340, 100, 220),
>>> CHRB=c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
>>> POSB=c(30, 100, 300, 20, 200, 320)) ,
>>>
>>> how could I initiate another 2 columns in x, where each element in these
>>> 2
>>> columns is going to be a list (the list could be updated later). Thank
>>> you !
>>>
>>> Shall I do,
>>>
>>> for (i in 1:dim(x)[1]) { x$intersectA[i] <- list()}
>>>
>>> for (i in 1:dim(x)[1]) { x$intersectB[i] <- list()}
>>>
>>> nothing is happening. Thank you very much !
>>>
>>>
>>
>>
>


Re: [R] initiate elements in a dataframe with lists

2018-07-25 Thread Bogdan Tanasa

Thank you Juan.

On Wed, Jul 25, 2018 at 12:56 AM, Juan Telleria Ruiz de Aguirre <
jtelleria.rproj...@gmail.com> wrote:

> Check tidyverse's purrr package:
>
> https://github.com/rstudio/cheatsheets/raw/master/purrr.pdf
>
> In the second page of the cheatsheet there is info on how to create list
> columns within a data.frame :)
>


[R] initiate elements in a dataframe with lists

2018-07-25 Thread Bogdan Tanasa

Dear all,

assuming that I do have a dataframe like :

x <- data.frame(TYPE=c("DEL", "DEL", "DUP", "TRA", "INV", "TRA"),
CHRA=c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
POSA=c(10, 15, 120, 340, 100, 220),
CHRB=c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
POSB=c(30, 100, 300, 20, 200, 320)) ,

how could I initiate another 2 columns in x, where each element in these 2
columns is going to be a list (the list could be updated later). Thank you !

Shall I do,

for (i in 1:dim(x)[1]) { x$intersectA[i] <- list()}

for (i in 1:dim(x)[1]) { x$intersectB[i] <- list()}

nothing is happening. Thank you very much !
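
For comparison, a list column can be created in one step instead of row by row. The sketch below uses vector("list", n), the form given in Thierry's reply; every slot starts out as NULL and can be filled later.

```r
x <- data.frame(TYPE = c("DEL", "DEL", "DUP", "TRA", "INV", "TRA"),
                CHRA = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
                POSA = c(10, 15, 120, 340, 100, 220),
                CHRB = c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
                POSB = c(30, 100, 300, 20, 200, 320),
                stringsAsFactors = FALSE)

# A list with one (empty) slot per row becomes a list column
x$intersectA <- vector("list", nrow(x))
x$intersectB <- vector("list", nrow(x))

is.list(x$intersectA)
length(x$intersectA)
```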


Re: [R] coloring edges in IGRAPH

2018-07-24 Thread Bogdan Tanasa

Thank you all. Think I did solve it by using the code below :

plot(g_decompose[[1]], edge.color=edge_attr(g_decompose[[1]])$COLOR)

plot(g_decompose[[2]], edge.color=edge_attr(g_decompose[[2]])$COLOR)
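
A probable explanation of why this works: graph_from_data_frame stores the extra columns of el (TYPE, COLOR) as edge attributes, and each component returned by the decomposition keeps the attributes of its own edges. Passing el$COLOR instead hands every subgraph the colours in the order of the original table, so a three-edge component appears to pick up the first three entries. The sketch below assumes a recent igraph where decompose() is the current name of decompose.graph(); the names type_colors and parts are only for illustration.

```r
library(igraph)

el <- data.frame(Partner1 = c(1, 3, 4, 5, 6),
                 Partner2 = c(2, 2, 5, 7, 7),
                 TYPE = c("DEL", "DEL", "DUP", "TRA", "TRA"),
                 stringsAsFactors = FALSE)

# Lookup table from event type to colour
type_colors <- c(DEL = "red", DUP = "green", INS = "yellow",
                 INV = "brown", TRA = "blue")
el$COLOR <- unname(type_colors[el$TYPE])

g <- graph_from_data_frame(d = el, directed = TRUE)
parts <- decompose(g)

# Each component carries the COLOR attribute of its own edges
lapply(parts, function(p) E(p)$COLOR)

# Plot every component with its own colours
for (p in parts) plot(p, edge.color = E(p)$COLOR)
```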

On Tue, Jul 24, 2018 at 5:48 PM, Bogdan Tanasa  wrote:

> Dear all,
>
> I would appreciate a piece of advice please : I am aiming to color the
> edges in a graph, by using IGRAPH package.
>
> It works well for the big graph; however, when I decompose the graph into
> 2 subgraphs and color code those, the colors of the edges change
> (unexpectedly).
>
> more precisely, as an example -- we have a dataframe :
>
> el <- data.frame(Partner1=c(1, 3, 4, 5, 6), Partner2=c(2, 2, 5, 7, 7),
> TYPE=c("DEL", "DEL", "DUP", "TRA", "TRA"))
>
> el$COLOR[el$TYPE=="DEL"] <- "red"
>
> el$COLOR[el$TYPE=="DUP"] <- "green"
>
> el$COLOR[el$TYPE=="INS"] <- "yellow"
>
> el$COLOR[el$TYPE=="INV"] <- "brown"
>
> el$COLOR[el$TYPE=="TRA"] <- "blue"
>
> #> el
> #   Partner1 Partner2 TYPE COLOR
> # 1        1        2  DEL   red
> # 2        3        2  DEL   red
> # 3        4        5  DUP green
> # 4        5        7  TRA  blue
> # 5        6        7  TRA  blue
>
> g <- graph_from_data_frame(d = el, directed = TRUE)
>
> plot(g, edge.color=el$COLOR)
>
> ### here decomposing the graph into 2 SUBGRAPHS :
>
> g_decompose <- decompose.graph(g)
>
> plot(g_decompose[[1]], edge.color=el$COLOR) ## here the edges are red (that is fine)
>
> plot(g_decompose[[2]], edge.color=el$COLOR) ## here the edges shall be blue and green, not red and green .
>
> #
>
> many thanks !
>
> -- bogdan
>
>


[R] coloring edges in IGRAPH

2018-07-24 Thread Bogdan Tanasa

Dear all,

I would appreciate a piece of advice please : I am aiming to color the
edges in a graph, by using IGRAPH package.

It works well for the big graph; however, when I decompose the graph into 2
subgraphs and color code those, the colors of the edges change
(unexpectedly).

more precisely, as an example -- we have a dataframe :

el <- data.frame(Partner1=c(1, 3, 4, 5, 6), Partner2=c(2, 2, 5, 7, 7),
TYPE=c("DEL", "DEL", "DUP", "TRA", "TRA"))

el$COLOR[el$TYPE=="DEL"] <- "red"

el$COLOR[el$TYPE=="DUP"] <- "green"

el$COLOR[el$TYPE=="INS"] <- "yellow"

el$COLOR[el$TYPE=="INV"] <- "brown"

el$COLOR[el$TYPE=="TRA"] <- "blue"

#> el
#   Partner1 Partner2 TYPE COLOR
# 1        1        2  DEL   red
# 2        3        2  DEL   red
# 3        4        5  DUP green
# 4        5        7  TRA  blue
# 5        6        7  TRA  blue

g <- graph_from_data_frame(d = el, directed = TRUE)

plot(g, edge.color=el$COLOR)

### here decomposing the graph into 2 SUBGRAPHS :

g_decompose <- decompose.graph(g)

plot(g_decompose[[1]], edge.color=el$COLOR) ## here the edges are red (that is fine)

plot(g_decompose[[2]], edge.color=el$COLOR) ## here the edges shall be blue and green, not red and green .

#

many thanks !

-- bogdan


Re: [R] help with merging two dataframes function of "egrep"-like formulas

2018-07-18 Thread Bogdan Tanasa

it looks great, thank you very much Jeff for your time and kind help !

On Wed, Jul 18, 2018 at 7:51 PM, Jeff Newmiller 
wrote:

> The traditional (SQL) way to attack this problem is to make the data
> structure simpler so that faster comparisons can be utilized:
>
> 
> A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))
> B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))
>
> library(dplyr)
> #>
> #> Attaching package: 'dplyr'
> #> The following objects are masked from 'package:stats':
> #>
> #> filter, lag
> #> The following objects are masked from 'package:base':
> #>
> #> intersect, setdiff, setequal, union
> library(tidyr)
> Bx <- (   B
>   %>% mutate( z_B = as.character( z ) )
>   %>% rename( t_B = t )
>   %>% separate_rows( z, sep="::" )
>   )
> Bx
> #>     z t_B      z_B
> #> 1 a*b   1 a*b::x*y
> #> 2 x*y   1 a*b::x*y
> #> 3   c   2        c
> #> 4       3
> #> 5 g*h   4      g*h
> result <- (   A
>   %>% mutate( z = as.character( z ) )
>   %>% rename( t_A = t )
>   %>% inner_join( Bx, by="z" )
>   )
> result
> #>     z t_A t_B      z_B
> #> 1 a*b   1   1 a*b::x*y
>
> #' Created on 2018-07-18 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).
> 
>
> Note that this is preferable if you can avoid ever creating the complex
> data z in B, but Bx is much more flexible and less error prone than B.
> (Especially if you don't have to create B$z_B at all, but have some other
> unique identifier(s) for the groupings represented by each row in B.)
>
>
> On Wed, 18 Jul 2018, Bogdan Tanasa wrote:
>
> Thanks a lot ! It looks that I am getting the same results with :
>>
>> B %>% regex_left_join(A, by = c(z = 'z'))
>>
>> On Wed, Jul 18, 2018 at 3:57 PM, Riley Finn  wrote:
>>
>> please may I ask for a piece of advice regarding merging two dataframes :
>>>
>>>> A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))
>>>> B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))
>>>> function of the criteria :
>>>> if "the elements in the 1st column of A could be found among the
>>>> elements
>>>> of the 1st column of B" i.e.
>>>> for the example above, we shall combine in the results only the row with
>>>> "a*b" of A with the row with "a*b::x*y" of B.
>>>>
>>>
>>>
>>> This may be what you are looking for:
>>>
>>> library(fuzzyjoin)
>>>
>>> The inner join returns just the one row where the string matches.
>>> B %>%
>>>   regex_inner_join(A, by = c(z = 'z'))
>>>
>>> While the full join returns NA's where the string does not match.
>>> B %>%
>>>   regex_full_join(A, by = c(z = 'z'))
>>>
>>> On Wed, Jul 18, 2018 at 5:20 PM Bogdan Tanasa  wrote:
>>>
>>> Dear all,
>>>>
>>>> please may I ask for a piece of advice regarding merging two dataframes :
>>>>
>>>> A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))
>>>>
>>>> B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))
>>>>
>>>> function of the criteria :
>>>>
>>>> if "the elements in the 1st column of A could be found among the
>>>> elements
>>>> of the 1st column of B" i.e.
>>>>
>>>> for the example above, we shall combine in the results only the row with
>>>> "a*b" of A with the row with "a*b::x*y" of B.
>>>>
>>>> thank you,
>>>>
>>>> bogdan
>>>>
>>>>
>>>>
>>>
>>
>>
> 
> ---
> Jeff Newmiller
> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> ---
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] help with merging two dataframes function of "egrep"-like formulas

2018-07-18 Thread Bogdan Tanasa

Dear Riley,

thank you very much for your help and your solution. I also took some
inspiration from Stack Overflow and tried the sqldf library. It looks like
the query below works too.
Thanks a lot !

sqldf("select B.*, A.* from B left join A on instr(B.z,  A.z)")
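
For comparison, the same criterion ("the value of A$z occurs somewhere inside the value of B$z") can be sketched in base R without extra packages. This is only an illustration under assumptions: the names pairs, hit and result are introduced here, and grepl(..., fixed = TRUE) is used so that "a*b" is treated as a literal string rather than as a regular expression. Unlike the left join above, this sketch keeps only the matching rows (inner-join style).

```r
A <- data.frame(z = c("a*b", "c*d", "d*e", "e*f"), t = c(1, 2, 3, 4),
                stringsAsFactors = FALSE)
B <- data.frame(z = c("a*b::x*y", "c", "", "g*h"), t = c(1, 2, 3, 4),
                stringsAsFactors = FALSE)

# Enumerate every (row of A, row of B) combination.
pairs <- expand.grid(a = seq_len(nrow(A)), b = seq_len(nrow(B)))

# Keep a pair when A$z is a literal substring of B$z.
hit <- mapply(function(a, b) grepl(A$z[a], B$z[b], fixed = TRUE),
              pairs$a, pairs$b)

# Inner-join style result: only the matching pairs survive.
result <- data.frame(z_A = A$z[pairs$a[hit]], t_A = A$t[pairs$a[hit]],
                     z_B = B$z[pairs$b[hit]], t_B = B$t[pairs$b[hit]],
                     stringsAsFactors = FALSE)
print(result)
```

Comparing every pair of rows is quadratic in the table sizes, so this is meant for small tables such as the example above.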


On Wed, Jul 18, 2018 at 3:57 PM, Riley Finn  wrote:

> please may I ask for a piece of advise regarding merging two dataframes :
>> A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))
>> B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))
>> function of the criteria :
>> if "the elements in the 1st column of A could be found among the elements
>> of the 1st column of B" i.e.
>> for the example above, we shall combine in the results only the row with
>> "a*b" of A with the row with "a*b::x*y" of B.
>
>
> This may be what you are looking for:
>
> library(fuzzyjoin)
>
> The inner join returns just the one row where the string matches.
> B %>%
>   regex_inner_join(A, by = c(z = 'z'))
>
> While the full join returns NA's where the string does not match.
> B %>%
>   regex_full_join(A, by = c(z = 'z'))
>
> On Wed, Jul 18, 2018 at 5:20 PM Bogdan Tanasa  wrote:
>
>> Dear all,
>>
>> please may I ask for a piece of advise regarding merging two dataframes :
>>
>> A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))
>>
>> B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))
>>
>> function of the criteria :
>>
>> if "the elements in the 1st column of A could be found among the elements
>> of the 1st column of B" i.e.
>>
>> for the example above, we shall combine in the results only the row with
>> "a*b" of A with the row with "a*b::x*y" of B.
>>
>> thank you,
>>
>> bogdan
>>
>>
>


Re: [R] help with merging two dataframes function of "egrep"-like formulas

2018-07-18 Thread Bogdan Tanasa

Thanks a lot ! It looks like I am getting the same results with :

B %>% regex_left_join(A, by = c(z = 'z'))

On Wed, Jul 18, 2018 at 3:57 PM, Riley Finn  wrote:

> please may I ask for a piece of advise regarding merging two dataframes :
>> A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))
>> B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))
>> function of the criteria :
>> if "the elements in the 1st column of A could be found among the elements
>> of the 1st column of B" i.e.
>> for the example above, we shall combine in the results only the row with
>> "a*b" of A with the row with "a*b::x*y" of B.
>
>
> This may be what you are looking for:
>
> library(fuzzyjoin)
>
> The inner join returns just the one row where the string matches.
> B %>%
>   regex_inner_join(A, by = c(z = 'z'))
>
> While the full join returns NA's where the string does not match.
> B %>%
>   regex_full_join(A, by = c(z = 'z'))
>
> On Wed, Jul 18, 2018 at 5:20 PM Bogdan Tanasa  wrote:
>
>> Dear all,
>>
>> please may I ask for a piece of advise regarding merging two dataframes :
>>
>> A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))
>>
>> B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))
>>
>> function of the criteria :
>>
>> if "the elements in the 1st column of A could be found among the elements
>> of the 1st column of B" i.e.
>>
>> for the example above, we shall combine in the results only the row with
>> "a*b" of A with the row with "a*b::x*y" of B.
>>
>> thank you,
>>
>> bogdan
>>
>>
>


[R] help with merging two dataframes function of "egrep"-like formulas

2018-07-18 Thread Bogdan Tanasa

Dear all,

please may I ask for a piece of advice regarding merging two dataframes :

A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))

B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))

based on the following criterion :

a row of A is combined with a row of B if the element in the 1st column of A
can be found within the element in the 1st column of B, i.e.

for the example above, the result should combine only the row with
"a*b" of A with the row with "a*b::x*y" of B.

thank you,

bogdan


Re: [R] even display of unevenly spaced numbers on x/y coordinates

2018-07-15 Thread Bogdan Tanasa

Hi Jeff,

thank you again for your help, and for your suggestion to subset the data :

DF500 <- subset( DF, LENGTH < 500 )

yes, I did run the code, and I believe that it is easier to present/defend
the results, after using "subset".

-- bogdan

On Sat, Jul 14, 2018 at 11:07 PM, Jeff Newmiller 
wrote:

> But did you run the code? Apparently not.
>
> On July 14, 2018 10:34:32 PM PDT, Bogdan Tanasa  wrote:
> >Dear Jeff,
> >
> >thank you for your prompt reply and kind help.
> >
> >During our previous conversation, we worked on a different topic,
> >namely
> >subsetting the dataframe before using ecdf() function in ggplot2.
> >
> >Now, i would like to know, how I could evenly space on the x axis the
> >values (0, 0.01, 0.1, 1, 10). Thanks again, and happy weekend ;) !
> >
> >-- bogdan
> >
> >
> >On Sat, Jul 14, 2018 at 10:25 PM, Jeff Newmiller
> >
> >wrote:
> >
> >> Isn't this what I showed you how to do in [1]?
> >>
> >> [1] https://stat.ethz.ch/pipermail/r-help/2018-July/455215.html
> >>
> >> On July 14, 2018 10:16:36 PM PDT, Bogdan Tanasa 
> >wrote:
> >> >Dear all,
> >> >
> >> >please would you advise on how I could make an even display of
> >unevenly
> >> >spaced number on a graph in R. For example, considering the code
> >below
> >> >:
> >> >
> >> >BREAKS = c(0, 0.1, 1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200,
> >> >300,
> >> >400, 500)
> >> >
> >> >a <- seq(0,100,0.1)
> >> >b <- seq(0,1000,0.1)
> >> >
> >> >plot(ecdf(a), col="red", xlim=c(0,100), main=NA, breaks=BREAKS)
> >> >plot(ecdf(b), col="green", xlim=c(0,100), add=T, breaks=BREAKS)
> >> >
> >> >I would like to show on X axis (0, 0.1, 1 and 10) spaced in an
> >> >equal/even
> >> >manner.
> >> >
> >> >thanks !
> >> >
> >> >bogdan
> >> >
> >>
> >> --
> >> Sent from my phone. Please excuse my brevity.
> >>
>
> --
> Sent from my phone. Please excuse my brevity.
>


Re: [R] even display of unevenly spaced numbers on x/y coordinates

2018-07-14 Thread Bogdan Tanasa

Dear Jeff,

thank you for your prompt reply and kind help.

During our previous conversation, we worked on a different topic, namely
subsetting the dataframe before using ecdf() function in ggplot2.

Now I would like to know how I could space the values (0, 0.01, 0.1, 1, 10)
evenly on the x axis. Thanks again, and happy weekend ;) !

-- bogdan

On Sat, Jul 14, 2018 at 10:25 PM, Jeff Newmiller 
wrote:

> Isn't this what I showed you how to do in [1]?
>
> [1] https://stat.ethz.ch/pipermail/r-help/2018-July/455215.html
>
> On July 14, 2018 10:16:36 PM PDT, Bogdan Tanasa  wrote:
> >Dear all,
> >
> >please would you advise on how I could make an even display of unevenly
> >spaced number on a graph in R. For example, considering the code below
> >:
> >
> >BREAKS = c(0, 0.1, 1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200,
> >300,
> >400, 500)
> >
> >a <- seq(0,100,0.1)
> >b <- seq(0,1000,0.1)
> >
> >plot(ecdf(a), col="red", xlim=c(0,100), main=NA, breaks=BREAKS)
> >plot(ecdf(b), col="green", xlim=c(0,100), add=T, breaks=BREAKS)
> >
> >I would like to show on X axis (0, 0.1, 1 and 10) spaced in an
> >equal/even
> >manner.
> >
> >thanks !
> >
> >bogdan
> >
>
> --
> Sent from my phone. Please excuse my brevity.
>


[R] even display of unevenly spaced numbers on x/y coordinates

2018-07-14 Thread Bogdan Tanasa

Dear all,

please would you advise on how I could make an even display of unevenly
spaced numbers on a graph in R. For example, considering the code below :

BREAKS = c(0, 0.1, 1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300,
400, 500)

a <- seq(0,100,0.1)
b <- seq(0,1000,0.1)

plot(ecdf(a), col="red", xlim=c(0,100), main=NA, breaks=BREAKS)
plot(ecdf(b), col="green", xlim=c(0,100), add=T, breaks=BREAKS)

I would like to show on X axis (0, 0.1, 1 and 10) spaced in an equal/even
manner.
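
One common way to get 0.1, 1 and 10 evenly spaced is to draw the x axis on a log10 scale. The sketch below is an assumption about what is wanted, not an answer given in the thread: it starts the data at 0.1 because 0 cannot be placed on a log axis, and the names ticks and plot_ecdf_log10 are hypothetical helpers introduced here.

```r
a <- seq(0.1, 100, by = 0.1)
b <- seq(0.1, 1000, by = 0.1)

# Tick values that should appear equally spaced on the axis.
ticks <- c(0.1, 1, 10, 100)

# Draw one ECDF as a step line against log10(x).
plot_ecdf_log10 <- function(x, add = FALSE, ...) {
  xs <- sort(unique(x))
  ys <- ecdf(x)(xs)
  if (add) {
    lines(log10(xs), ys, type = "s", ...)
  } else {
    plot(log10(xs), ys, type = "s", xaxt = "n",
         xlim = log10(range(ticks)), ylim = c(0, 1),
         xlab = "x (log10 scale)", ylab = "Fn(x)", ...)
    # Label the axis with the original values at their log10 positions.
    axis(1, at = log10(ticks), labels = ticks)
  }
}

# Only draw when run in an interactive session.
if (interactive()) {
  plot_ecdf_log10(a, col = "red")
  plot_ecdf_log10(b, add = TRUE, col = "green")
}
```

On this scale each factor of 10 takes the same horizontal distance, which is why the chosen ticks end up equally spaced.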

thanks !

bogdan


Re: [R] about ECDF display in ggplot2

2018-07-09 Thread Bogdan Tanasa

07L, 8593L, 2085L, 6467L, 8065L, 5385L,
>   5635L, 8363L, 7587L, 5172L, 7326L, 1015L, 6817L, 5560L, 1324L,
>   716L, 4136L, 6945L, 6536L, 7281L, 1516L, 8415L, 2616L, 1328L,
>   6406L, 2886L, 6933L, 3511L, 6040L, 6905L, 1672L, 259L, 1208L,
>   6051L, 8315L, 4896L, 5351L, 1752L, 4759L, 1597L, 4017L, 2818L,
>   1033L, 1654L, 6483L, 3659L, 3678L, 4266L, 3797L, 1212L, 7322L,
>   5258L, 7052L, 6826L, 8147L, 7655L, 2813L, 2300L, 6584L, 6629L,
>   8140L, 7034L, 1183L, 2551L, 1726L, 6950L, 1143L, 1144L, 641L,
>   471L, 4712L, 995L, 6582L, 6476L), class = "data.frame")
>
>
> # display with PLOT FUNCTION:
>
>
> # saving files should be avoided in reproducible examples... especially files
> # that cannot be transmitted through the R-help mailing list such as pdf files
> #pdf("display.R.ecdf.LENGTH.pdf", width=10, height=6, paper='special')
>
> # Your original plot commands below create a fake impression of the data by
> # falsifying the axes. If you really are only interested in data points less
> # than 500, you should be explicit about creating a data set containing only
> # such constrained values before plotting them.
> plot(ecdf(DF$LENGTH), xlab="DEL SIZE",
>  ylab="fraction of DEL",
>  main="LENGTH of DEL",
>  xlim=c(0,500),
>  col = "dark red", axes = FALSE)
> ticks_y <- c(0, 0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4)
> axis(2, at=ticks_y, labels=ticks_y, col.axis="red")
> ticks_x <- c(0, 100, 200, 400, 500, 600, 700, 800)
> axis(1, at=ticks_x, labels=ticks_x, col.axis="blue")
>
>
> # my recommendation
> DF500 <- subset( DF, LENGTH < 500 )
> plot( ecdf( DF500$LENGTH )
> , xlab = "DEL SIZE"
> , ylab = "fraction of DEL"
> , main = "LENGTH of DEL"
> , col = "dark red"
> )
>
>
> # alternatively
> plot( ecdf( DF$LENGTH )
> , xlab = "DEL SIZE"
> , ylab = "fraction of DEL"
> , main = "LENGTH of DEL"
> , col = "dark red"
> , xlim=c( 1, 1e9 )
> , log="x"
> )
>
>
>
>
> #dev.off()
>
> # display in GGPLOT2 :
>
> BREAKS = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500,
>1000, 1, 10, 100, 1000, 1, 10)
>
> barfill <- "#4271AE"
> barlines <- "#1F3552"
>
> #pdf("display.ggplot2.ecdf.LENGTH.pdf", width=10, height=6, paper='special')
>
> # ggplot's limits behavior is enabling your false representation of the
> # data, but it warns you of the data removal
> ggplot(DF, aes(LENGTH)) +
>   stat_ecdf(geom = "point", colour = barlines, fill = barfill) +
>   scale_x_continuous(name = "LENGTH of DEL",
>  breaks = BREAKS,
>  limits=c(0, 500)
>  ) +
>   scale_y_continuous(name = "FRACTION") +
>   ggtitle("ECDF of LENGTH") +
>   theme_bw() +
>   theme(legend.position = "bottom", legend.direction =
> "horizontal",
>legend.box = "horizontal",
>legend.key.size = unit(1, "cm"),
>axis.title = element_text(size = 12),
>legend.text = element_text(size = 9),
>legend.title=element_text(face = "bold", size = 9))
> #> Warning: Removed 80 rows containing non-finite values (stat_ecdf).
>
>
>
> # my recommendation
> ggplot(DF500, aes(LENGTH)) +
>   stat_ecdf(geom = "point", colour = barlines, fill = barfill) +
>   scale_x_continuous(name = "LENGTH of DEL",
>  breaks = BREAKS ) +
>   scale_y_continuous(name = "FRACTION") +
>   ggtitle("ECDF of LENGTH") +
>   theme_bw() +
>   theme(legend.position = "bottom", legend.direction = "horizontal",
> legend.box = "horizontal",
> legend.key.size = unit(1, "cm"),
> axis.title = element_text(size = 12),
>

Re: [R] about ECDF display in ggplot2

2018-07-08 Thread Bogdan Tanasa

Dear Jeff,

thank you for your email.

Yes, in order to be more descriptive/comprehensive, please find attached to
my email the following files (my apologies ... I am sending these as
attachments, as I do not have a web server running at this moment) :

-- the R script (R_script_display_ECDF.R) that reads the file "LENGTH" and
outputs ECDF figure by using the standard R function or ggplot2.

-- the display of ECDF by using standard R function
("display.R.ecdf.LENGTH.pdf")

-- the display of ECDF by using ggplot2 ("display.ggplot2.ecdf.LENGTH.pdf")

The ECDF over xlim(0, 500) looks very different between plot(ecdf()) and
ggplot2. Could you please advise why, and what I should change in my ggplot2
code ?

thanks a lot,

- bogdan

ps : the R code is also written below :

library("ggplot2")

file <- read.delim("LENGTH", sep="\t", header=T, stringsAsFactors=F)

# display with PLOT FUNCTION:

pdf("display.R.ecdf.LENGTH.pdf", width=10, height=6, paper='special')

plot(ecdf(file$LENGTH), xlab="DEL SIZE",
     ylab="fraction of DEL",
     main="LENGTH of DEL",
     xlim=c(0,500),
     col = "dark red", axes = FALSE)

ticks_y <- c(0, 0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4)
axis(2, at=ticks_y, labels=ticks_y, col.axis="red")

ticks_x <- c(0, 100, 200, 400, 500, 600, 700, 800)
axis(1, at=ticks_x, labels=ticks_x, col.axis="blue")

dev.off()

# display in GGPLOT2 :

BREAKS = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500,
           1000, 1, 10, 100, 1000, 1, 10)

barfill <- "#4271AE"
barlines <- "#1F3552"

pdf("display.ggplot2.ecdf.LENGTH.pdf", width=10, height=6, paper='special')

ggplot(file, aes(LENGTH)) +
  stat_ecdf(geom = "point", colour = barlines, fill = barfill) +
  scale_x_continuous(name = "LENGTH of DEL",
                     breaks = BREAKS,
                     limits=c(0, 500)) +
  scale_y_continuous(name = "FRACTION") +
  ggtitle("ECDF of LENGTH") +
  theme_bw() +
  theme(legend.position = "bottom", legend.direction = "horizontal",
        legend.box = "horizontal",
        legend.key.size = unit(1, "cm"),
        axis.title = element_text(size = 12),
        legend.text = element_text(size = 9),
        legend.title=element_text(face = "bold", size = 9))

dev.off()

On Sat, Jul 7, 2018 at 9:47 PM, Jeff Newmiller 
wrote:

> It is a feature of ggplot that points excluded by limits raise warnings,
> while base graphics do not.
>
> You may find that using coord_cartesian with the xlim=c(0,500) argument
> works better with ggplot by showing the consequences of points out of the
> limits on lines within the viewport.
>
> There are other possible problems with your data that your
> non-reproducible example does not show, and sending R code in
> HTML-formatted email usually corrupts it.. so please follow the
> recommendations in the Posting Guide next time you post.
>
> On July 6, 2018 4:32:41 PM PDT, Bogdan Tanasa  wrote:
> >Dear all,
> >
> >I would appreciate having your advice/suggestions/comments on the
> >following
> >:
> >
> >1 -- starting from a vector that contains LENGTHS (numerically, the
> >values
> >are from 1 to 10 000)
> >
> >2 -- shall I display the ECDF by using the R code and some "limits" :
> >
> >BREAKS = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400,
> >500,
> > 1000, 1, 10, 100, 1000, 1, 10)
> >
> >ggplot(x, aes(LENGTH)) +
> >  stat_ecdf(geom = "point") +
> >  scale_x_continuous(name = "LENGTH of DEL",
> > breaks = BREAKS,
> > limits=c(0, 500))
> >
> >3 -- I am getting the following warning message : "Warning message:
> >Removed
> >109 rows containing non-finite values (stat_ecdf)."
> >
> >The question is : are these 109 values removed from VISUALIZATION as i
> >set
> >up the "limits", or are these 109 values removed from statistical
> >CALCULATION?
> >
> >4 -- in contrast, shall I use the

[R] about ECDF display in ggplot2

2018-07-06 Thread Bogdan Tanasa

Dear all,

I would appreciate having your advice/suggestions/comments on the following
:

1 -- starting from a vector that contains LENGTHS (numerically, the values
are from 1 to 10 000)

2 -- if I display the ECDF with ggplot2 and set some "limits" :

BREAKS = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500,
   1000, 1, 10, 100, 1000, 1, 10)

ggplot(x, aes(LENGTH)) +
  stat_ecdf(geom = "point") +
  scale_x_continuous(name = "LENGTH of DEL",
 breaks = BREAKS,
 limits=c(0, 500))

3 -- I am getting the following warning message : "Warning message: Removed
109 rows containing non-finite values (stat_ecdf)."

The question is : are these 109 values removed only from the VISUALIZATION
because I set the "limits", or are they also removed from the statistical
CALCULATION?

4 -- in contrast, if I use the standard R function plot(ecdf()), there is
no "warning message" :

plot(ecdf(x$LENGTH), xlab="DEL LENGTH",
 ylab="Fraction of DEL", main="DEL", xlim=c(0,500),
 col = "dark red")
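
The difference asked about in point 3 can be checked numerically without any plotting. The sketch below uses made-up data (LENGTH is simply 1 to 1000 here, not the poster's file) and hypothetical names; it compares an ECDF computed on all values and then viewed up to 500 with an ECDF computed only on the values that survive a cut at 500. This is consistent with the coord_cartesian() suggestion made in the reply, which zooms without dropping data.

```r
LENGTH <- 1:1000                          # made-up lengths

# ECDF over all values; restricting xlim later only changes the view.
ecdf_all <- ecdf(LENGTH)

# ECDF after values above 500 have been dropped from the data.
ecdf_cut <- ecdf(LENGTH[LENGTH <= 500])

n_dropped <- sum(LENGTH > 500)
frac_all  <- ecdf_all(500)   # fraction of ALL values that are <= 500
frac_cut  <- ecdf_cut(500)   # fraction of the REMAINING values that are <= 500

cat("values dropped by the cut:", n_dropped, "\n")
cat("ECDF at 500, all data:    ", frac_all, "\n")
cat("ECDF at 500, after cut:   ", frac_cut, "\n")
```

If the two fractions differ, the dropped values were removed from the calculation and not merely hidden from view.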

Thanks a lot !

-- bogdan


[R] a question about R script : "Can only modify plain character vectors."

2018-05-08 Thread Bogdan Tanasa

Dear all,

I would appreciate a suggestion about the following situation : I am running
a script in R, and if I execute it in the terminal, step by step, it
works fine.

However, if I run source("script.R"), it does not complete and I get
the error :
"Can only modify plain character vectors."

What may be going wrong ? Thank you for your help,

-- bogdan


[R] splitting a dataframe in R based on multiple gene names in a specific column

2017-08-22 Thread Bogdan Tanasa

I would appreciate a suggestion on how to do the following :

I'm working with a dataframe in R that contains multiple gene names in a
specific column, e.g. :

> df.sample.gene[15:20,2:8]
     Chr     Start       End Ref Alt Func.refGene           Gene.refGene
284 chr2  16080996  16080996   C   T ncRNA_exonic                 GACAT3
448 chr2 113979920 113979920   C   T ncRNA_exonic LINC01191,LOC100499194
465 chr2 131279347 131279347   C   G ncRNA_exonic              LOC440910
525 chr2     22358     22358   T   A       exonic                  AP1S3
626 chr3  99794575  99794575   G   A       exonic                 COL8A1
643 chr3 132601066 132601066   A   G       exonic                  ACKR4

How could I obtain a dataframe where each line that has multiple gene names
(in the field Gene.refGene) is replicated, with only one gene name per line ? i.e.

for the second row :

  448 chr2 113979920 113979920   C   T ncRNA_exonic LINC01191,LOC100499194

we shall get in the final output (that contains all the rows) :

  448 chr2 113979920 113979920   C   T ncRNA_exonic LINC01191
  448 chr2 113979920 113979920   C   T ncRNA_exonic LOC100499194
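
One way to get this output in base R is to split the gene field on commas and repeat each row once per gene. The sketch below is an illustration only: df is a small stand-in for df.sample.gene with three of the columns shown above, and genes and out are names introduced here.

```r
df <- data.frame(
  Chr = c("chr2", "chr2", "chr2"),
  Start = c(16080996, 113979920, 131279347),
  Gene.refGene = c("GACAT3", "LINC01191,LOC100499194", "LOC440910"),
  stringsAsFactors = FALSE
)

# One character vector of gene names per row.
genes <- strsplit(df$Gene.refGene, ",", fixed = TRUE)

# Repeat each row as many times as it has gene names ...
out <- df[rep(seq_len(nrow(df)), lengths(genes)), ]

# ... and give each copy exactly one gene name.
out$Gene.refGene <- unlist(genes)
rownames(out) <- NULL
print(out)
```

Note that a row whose gene field is an empty string would be dropped by this approach, because it splits into zero names.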

thanks a lot !

-- bogdan


Re: [R] submitting R scripts with command_line_arguments to PBS HPC clusters

2017-07-12 Thread Bogdan Tanasa

Dear Peter, that is very very helpful, many thanks for your suggestions ;) !

On Tue, Jul 11, 2017 at 11:34 PM, Anthoni, Peter (IMK) <
peter.anth...@kit.edu> wrote:

> Hi,
>
> The problem is most likely that you need to call R CMD BATCH, with your
> arguments and the R script, from inside a shell script that you submit with
> qsub.
> Unfortunately we don't use qsub anymore, so I can't test it, but it should
> work as follows:
>
> R-script eg. test.R:
> > ##First read in the arguments listed at the command line
> > args=(commandArgs(TRUE))
> >
> > ##args is now a list of character vectors
> > ## First check to see if arguments are passed.
> > if(length(args)==0){
> >   stop("no args specified")
> > }
> > ## Then cycle through each element of the list and evaluate the
> > ## expressions.
> > for(i in 1:length(args)){
> >   print(args[[i]])
> >   eval(parse(text=args[[i]]))
> > }
> > print(TUMOR)
> > print(GERMLINE)
> > print(CHR)
>
>
> qsub shell script test.sh:
> > #!/bin/bash
> >
> > # Note: the single quotes '...' around --args ... "..." "..." are important!
> > R CMD BATCH --no-save --no-restore '--args TUMOR="tumor.bam" GERMLINE="germline.bam" CHR="chr22"' test.R test.Rout
>
> Then you submit test.sh with qsub, using all the options you specified:
> qsub  test.sh
>
> cheers
> Peter
>
>
>
> > On 12. Jul 2017, at 03:01, Jeff Newmiller <jdnew...@dcn.davis.ca.us>
> wrote:
> >
> > This sounds like an operating system specific question, in that "submit
> the R script to a PBS HPC scheduler" would be the kind of action that would
> run R with very different environment variables and possibly different
> access credentials than your usual interactive terminal.  A thorough
> reading of the "Installation and Administration Guide" and some study of
> your HPC documentation are in order.
> > --
> > Sent from my phone. Please excuse my brevity.
> >
> > On July 11, 2017 5:25:20 PM PDT, Bogdan Tanasa <tan...@gmail.com> wrote:
> >> Dear all,
> >>
> >> please could you advise me on the following : I've written a R script
> >> that
> >> reads 3 arguments from the command line, i.e. :
> >>
> >> " args <- commandArgs(TRUE)
> >> TUMOR <- args[1]
> >> GERMLINE <- args[2]
> >> CHR <- args[3] ".
> >>
> >> when I submit the R script to a PBS HPC scheduler, I do the following
> >> (below), but ... I am getting an error message.
> >> (I am not posting the error message, because the R script I wrote works
> >> fine when it is run from a regular terminal ..)
> >>
> >> Please may I ask, how do you usually submit the R scripts with command
> >> line
> >> arguments to PBS HPC schedulers ?
> >>
> >> qsub -d $PWD -l nodes=1:ppn=4 -l vmem=10gb -m bea -M tan...@gmail.com \
> >> -v TUMOR="tumor.bam",GERMLINE="germline.bam",CHR="chr22" \
> >> -e script.efile.chr22 \
> >> -o script.ofile.chr22 \
> >> script.R
> >>
> >> Thank you very very much  !
> >>
> >> -- bogdan
> >>
> >
>
>


[R] submitting R scripts with command_line_arguments to PBS HPC clusters

2017-07-11 Thread Bogdan Tanasa

Dear all,

please could you advise me on the following : I've written an R script that
reads 3 arguments from the command line, i.e. :

" args <- commandArgs(TRUE)
TUMOR <- args[1]
GERMLINE <- args[2]
CHR <- args[3] ".

when I submit the R script to a PBS HPC scheduler, I do the following
(below), but ... I am getting an error message.
(I am not posting the error message, because the R script I wrote works
fine when it is run from a regular terminal ..)

Please may I ask, how do you usually submit the R scripts with command line
arguments to PBS HPC schedulers ?

qsub -d $PWD -l nodes=1:ppn=4 -l vmem=10gb -m bea -M tan...@gmail.com \
-v TUMOR="tumor.bam",GERMLINE="germline.bam",CHR="chr22" \
-e script.efile.chr22 \
-o script.ofile.chr22 \
script.R
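
A likely source of the mismatch is that qsub -v exports the listed values as environment variables of the job, whereas commandArgs() only sees arguments given on the R command line. A minimal sketch of a script that accepts either form is shown below; get_param is a hypothetical helper introduced here, and the example file names are the ones from the command above.

```r
# Return a parameter from the environment if it is set there,
# otherwise from the positional command-line arguments.
get_param <- function(name, position,
                      args = commandArgs(trailingOnly = TRUE)) {
  value <- Sys.getenv(name, unset = NA)
  if (!is.na(value) && nzchar(value)) {
    return(value)
  }
  if (length(args) >= position) {
    return(args[position])
  }
  stop("parameter '", name, "' was not supplied")
}

# Example with explicit arguments, as if the script had been started with
#   Rscript script.R tumor.bam germline.bam chr22
example_args <- c("tumor.bam", "germline.bam", "chr22")
TUMOR    <- get_param("TUMOR", 1, example_args)
GERMLINE <- get_param("GERMLINE", 2, example_args)
CHR      <- get_param("CHR", 3, example_args)
print(c(TUMOR = TUMOR, GERMLINE = GERMLINE, CHR = CHR))
```

In a real script the args argument would be left at its default, so that the same file works both from a terminal and inside a scheduler job.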

Thank you very very much  !

-- bogdan


Re: [R] reshaping the data

2017-07-03 Thread Bogdan Tanasa

Thanks a lot gentlemen, and particularly Petr -- the R code you did share
helped tremendously ;)

On Mon, Jul 3, 2017 at 2:53 AM, PIKAL Petr <petr.pi...@precheza.cz> wrote:

> Hi
>
> Do you want something like
>
> dcast(test, Sample~Gene, fun=function(x) paste(x, collapse=","))
>
> or
>
> dcast(test, Sample~Gene, fun=function(x) sum(as.numeric(x)))
>
> 1 means INDEL, 2 means SNV and three means both
>
> Cheers
> Petr
>
>
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bogdan
> > Tanasa
> > Sent: Monday, July 3, 2017 9:22 AM
> > To: r-help <r-help@r-project.org>
> > Subject: [R] reshaping the data
> >
> > Dear all,
> >
> > I would appreciate please a piece of help regarding the use of
> acast/dcast
> > functions in reshape2 package.
> >
> > Specifically, I'm working with a data frame, that has information about
> > SAMPLE, GENE, and TYPE of MUTATION (as shown below):
> >
> > Sample  Gene   Type
> > 22M     AEBP1  SNV
> > 17M     AEBP1  SNV
> > 22M     ATR    INDEL
> > 22M     ATR    SNV
> > 11M     BTK    SNV
> > 11M     BTK    INDEL
> >
> >
> > I would like to transform this DATAFRAME into a MATRIX that has GENE on
> > ROWS, SAMPLE on COLUMNS, and the elements of the matrix are SNV or INDEL
> > (ie the types of mutations).
> >
> > The R code starts with :
> >
> > y <- data.frame(Sample = x$Sample, Gene = x$Gene, Type=x$Type)
> >
> > z <- acast(y, Cancer_Gene ~ Sample)
> >
> > although in z, I do not have the information on Type (i.e.SNV or INDEL).
> >
> > thanks a lot,
> >
> > -- bogdan
> >
>
> 

[R] reshaping the data

2017-07-03 Thread Bogdan Tanasa

Dear all,

I would appreciate some help with the acast/dcast functions in the
reshape2 package.

Specifically, I'm working with a data frame, that has information about
SAMPLE, GENE, and TYPE of MUTATION (as shown below):

Sample  Gene   Type
22M     AEBP1  SNV
17M     AEBP1  SNV
22M     ATR    INDEL
22M     ATR    SNV
11M     BTK    SNV
11M     BTK    INDEL


I would like to transform this DATAFRAME into a MATRIX that has GENE on
ROWS, SAMPLE on COLUMNS, and the elements of the matrix are SNV or INDEL
(i.e. the types of mutations).

The R code starts with :

y <- data.frame(Sample = x$Sample, Gene = x$Gene, Type=x$Type)

z <- acast(y, Gene ~ Sample)

although in z, I do not have the information on Type (i.e. SNV or INDEL).

thanks a lot,

-- bogdan
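
A minimal base-R sketch of one way to keep Type in the cells, assuming that
a gene with several mutation types in the same sample should show them joined
with "/" (the separator and the empty-string filler are my assumptions, not
part of the post). With reshape2, passing value.var = "Type" together with a
similar paste-based fun.aggregate to acast is probably the equivalent route.

```r
# Toy data copied from the post above
x <- data.frame(
  Sample = c("22M", "17M", "22M", "22M", "11M", "11M"),
  Gene   = c("AEBP1", "AEBP1", "ATR", "ATR", "BTK", "BTK"),
  Type   = c("SNV", "SNV", "INDEL", "SNV", "SNV", "INDEL"),
  stringsAsFactors = FALSE
)

# One cell per (Gene, Sample); several mutation types are joined with "/"
z <- tapply(x$Type, list(Gene = x$Gene, Sample = x$Sample),
            paste, collapse = "/")

# Cells with no mutation come back as NA; show them as empty strings
z[is.na(z)] <- ""
z
```

The result is a character matrix with genes on rows and samples on columns.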


Re: [R] integrating 2 lists and a data frame in R

2017-06-06 Thread Bogdan Tanasa

Thank you Bert for your suggestion ;).

On Tue, Jun 6, 2017 at 8:19 AM, Bert Gunter <bgunter.4...@gmail.com> wrote:

> Simple matrix indexing suffices without any fancier functionality.
>
> ## First convert M and N to character vectors -- which they should
> have been in the first place!
>
> M <- sort(as.character(M[,1]))
> N <-  sort(as.character(N[,1]))
>
> ## This could be a one-liner, but I'll split it up for clarity.
>
> res <-matrix(NA, length(M),length(N),dimnames = list(M,N))
>
> res[as.matrix(C[,2:1])] <- C$I ## matrix indexing
>
> res
>
> Cheers,
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Tue, Jun 6, 2017 at 7:46 AM, Bogdan Tanasa <tan...@gmail.com> wrote:
> > Thank you David. Using xtabs operation simplifies the code very much,
> > many thanks ;)
> >
> > On Tue, Jun 6, 2017 at 7:44 AM, David Winsemius <dwinsem...@comcast.net>
> > wrote:
> >
> >>
> >> > On Jun 6, 2017, at 4:01 AM, Jim Lemon <drjimle...@gmail.com> wrote:
> >> >
> >> > Hi Bogdan,
> >> > Kinda messy, but:
> >> >
> >> > N <- data.frame(N=c("n1","n2","n3","n4"))
> >> > M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> >> > C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"),
> >> I=c(100,300,400))
> >> > MN<-as.data.frame(matrix(NA,nrow=length(N[,1]),ncol=length(M[,1])))
> >> > names(MN)<-M[,1]
> >> > rownames(MN)<-N[,1]
> >> > C[,1]<-as.character(C[,1])
> >> > C[,2]<-as.character(C[,2])
> >> > for(row in 1:dim(C)[1]) MN[C[row,1],C[row,2]]<-C[row,3]
> >>
> >> `xtabs` offers another route:
> >>
> >> C$m <- factor(C$m, levels=M$M)
> >> C$n <- factor(C$n, levels=N$N)
> >>
> >> Option 1:  Zeroes in the empty positions:
> >> > (X <- xtabs(I ~ m+n , C, addNA=TRUE))
> >> n
> >> m n1  n2  n3  n4
> >>   m1 100 300   0   0
> >>   m2   0   0   0   0
> >>   m3   0   0 400   0
> >>   m4   0   0   0   0
> >>   m5   0   0   0   0
> >>
> >> Option 2: Sparse matrix
> >> > (X <- xtabs(I ~ m+n , C, sparse=TRUE))
> >> 5 x 4 sparse Matrix of class "dgCMatrix"
> >> n
> >> m n1  n2  n3 n4
> >>   m1 100 300   .  .
> >>   m2   .   .   .  .
> >>   m3   .   . 400  .
> >>   m4   .   .   .  .
> >>   m5   .   .   .  .
> >>
> >> I wasn't sure if the sparse results of xtabs would make a distinction
> >> between 0 and NA, but happily it does:
> >>
> >> > C <- data.frame(n=c("n1","n2","n3", "n3", "n4"), m=c("m1","m1","m3",
> >> "m4", "m5"), I=c(100,300,400, NA, 0))
> >> > C
> >>    n  m   I
> >> 1 n1 m1 100
> >> 2 n2 m1 300
> >> 3 n3 m3 400
> >> 4 n3 m4  NA
> >> 5 n4 m5   0
> >> > (X <- xtabs(I ~ m+n , C, sparse=TRUE))
> >> 4 x 4 sparse Matrix of class "dgCMatrix"
> >> n
> >> m n1  n2  n3 n4
> >>   m1 100 300   .  .
> >>   m3   .   . 400  .
> >>   m4   .   .   .  .
> >>   m5   .   .   .  0
> >>
> >> (In the example I forgot to repeat the lines that augmented the factor
> >> levels, so m2 is not seen.)
> >>
> >> --
> >> David
> >> >
> >> >
> >> > Jim
> >> >
> >> > On Tue, Jun 6, 2017 at 3:51 PM, Bogdan Tanasa <tan...@gmail.com>
> wrote:
> >> >> Dear Bert,
> >> >>
> >> >> thank you for your response. here it is the piece of R code : given 3
> >> data
> >> >> frames below ---
> >> >>
> >> >> N <- data.frame(N=c("n1","n2","n3","n4"))
> >> >>
> >> >> M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> >> >>
> >> >> C <- data.frame(n=c("n1","n2","n3"), m=c("m1

Re: [R] integrating 2 lists and a data frame in R

2017-06-06 Thread Bogdan Tanasa

Thank you David. Using xtabs simplifies the code very much, many
thanks ;)

On Tue, Jun 6, 2017 at 7:44 AM, David Winsemius <dwinsem...@comcast.net>
wrote:

>
> > On Jun 6, 2017, at 4:01 AM, Jim Lemon <drjimle...@gmail.com> wrote:
> >
> > Hi Bogdan,
> > Kinda messy, but:
> >
> > N <- data.frame(N=c("n1","n2","n3","n4"))
> > M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> > C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"),
> I=c(100,300,400))
> > MN<-as.data.frame(matrix(NA,nrow=length(N[,1]),ncol=length(M[,1])))
> > names(MN)<-M[,1]
> > rownames(MN)<-N[,1]
> > C[,1]<-as.character(C[,1])
> > C[,2]<-as.character(C[,2])
> > for(row in 1:dim(C)[1]) MN[C[row,1],C[row,2]]<-C[row,3]
>
> `xtabs` offers another route:
>
> C$m <- factor(C$m, levels=M$M)
> C$n <- factor(C$n, levels=N$N)
>
> Option 1:  Zeroes in the empty positions:
> > (X <- xtabs(I ~ m+n , C, addNA=TRUE))
> n
> m n1  n2  n3  n4
>   m1 100 300   0   0
>   m2   0   0   0   0
>   m3   0   0 400   0
>   m4   0   0   0   0
>   m5   0   0   0   0
>
> Option 2: Sparse matrix
> > (X <- xtabs(I ~ m+n , C, sparse=TRUE))
> 5 x 4 sparse Matrix of class "dgCMatrix"
> n
> m n1  n2  n3 n4
>   m1 100 300   .  .
>   m2   .   .   .  .
>   m3   .   . 400  .
>   m4   .   .   .  .
>   m5   .   .   .  .
>
> I wasn't sure if the sparse results of xtabs would make a distinction
> between 0 and NA, but happily it does:
>
> > C <- data.frame(n=c("n1","n2","n3", "n3", "n4"), m=c("m1","m1","m3",
> "m4", "m5"), I=c(100,300,400, NA, 0))
> > C
>    n  m   I
> 1 n1 m1 100
> 2 n2 m1 300
> 3 n3 m3 400
> 4 n3 m4  NA
> 5 n4 m5   0
> > (X <- xtabs(I ~ m+n , C, sparse=TRUE))
> 4 x 4 sparse Matrix of class "dgCMatrix"
> n
> m n1  n2  n3 n4
>   m1 100 300   .  .
>   m3   .   . 400  .
>   m4   .   .   .  .
>   m5   .   .   .  0
>
> (In the example I forgot to repeat the lines that augmented the factor
> levels, so m2 is not seen.)
>
> --
> David
> >
> >
> > Jim
> >
> > On Tue, Jun 6, 2017 at 3:51 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
> >> Dear Bert,
> >>
> >> thank you for your response. here it is the piece of R code : given 3
> data
> >> frames below ---
> >>
> >> N <- data.frame(N=c("n1","n2","n3","n4"))
> >>
> >> M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> >>
> >> C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"),
> I=c(100,300,400))
> >>
> >> how shall I integrate N, and M, and C in such a way that at the end we
> have
> >> a data frame with :
> >>
> >>
> >>   - list N as the columns names
> >>   - list M as the rows names
> >>   - the values in the cells of N * M, corresponding to the numerical
> >>   values in the data frame C.
> >>
> >> more precisely, the result shall be :
> >>
> >>      n1   n2   n3   n4
> >> m1  100  300    -    -
> >> m2    -    -    -    -
> >> m3    -    -  400    -
> >> m4    -    -    -    -
> >> m5    -    -    -    -
> >>
> >> thank you !
> >>
> >>
> >> On Mon, Jun 5, 2017 at 6:57 PM, Bert Gunter <bgunter.4...@gmail.com>
> wrote:
> >>
> >>> Reproducible example, please. -- In particular, what exactly does C
> >>> look like?
> >>>
> >>> (You should know this by now).
> >>>
> >>> -- Bert
> >>> Bert Gunter
> >>>
> >>> "The trouble with having an open mind is that people keep coming along
> >>> and sticking things into it."
> >>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>>
> >>>
> >>> On Mon, Jun 5, 2017 at 6:45 PM, Bogdan Tanasa <tan...@gmail.com>
> wrote:
> >>>> Dear all,
> >>>>
> >>>> please could you advise on the R code I could use in order to do the
> >>>> following operation :
> >>>>
> >>

Re: [R] integrating 2 lists and a data frame in R

2017-06-06 Thread Bogdan Tanasa

Thank you David for the code, as I am learning about xtabs operation. That
works great too ;)

On Tue, Jun 6, 2017 at 7:34 AM, David L Carlson <dcarl...@tamu.edu> wrote:

> Here's another approach:
>
> N <- data.frame(N=c("n1","n2","n3","n4"))
> M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"),
> I=c(100,300,400))
>
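> # Rebuild the factors using M and N (as.character() also covers R >= 4.0,
> # where data.frame() no longer creates factors by default)
> C$m <- factor(as.character(C$m), levels=as.character(M$M))
> C$n <- factor(as.character(C$n), levels=as.character(N$N))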
> MN <- xtabs(I~m+n, C)
> print(MN, zero.print="-")
> # n
> # m n1  n2  n3 n4
> #   m1 100 300   -  -
> #   m2   -   -   -  -
> #   m3   -   - 400  -
> #   m4   -   -   -  -
> #   m5   -   -   -  -
>
> class(MN)
> # [1] "xtabs" "table"
> # MN is a table. If you want a data.frame
> MN <- as.data.frame.matrix(MN)
> class(MN)
> # [1] "data.frame"
>
> -----
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Lemon
> Sent: Tuesday, June 6, 2017 6:02 AM
> To: Bogdan Tanasa <tan...@gmail.com>; r-help mailing list <
> r-help@r-project.org>
> Subject: Re: [R] integrating 2 lists and a data frame in R
>
> Hi Bogdan,
> Kinda messy, but:
>
> N <- data.frame(N=c("n1","n2","n3","n4"))
> M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"),
> I=c(100,300,400))
> MN<-as.data.frame(matrix(NA,nrow=length(N[,1]),ncol=length(M[,1])))
> names(MN)<-M[,1]
> rownames(MN)<-N[,1]
> C[,1]<-as.character(C[,1])
> C[,2]<-as.character(C[,2])
> for(row in 1:dim(C)[1]) MN[C[row,1],C[row,2]]<-C[row,3]
>
> Jim
>
> On Tue, Jun 6, 2017 at 3:51 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
> > Dear Bert,
> >
> > thank you for your response. here it is the piece of R code : given 3
> data
> > frames below ---
> >
> > N <- data.frame(N=c("n1","n2","n3","n4"))
> >
> > M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> >
> > C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"),
> I=c(100,300,400))
> >
> > how shall I integrate N, and M, and C in such a way that at the end we
> have
> > a data frame with :
> >
> >
> >    - list N as the column names
> >    - list M as the row names
> >    - the values in the cells of N * M, corresponding to the numerical
> >    values in the data frame C.
> >
> > more precisely, the result shall be :
> >
> >      n1   n2   n3   n4
> > m1  100  300    -    -
> > m2    -    -    -    -
> > m3    -    -  400    -
> > m4    -    -    -    -
> > m5    -    -    -    -
> >
> > thank you !
> >
> >
> > On Mon, Jun 5, 2017 at 6:57 PM, Bert Gunter <bgunter.4...@gmail.com>
> wrote:
> >
> >> Reproducible example, please. -- In particular, what exactly does C look
> >> like?
> >>
> >> (You should know this by now).
> >>
> >> -- Bert
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> >> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >>
> >> On Mon, Jun 5, 2017 at 6:45 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
> >> >  Dear all,
> >> >
> >> > please could you advise on the R code I could use in order to do the
> >> > following operation :
> >> >
> >> > a. -- I have 2 lists of "genome coordinates" : a list is composed by
> >> > numbers that represent genome coordinates;
> >> >
> >> > let's say list N :
> >> >
> >> > n1
> >> >
> >> > n2
> >> >
> >> > n3
> >> >
> >> > n4
> >> >
> >> > and a list M:
> >> >
> >> > m1
> >> >

Re: [R] integrating 2 lists and a data frame in R

2017-06-06 Thread Bogdan Tanasa

Thank you Jim !

On Tue, Jun 6, 2017 at 4:01 AM, Jim Lemon <drjimle...@gmail.com> wrote:

> Hi Bogdan,
> Kinda messy, but:
>
> N <- data.frame(N=c("n1","n2","n3","n4"))
> M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"),
> I=c(100,300,400))
> MN<-as.data.frame(matrix(NA,nrow=length(N[,1]),ncol=length(M[,1])))
> names(MN)<-M[,1]
> rownames(MN)<-N[,1]
> C[,1]<-as.character(C[,1])
> C[,2]<-as.character(C[,2])
> for(row in 1:dim(C)[1]) MN[C[row,1],C[row,2]]<-C[row,3]
>
> Jim
>
> On Tue, Jun 6, 2017 at 3:51 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
> > Dear Bert,
> >
> > thank you for your response. here it is the piece of R code : given 3
> data
> > frames below ---
> >
> > N <- data.frame(N=c("n1","n2","n3","n4"))
> >
> > M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> >
> > C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"),
> I=c(100,300,400))
> >
> > how shall I integrate N, and M, and C in such a way that at the end we
> have
> > a data frame with :
> >
> >
> >    - list N as the column names
> >    - list M as the row names
> >    - the values in the cells of N * M, corresponding to the numerical
> >    values in the data frame C.
> >
> > more precisely, the result shall be :
> >
> >      n1   n2   n3   n4
> > m1  100  300    -    -
> > m2    -    -    -    -
> > m3    -    -  400    -
> > m4    -    -    -    -
> > m5    -    -    -    -
> >
> > thank you !
> >
> >
> > On Mon, Jun 5, 2017 at 6:57 PM, Bert Gunter <bgunter.4...@gmail.com>
> wrote:
> >
> >> Reproducible example, please. -- In particular, what exactly does C look
> >> like?
> >>
> >> (You should know this by now).
> >>
> >> -- Bert
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> >> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >>
> >> On Mon, Jun 5, 2017 at 6:45 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
> >> >  Dear all,
> >> >
> >> > please could you advise on the R code I could use in order to do the
> >> > following operation :
> >> >
> >> > a. -- I have 2 lists of "genome coordinates" : a list is composed by
> >> > numbers that represent genome coordinates;
> >> >
> >> > let's say list N :
> >> >
> >> > n1
> >> >
> >> > n2
> >> >
> >> > n3
> >> >
> >> > n4
> >> >
> >> > and a list M:
> >> >
> >> > m1
> >> >
> >> > m2
> >> >
> >> > m3
> >> >
> >> > m4
> >> >
> >> > m5
> >> >
> >> > 2 -- and a data frame C, where for some pairs of coordinates (n,m)
> from
> >> the
> >> > lists above, we have a numerical intensity;
> >> >
> >> > for example :
> >> >
> >> > n1; m1; 100
> >> >
> >> > n1; m2; 300
> >> >
> >> > The question would be : what is the most efficient R code I could use
> in
> >> > order to integrate the list N, the list M, and the data frame C, in
> order
> >> > to obtain a DATA FRAME,
> >> >
> >> > -- list N as the columns names
> >> > -- list M as the rows names
> >> > -- the values in the cells of N * M, corresponding to the numerical
> >> values
> >> > in the data frame C.
> >> >
> >> > A little example would be :
> >> >
> >> >   n1  n2  n3 n4
> >> >
> >> >   m1  100  -   -   -
> >> >
> >> >   m2  300  -   -   -
> >> >
> >> >   m3   -   -   -   -
> >> >
> >> >   m4   -   -   -   -
> >> >
> >> >   m5   -   -   -   -
> >> > I wrote a script in perl, although i would like to do this in R
> >> > Many thanks ;)
> >> > -- bogdan
> >> >


Re: [R] integrating 2 lists and a data frame in R

2017-06-05 Thread Bogdan Tanasa

Dear Bert,

thank you for your response. Here is the piece of R code, given the 3 data
frames below:

N <- data.frame(N=c("n1","n2","n3","n4"))

M <- data.frame(M=c("m1","m2","m3","m4","m5"))

C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400))

How can I integrate N, M, and C so that we end up with a data frame with:


   - list N as the column names
   - list M as the row names
   - the values in the cells of N * M, corresponding to the numerical
   values in the data frame C.

More precisely, the result should be:

     n1   n2   n3   n4
m1  100  300    -    -
m2    -    -    -    -
m3    -    -  400    -
m4    -    -    -    -
m5    -    -    -    -

thank you !
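
A minimal sketch of one way to build such a table with plain matrix
indexing, using the N, M and C defined above. It assumes that cells without
an entry in C should be NA (xtabs would put 0 there instead);
stringsAsFactors = FALSE is set explicitly so the code behaves the same
before and after R 4.0.

```r
N <- data.frame(N = c("n1", "n2", "n3", "n4"), stringsAsFactors = FALSE)
M <- data.frame(M = c("m1", "m2", "m3", "m4", "m5"), stringsAsFactors = FALSE)
C <- data.frame(n = c("n1", "n2", "n3"), m = c("m1", "m1", "m3"),
                I = c(100, 300, 400), stringsAsFactors = FALSE)

# Empty M x N matrix: rows named after M, columns named after N
res <- matrix(NA_real_, nrow = nrow(M), ncol = nrow(N),
              dimnames = list(M$M, N$N))

# A two-column character matrix addresses cells by (row name, column name)
res[cbind(C$m, C$n)] <- C$I

# Convert to a data frame, as requested in the post
MN <- as.data.frame(res)
MN
```

Only the three (m, n) pairs listed in C are filled; everything else stays NA.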


On Mon, Jun 5, 2017 at 6:57 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:

> Reproducible example, please. -- In particular, what exactly does C look
> like?
>
> (You should know this by now).
>
> -- Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Jun 5, 2017 at 6:45 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
> >  Dear all,
> >
> > please could you advise on the R code I could use in order to do the
> > following operation :
> >
> > a. -- I have 2 lists of "genome coordinates" : a list is composed by
> > numbers that represent genome coordinates;
> >
> > let's say list N :
> >
> > n1
> >
> > n2
> >
> > n3
> >
> > n4
> >
> > and a list M:
> >
> > m1
> >
> > m2
> >
> > m3
> >
> > m4
> >
> > m5
> >
> > 2 -- and a data frame C, where for some pairs of coordinates (n,m) from
> the
> > lists above, we have a numerical intensity;
> >
> > for example :
> >
> > n1; m1; 100
> >
> > n1; m2; 300
> >
> > The question would be : what is the most efficient R code I could use in
> > order to integrate the list N, the list M, and the data frame C, in order
> > to obtain a DATA FRAME,
> >
> > -- list N as the columns names
> > -- list M as the rows names
> > -- the values in the cells of N * M, corresponding to the numerical
> values
> > in the data frame C.
> >
> > A little example would be :
> >
> >   n1  n2  n3 n4
> >
> >   m1  100  -   -   -
> >
> >   m2  300  -   -   -
> >
> >   m3   -   -   -   -
> >
> >   m4   -   -   -   -
> >
> >   m5   -   -   -   -
> > I wrote a script in perl, although i would like to do this in R
> > Many thanks ;)
> > -- bogdan
> >
>


[R] integrating 2 lists and a data frame in R

2017-06-05 Thread Bogdan Tanasa

 Dear all,

please could you advise on the R code I could use in order to do the
following operation :

1 -- I have 2 lists of "genome coordinates" : each list is composed of
numbers that represent genome coordinates;

let's say list N :

n1

n2

n3

n4

and a list M:

m1

m2

m3

m4

m5

2 -- and a data frame C, where for some pairs of coordinates (n,m) from the
lists above, we have a numerical intensity;

for example :

n1; m1; 100

n1; m2; 300

The question would be : what is the most efficient R code I could use in
order to integrate the list N, the list M, and the data frame C, in order
to obtain a DATA FRAME,

-- list N as the columns names
-- list M as the rows names
-- the values in the cells of N * M, corresponding to the numerical values
in the data frame C.

A little example would be :

      n1   n2   n3   n4
m1   100    -    -    -
m2   300    -    -    -
m3     -    -    -    -
m4     -    -    -    -
m5     -    -    -    -

I wrote a script in Perl, although I would like to do this in R.

Many thanks ;)

-- bogdan


Re: [R] about Rstudio

2017-05-29 Thread Bogdan Tanasa

Thank you Jeff. I shall re-install R using the recommendations from the
page you've sent.

Initially, I wanted to set up ./configure manually, in the following
way :

./configure --enable-R-shlib --prefix=/home/bogdan/R
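
As far as I understand, --enable-R-shlib builds R as a shared library
(libR.so on Linux), which RStudio looks for under the R home directory. A
small sketch, run from inside the R build in question, to see where R thinks
its home is and whether a shared library is present there. The file-name
pattern is an assumption that fits Linux and macOS builds only.

```r
# Home directory of the running R installation
r_home <- R.home()

# Directory where a build with --enable-R-shlib is expected to place libR
lib_dir <- file.path(r_home, "lib")

# TRUE if a shared R library is found (libR.so on Linux, libR.dylib on macOS)
has_shlib <- length(list.files(lib_dir, pattern = "^libR\\.(so|dylib)$")) > 0

cat("R home:        ", r_home, "\n")
cat("lib directory: ", lib_dir, "\n")
cat("libR present:  ", has_shlib, "\n")
```

If the reported lib directory differs from the path in the RStudio error
message, RStudio is most likely being pointed at a different R installation.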

On Mon, May 29, 2017 at 6:04 PM, Jeff Newmiller <jdnew...@dcn.davis.ca.us>
wrote:

> Did you follow the instructions at
> https://cran.r-project.org/bin/linux/ubuntu/README.html?
> --
> Sent from my phone. Please excuse my brevity.
>
> On May 29, 2017 2:07:27 PM PDT, Bogdan Tanasa <tan...@gmail.com> wrote:
> >Hi Bert, thank you for your email. yes, of course, i did the google
> >searches before posting, although the results did not help too much. At
> >the
> >end, I've copied the R executable from the installation folder to
> >/usr/local/lib/R/lib, and apparently it worked ...
> >
> >On Mon, May 29, 2017 at 2:04 PM, Bert Gunter <bgunter.4...@gmail.com>
> >wrote:
> >
> >> 1. Shouldn't you be posting this on the R Studio support site, not
> >> here?
> >>
> >> 2. I googled on :
> >>
> >>  "R lib path(/usr/local/lib/R/lib) not found" Ubuntu
> >>
> >> and got what looked like relevant hits.
> >>
> >> So I'd say it's time for you to do some homework...
> >>
> >> -- Bert
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming
> >along
> >> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >>
> >> On Mon, May 29, 2017 at 1:53 PM, Bogdan Tanasa <tan...@gmail.com>
> >wrote:
> >> > Dear all,
> >> >
> >> > please could you help with an advice : I have installed Rstudio on
> >my
> >> > Ubuntu PC, and when I initiate the application, it says :
> >> >
> >> >  "R lib path(/usr/local/lib/R/lib) not found"
> >> >
> >> > on my computer, at the path "/usr/local/lib/R/lib" there are the
> >folders
> >> :
> >> > bin
> >> > etc
> >> > site-library
> >> >
> >> > R is installed in another folder that is "/home/bogdan/R".
> >> >
> >> > how could I fix the error please ? many thanks,
> >> >
> >> > --bogdan
> >> >
>


Re: [R] about Rstudio

2017-05-29 Thread Bogdan Tanasa

Hi Bert, thank you for your email. Yes, of course, I did the Google
searches before posting, although the results did not help much. In the
end, I copied the R executable from the installation folder to
/usr/local/lib/R/lib, and apparently it worked ...

On Mon, May 29, 2017 at 2:04 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:

> 1. Shouldn't you be posting this on the R Studio support site, not here?
>
> 2. I googled on :
>
>  "R lib path(/usr/local/lib/R/lib) not found" Ubuntu
>
> and got what looked like relevant hits.
>
> So I'd say it's time for you to do some homework...
>
> -- Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, May 29, 2017 at 1:53 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
> > Dear all,
> >
> > please could you help with an advice : I have installed Rstudio on my
> > Ubuntu PC, and when I initiate the application, it says :
> >
> >  "R lib path(/usr/local/lib/R/lib) not found"
> >
> > on my computer, at the path "/usr/local/lib/R/lib" there are the folders
> :
> > bin
> > etc
> > site-library
> >
> > R is installed in another folder that is "/home/bogdan/R".
> >
> > how could I fix the error please ? many thanks,
> >
> > --bogdan
> >
>


[R] about Rstudio

2017-05-29 Thread Bogdan Tanasa

Dear all,

please could you help with some advice : I have installed RStudio on my
Ubuntu PC, and when I start the application, it says :

 "R lib path(/usr/local/lib/R/lib) not found"

on my computer, at the path "/usr/local/lib/R/lib" there are the folders :
bin
etc
site-library

R is installed in another folder that is "/home/bogdan/R".

how could I fix the error please ? many thanks,

--bogdan


[R] using "dcast" function ?

2017-05-26 Thread Bogdan Tanasa

Dear all, I would like to double-check with you the use of the "acast"
or "dcast" function from the "reshape2" package.

I am starting with a data frame Y of GENES and SAMPLES, e.g.:

  Cancer_Gene  Sample
1        ABL2  WT_10T
2        ABL2   WT_6T
3      ADGRA2   HB_8R
4        AFF4 EWS_13R

and I would like to have a data frame/matrix of CANCER_GENES * SAMPLES.

Would "dcast(Y, Cancer_Gene ~ Sample)" be correct? Thank you!

-- bogdan
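
For comparison, a base-R sketch that builds a gene-by-sample count matrix
from the toy data above without reshape2. It assumes the cells should hold
the number of times each gene is reported in each sample (0 when absent),
which may or may not be what is wanted here.

```r
# Toy data copied from the post above
Y <- data.frame(
  Cancer_Gene = c("ABL2", "ABL2", "ADGRA2", "AFF4"),
  Sample      = c("WT_10T", "WT_6T", "HB_8R", "EWS_13R"),
  stringsAsFactors = FALSE
)

# Cross-tabulate genes against samples; each cell is a count
counts <- table(Gene = Y$Cancer_Gene, Sample = Y$Sample)

# Convert the 2-D table to a data frame with genes as row names
mat <- as.data.frame.matrix(counts)
mat
```

Comparing this with the output of the dcast call is a quick way to check
that the reshaped table means what one expects.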


Re: [R] differential use of R version

2017-04-05 Thread Bogdan Tanasa

Thank you Rui. I have been using a package (TitanCNA) in BioC on an older
version of R (3.3.1);
now, after switching to R 3.3.3, I am getting an error (below), and I do not
know how to fix it. Any suggestions are welcome.

Error in `[<-.data.frame`(`*tmp*`, indRef, "refCount", value = NULL) :
  replacement has length zero

On Wed, Apr 5, 2017 at 2:52 AM, Rui Barradas <ruipbarra...@sapo.pt> wrote:

> Hello,
>
> Unless you have strong reasons why not, use the most recent one, R 3.3.
>
> Hope this helps,
>
> Rui Barradas
>
>
> On 05-04-2017 03:47, Bogdan Tanasa wrote:
>
>> Dear all,
>>
>> please could you advise me on the following :
>>
>> on a server, in a folder "x86_64-redhat-linux-gnu-library", i have 2
>> versions of R (below), with the corresponding BioC libraries :
>>
>> > 3.2
>> > 3.3
>>
>> how could i preferentially use an R version or the other (with the related
>> BioC libraries) ?
>>
>> thank you,
>>
>> -- bogdan
>>
>>
>>


[R] differential use of R version

2017-04-04 Thread Bogdan Tanasa

Dear all,

please could you advise me on the following :

on a server, in a folder "x86_64-redhat-linux-gnu-library", I have 2
versions of R (below), with the corresponding BioC libraries :

> 3.2
> 3.3

How can I select one R version or the other (with the related BioC
libraries)?

thank you,

-- bogdan
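
Which R binary gets started is decided outside R, but inside a session one
can at least check the running version and point it at the matching library
folder. A rough sketch, assuming the folders 3.2 and 3.3 sit under
~/R/x86_64-redhat-linux-gnu-library (that parent path is a guess and should
be adjusted to the real location on the server).

```r
# Major.minor of the running R, e.g. "3.3" for R 3.3.3 (patch level dropped)
ver <- paste(R.version$major, sub("\\..*$", "", R.version$minor), sep = ".")

# Hypothetical location of the per-version package library
lib <- path.expand(file.path("~", "R", "x86_64-redhat-linux-gnu-library", ver))

# Put the version-specific library first on the search path, if it exists
if (dir.exists(lib)) .libPaths(c(lib, .libPaths()))

cat("Running", R.version.string, "\n")
print(.libPaths())
```

Packages are then looked up first in the folder that matches the running R
version, so libraries built for 3.2 and 3.3 are not mixed.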


[R] tutorials on bootstrap, jackknife, permutation, randomization tests

2017-02-26 Thread Bogdan Tanasa

Dear all,

please could anyone recommend a good website/resource with tutorials
on bootstrap, jackknife, permutation, and randomization tests in R, with
applications to biology (molecular biology)? Thanks,

-- bogdan


[R] comparing 2 long lists in R

2015-09-24 Thread Bogdan Tanasa

Dear all,

please could you advise on a computationally quick way to compare and merge
2 long lists in R;
the lists are of the following type, for example :

in list 1 :

chromosome, coordinateA, coordinateB, value1
chromosome, coordinateC, coordinateD, value2
etc

in list 2 :

chromosome, coordinateX, coordinateY, value6
chromosome, coordinateZ, coordinateT, value8
etc

In the unified list, if coordinateA=coordinateX, and
coordinateB=coordinateY, then we write :

chromosome, coordinateA, coordinateB, value1, coordinateX, coordinateY,
value6,

otherwise, we write the individual values :

chromosome, coordinateA, coordinateB, value1,
chromosome, coordinateX, coordinateY, value6,

thanks,

bogdan
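
If the two "lists" are in fact data frames, the rule described above can be sketched with base R `merge()`. The column names and values below are invented for illustration; rows whose chromosome and both coordinates agree are joined, and all other rows are kept with `NA` in the missing value column.

```r
# Hypothetical stand-ins for the two lists
list1 <- data.frame(chromosome = c("chr1", "chr1"),
                    start      = c(100, 500),
                    end        = c(200, 600),
                    value1     = c(1.5, 2.5))
list2 <- data.frame(chromosome = c("chr1", "chr1"),
                    start      = c(100, 900),
                    end        = c(200, 950),
                    value2     = c(6, 8))

# Full outer join on chromosome and both coordinates:
# matching intervals end up on one row, the rest are kept individually
unified <- merge(list1, list2,
                 by = c("chromosome", "start", "end"),
                 all = TRUE)
print(unified)
```

For very long tables the same join can be done with data.table or dplyr; overlap-based (rather than exact) matching of genomic intervals is a different problem, for which the Bioconductor packages mentioned in the replies may be a better fit.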

Re: [R] comparing 2 long lists in R

2015-09-24 Thread Bogdan Tanasa

Dear Bert and Sarah, thank you for your suggestions. Yes, I came across
"dplyr" that has a few functions already implemented, thanks again !

On Thu, Sep 24, 2015 at 1:17 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:

> Also, in addition to what Sarah told you, have you checked on the
> Bioconductor site, as this sounds like the sort of thing that they may
> well have something for already.
>
> ... and you've posted here often enough that you shouldn't still be
> posting HTML and you should know about toy examples!
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
>-- Clifford Stoll
>
>
> On Thu, Sep 24, 2015 at 12:57 PM, Sarah Goslee <sarah.gos...@gmail.com>
> wrote:
> > merge() most likely, but: are these really lists in the R sense?
> >
> > The correct answer depends on what the format actually is; you need to
> > use dput() or some other unambiguous way of providing sample data.
> >
> > Without a reproducible example that includes some sample data provided
> > using dput() (fake is fine), the code you used, and some clear idea of
> > what output you expect, it's impossible to figure out how to help you.
> > Here are some suggestions for creating a good reproducible example:
> >
> http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> >
> > Sarah
> >
> >
> > On Thu, Sep 24, 2015 at 3:43 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
> >> Dear all,
> >>
> >> please could you advise on a computationally quick way to compare and
> merge
> >> 2 long lists in R;
> >> the lists are of the following type, for example :
> >>
> >> <> in list 1 :
> >>
> >> chromosome, coordinateA, coordinateB, value1
> >> chromosome, coordinateC, coordinateC, value2,
> >> etc
> >>
> >> <> in list 2 :
> >>
> >> chromosome, coordinateX, coordinateY, value6
> >> chromosome, coordinateZ, coordinateT, value8,
> >> etc
> >>
> >> In the unified list, if coordinateA=coordinateX, and
> >> coordinateB=coordinateY, then we write :
> >>
> >> chromosome, coordinateA, coordinateB, value1, coordinateX, coordinateY,
> >> value6,
> >>
> >> otherwise, we write the individual values :
> >>
> >> chromosome, coordinateA, coordinateB, value1,
> >> chromosome, coordinateX, coordinateY, value6,
> >>
> >> thanks,
> >>
> >> bogdan
> >>
> >> [[alternative HTML version deleted]]
> >>
> > and please don't post in HTML.
> >
> > --
> > Sarah Goslee
> > http://www.functionaldiversity.org

Re: [R] scaling loess curves

2015-09-11 Thread Bogdan Tanasa

Hi Petr,

thank you for your reply regarding the scaling of loess curves. Our
situation is the following :

we do have 2 experiments, and for each experiment, the set of data is in
the following format : "nodeA (chr, start, end) - node B (chr, start, end)
- interaction intensity (between A and B)".

We are trying to SCALE the LOESS curves ( for the graphs "distance between
node A and node B" vs "intensity") for experiment1 vs experiment2, in order
to make the experiments directly comparable.

I have attached 2 figures with the LOESS curves for experiment1 and
experiment2 to my email. Should you have any suggestions, please let me
know. Thanks a lot,


-- bogdan

On Mon, Sep 7, 2015 at 7:34 AM, PIKAL Petr <petr.pi...@precheza.cz> wrote:

> Hi
>
> what about xlim or ylim?
>
> Cheers
> Petr
>
>
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bogdan
> > Tanasa
> > Sent: Monday, September 07, 2015 8:00 AM
> > To: r-help
> > Subject: [R] scaling loess curves
> >
> > Dear all,
> >
> > please could you advise about a method to scale 2 plots of LOESS
> > curves.
> > More specifically, we do have 2 sets of 5C data, and the loess plots
> > reflect the relationship between INTENSITY and DISTANCE (please see the
> > R code below).
> >
> > I am looking for a method/formula to scale these 2 LOESS plots and make
> > them directly comparable.
> >
> > many thanks,
> >
> > -- bogdan
> >
> >
> >
> > -- the R code --
> >
> >
> >
> > a <- read.delim("a",header=T)
> > qplot(data=a,distance,intensity)+geom_smooth(method = "loess", size =
> > 1,
> > span=0.01)+xlab("distance")+ylab("intensity")
> >
> >
> >
> > b <- read.delim("b",header=T)
> > qplot(data=b,distance,intensity)+geom_smooth(method = "loess", size =
> > 1,
> > span=0.01)+xlab("distance")+ylab("intensity")
> >

Re: [R] scaling loess curves

2015-09-11 Thread Bogdan Tanasa

thanks Petr. It shall work ;)
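
For reference, Petr's suggestion quoted below can be written out as the following sketch. The two data frames are simulated stand-ins (the real "a" and "b" files are not available here), `span = 0.5` replaces the original `span = 0.01` because the toy data is small, and the plotting part only runs if ggplot2 is installed.

```r
set.seed(1)

# Simulated stand-ins for the two experiments (hypothetical data)
a <- data.frame(distance = 1:100)
a$intensity <- 50 * exp(-a$distance / 30) + rnorm(100)
b <- data.frame(distance = 1:100)
b$intensity <- 80 * exp(-b$distance / 30) + rnorm(100)

# Label each data set, then stack them into one data frame
a$set <- "a"
b$set <- "b"
complete <- rbind(a, b)

# One loess curve per set, drawn on the same axes
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(complete, aes(x = distance, y = intensity, colour = set)) +
    geom_point(alpha = 0.3) +
    geom_smooth(method = "loess", span = 0.5) +
    xlab("distance") + ylab("intensity")
  print(p)
}
```

Mapping `colour = set` is what makes ggplot2 fit and draw a separate curve for each experiment.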

On Fri, Sep 11, 2015 at 4:34 AM, PIKAL Petr <petr.pi...@precheza.cz> wrote:

> Hi
>
>
>
> you need to merge those two data frames with a column indicating given set.
>
>
>
> Without data it is only a guess but
>
>
>
> a$set <- "a"
>
> b$set <- "b"
>
>
>
> complete <- rbind(a,b)
>
>
>
> p <-ggplot(complete, aes(x=distance, y=intensity, colour=set))
>
> p+geom_smooth(method = "loess", size = 1,
> span=0.01)+xlab("distance")+ylab("intensity")
>
> shall do it.
>
>
>
> Cheers
>
> Petr
>
>
>
> *From:* Bogdan Tanasa [mailto:tan...@gmail.com]
> *Sent:* Friday, September 11, 2015 10:03 AM
>
> *To:* PIKAL Petr; r-help
> *Subject:* Re: [R] scaling loess curves
>
>
>
> Dear Petr,
>
> thank you very much, it helped. On a side note, shall I have 2 plots and 2
> loess curves (as below), is there any way in ggplot2 to overlay these 2
> graphs for "a" and "b" ? much thanks again !
>
> qplot(data=a,distance,intensity)+geom_smooth(method = "loess", size = 1,
> span=0.01)+xlab("distance")+ylab("intensity")
>
> qplot(data=b,distance,intensity)+geom_smooth(method = "loess", size = 1,
> span=0.01)+xlab("distance")+ylab("intensity")
>
> -- bogdan
>
>
>
> On Fri, Sep 11, 2015 at 3:06 AM, PIKAL Petr <petr.pi...@precheza.cz>
> wrote:
>
> Hi
>
>
>
> based on your data maybe using logarithmic y scale shall give you desired
> result.
>
>
>
>
> http://stackoverflow.com/questions/4699493/transform-only-one-axis-to-log10-scale-with-ggplot2
>
>
>
> Or you can recalculate intensity to scale 100-0 (or any other suitable
> scale).
>
>
>
> ?rescale
>
>
>
> Cheers
>
> Petr
>
>
>
> *From:* Bogdan Tanasa [mailto:tan...@gmail.com]
> *Sent:* Friday, September 11, 2015 8:14 AM
> *To:* PIKAL Petr; r-help
> *Subject:* Re: [R] scaling loess curves
>
>
>
> Hi Petr,
>
> thank you for your reply regarding the scaling of loess curves. Our
> situation is the following :
>
> we do have 2 experiments, and for each experiment, the set of data is in
> the following format : "nodeA (chr, start, end) - node B (chr, start, end)
> - interaction intensity (between A and B)".
>
> We are trying to SCALE the LOESS curves ( for the graphs "distance between
> node A and node B" vs "intensity") for experiment1 vs experiment2, in order
> to make the experiments directly comparable.
>
> I have attached 2 figures with the LOESS curves for experiment1 and
> experiment2 to my email. Shall you have any suggestions, please let me
> know. Thanks a lot,
>
>
>
> -- bogdan
>
>
>
> On Mon, Sep 7, 2015 at 7:34 AM, PIKAL Petr <petr.pi...@precheza.cz> wrote:
>
> Hi
>
> what about xlim or ylim?
>
> Cheers
> Petr
>
>
>
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bogdan
> > Tanasa
> > Sent: Monday, September 07, 2015 8:00 AM
> > To: r-help
> > Subject: [R] scaling loess curves
> >
> > Dear all,
> >
> > please could you advise about a method to scale 2 plots of LOESS
> > curves.
> > More specifically, we do have 2 sets of 5C data, and the loess plots
> > reflect the relationship between INTENSITY and DISTANCE (please see the
> > R code below).
> >
> > I am looking for a method/formula to scale these 2 LOESS plots and make
> > them directly comparable.
> >
> > many thanks,
> >
> > -- bogdan
> >
> >
> >
> > -- the R code --
> >
> >
> >
> > a <- read.delim("a",header=T)
> > qplot(data=a,distance,intensity)+geom_smooth(method = "loess", size =
> > 1,
> > span=0.01)+xlab("distance")+ylab("intensity")
> >
> >
> >
> > b <- read.delim("b",header=T)
> > qplot(data=b,distance,intensity)+geom_smooth(method = "loess", size =
> > 1,
> > span=0.01)+xlab("distance")+ylab("intensity")
> >
>

Re: [R] scaling loess curves

2015-09-11 Thread Bogdan Tanasa

Dear Petr,

thank you very much, it helped. On a side note, if I have 2 plots and 2
loess curves (as below), is there any way in ggplot2 to overlay these 2
graphs for "a" and "b"? Many thanks again!

qplot(data=a,distance,intensity)+geom_smooth(method = "loess", size = 1,
span=0.01)+xlab("distance")+ylab("intensity")

qplot(data=b,distance,intensity)+geom_smooth(method = "loess", size = 1,
span=0.01)+xlab("distance")+ylab("intensity")
-- bogdan

On Fri, Sep 11, 2015 at 3:06 AM, PIKAL Petr <petr.pi...@precheza.cz> wrote:

> Hi
>
>
>
> based on your data maybe using logarithmic y scale shall give you desired
> result.
>
>
>
>
> http://stackoverflow.com/questions/4699493/transform-only-one-axis-to-log10-scale-with-ggplot2
>
>
>
> Or you can recalculate intensity to scale 100-0 (or any other suitable
> scale).
>
>
>
> ?rescale
>
>
>
> Cheers
>
> Petr
>
>
>
> *From:* Bogdan Tanasa [mailto:tan...@gmail.com]
> *Sent:* Friday, September 11, 2015 8:14 AM
> *To:* PIKAL Petr; r-help
> *Subject:* Re: [R] scaling loess curves
>
>
>
> Hi Petr,
>
> thank you for your reply regarding the scaling of loess curves. Our
> situation is the following :
>
> we do have 2 experiments, and for each experiment, the set of data is in
> the following format : "nodeA (chr, start, end) - node B (chr, start, end)
> - interaction intensity (between A and B)".
>
> We are trying to SCALE the LOESS curves ( for the graphs "distance between
> node A and node B" vs "intensity") for experiment1 vs experiment2, in order
> to make the experiments directly comparable.
>
> I have attached 2 figures with the LOESS curves for experiment1 and
> experiment2 to my email. Shall you have any suggestions, please let me
> know. Thanks a lot,
>
>
>
> -- bogdan
>
>
>
> On Mon, Sep 7, 2015 at 7:34 AM, PIKAL Petr <petr.pi...@precheza.cz> wrote:
>
> Hi
>
> what about xlim or ylim?
>
> Cheers
> Petr
>
>
>
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bogdan
> > Tanasa
> > Sent: Monday, September 07, 2015 8:00 AM
> > To: r-help
> > Subject: [R] scaling loess curves
> >
> > Dear all,
> >
> > please could you advise about a method to scale 2 plots of LOESS
> > curves.
> > More specifically, we do have 2 sets of 5C data, and the loess plots
> > reflect the relationship between INTENSITY and DISTANCE (please see the
> > R code below).
> >
> > I am looking for a method/formula to scale these 2 LOESS plots and make
> > them directly comparable.
> >
> > many thanks,
> >
> > -- bogdan
> >
> >
> >
> > -- the R code --
> >
> >
> >
> > a <- read.delim("a",header=T)
> > qplot(data=a,distance,intensity)+geom_smooth(method = "loess", size =
> > 1,
> > span=0.01)+xlab("distance")+ylab("intensity")
> >
> >
> >
> > b <- read.delim("b",header=T)
> > qplot(data=b,distance,intensity)+geom_smooth(method = "loess", size =
> > 1,
> > span=0.01)+xlab("distance")+ylab("intensity")
> >
>

[R] scaling loess curves

2015-09-07 Thread Bogdan Tanasa

Dear all,

please could you advise about a method to scale 2 plots of LOESS curves.
More specifically, we do have 2 sets of 5C data, and the loess plots
reflect the relationship between INTENSITY and DISTANCE (please see the R
code below).

I am looking for a method/formula to scale these 2 LOESS plots and make
them directly comparable.

many thanks,

-- bogdan



-- the R code --



a <- read.delim("a",header=T)
qplot(data=a,distance,intensity)+geom_smooth(method = "loess", size = 1,
span=0.01)+xlab("distance")+ylab("intensity")



b <- read.delim("b",header=T)
qplot(data=b,distance,intensity)+geom_smooth(method = "loess", size = 1,
span=0.01)+xlab("distance")+ylab("intensity")
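
One simple way to put two such curves on a common scale, sketched here with base R only, is to rescale each intensity vector to the range 0-100 before fitting loess. This is only one possible normalization and whether it suits 5C data is an assumption; the data below is simulated, and `rescale100` is a helper made up for this example.

```r
set.seed(1)

# Simulated stand-ins for the two experiments (hypothetical data)
sim <- function(n, scale) {
  distance <- sort(runif(n, 1, 1000))
  intensity <- scale * exp(-distance / 300) + rnorm(n, sd = 0.05 * scale)
  data.frame(distance, intensity)
}
a <- sim(200, 50)
b <- sim(200, 500)

# Min-max rescaling of a numeric vector to the range 0..100
rescale100 <- function(x) 100 * ((x - min(x)) / (max(x) - min(x)))
a$scaled <- rescale100(a$intensity)
b$scaled <- rescale100(b$intensity)

# Fit one loess curve per experiment on the rescaled intensities
fit_a <- loess(scaled ~ distance, data = a, span = 0.5)
fit_b <- loess(scaled ~ distance, data = b, span = 0.5)

# Evaluate both curves on a shared grid of distances for comparison
grid <- data.frame(distance = seq(50, 950, by = 50))
curves <- data.frame(grid,
                     a = predict(fit_a, grid),
                     b = predict(fit_b, grid))
print(head(curves))
```

Min-max scaling is sensitive to outliers, so a quantile-based or log-scale variant may be more robust for real data.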

Re: [R] reading files with name columns and row columns

2015-09-02 Thread Bogdan Tanasa

that is great, thank you Bill for time and help ;) !

On Wed, Sep 2, 2015 at 4:36 PM, William Dunlap <wdun...@tibco.com> wrote:

>   y <- as.matrix(read.table("FILE_NAME",header=T,row.names=1))
>   colnames(y) <- gsub("X","", colnames(y))
>
> Use read.table's check.names=FALSE argument so it won't mangle
> the column names instead of trying to demangle them with gsub()
> afterwards.
>
> E.g.,
>   txt <- "   50%  100%\nA   5     8\nB  13    14\n"
>   cat(txt)
>   #   50%  100%
>   #A   5     8
>   #B  13    14
>   read.table(text=txt, head=TRUE, row.names=1)
>   #  X50. X100.
>   #A    5     8
>   #B   13    14
>   read.table(text=txt, head=TRUE, row.names=1, check.names=FALSE)
>   #  50% 100%
>   #A   5    8
>   #B  13   14
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Wed, Sep 2, 2015 at 4:08 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
>
>> Thanks, Bert ! I solved the situation in the meanwhile, by using :
>>
>> y <- as.matrix(read.table("FILE_NAME",header=T,row.names=1))
>>
>> colnames(y) <- gsub("X","", colnames(y))
>>
>>
>> On Wed, Sep 2, 2015 at 3:59 PM, Bert Gunter <bgunter.4...@gmail.com>
>> wrote:
>>
>> > Please read the Help file carefully before posting:
>> >
>> > "read.table is not the right tool for reading large matrices,
>> > especially those with many columns: it is designed to read data frames
>> > which may have columns of very different classes. Use scan instead for
>> > matrices."
>> >
>> > But the answer to your question can be found in
>> >
>> > ?make.names
>> >
>> > for what constitutes a syntactically valid name in R.
>> >
>> >
>> > Cheers,
>> > Bert
>> >
>> > Bert Gunter
>> >
>> > "Data is not information. Information is not knowledge. And knowledge
>> > is certainly not wisdom."
>> >-- Clifford Stoll
>> >
>> >
>> > On Wed, Sep 2, 2015 at 3:11 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
>> > > Dear all,
>> > >
>> > > would appreciate a piece of help with a simple question: I am reading
>> in
>> > R
>> > > a file that is formatted as a matrix (an example is shown below,
>> although
>> > > it is more complex, a matrix of 1000 * 1000 ):
>> > >
>> > > the names of the columns are 0, 1, 4, 8, etc
>> > > the names of the rows are 0, 1, 4, 8, etc
>> > >
>> > >0 20 40
>> > > 0  0   0   0
>> > > 20  0   0   0
>> > > 40  0   0   0
>> > >
>> > > shall I use the command :
>> > >
>> > > y <- read.table("file",row.names=1, header=T)
>> > >
>> > > the results is :
>> > >
>> > >> y[1:3,1:3]
>> > >X0 X20 X40
>> > > 0   0   0   0
>> > > 20  0   0   0
>> > > 40  0   0   0
>> > >
>> > > The question is : why R adds an X to the names of the columns eg X0,
>> > > X2, X4, when it shall be only 0, 2, 4 ? thanks !
>> > >
>> > > -- bogdan

Re: [R] reading files with name columns and row columns

2015-09-02 Thread Bogdan Tanasa

Thanks, Bert ! I solved the situation in the meanwhile, by using :

y <- as.matrix(read.table("FILE_NAME",header=T,row.names=1))

colnames(y) <- gsub("X","", colnames(y))


On Wed, Sep 2, 2015 at 3:59 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:

> Please read the Help file carefully before posting:
>
> "read.table is not the right tool for reading large matrices,
> especially those with many columns: it is designed to read data frames
> which may have columns of very different classes. Use scan instead for
> matrices."
>
> But the answer to your question can be found in
>
> ?make.names
>
> for what constitutes a syntactically valid name in R.
>
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
>-- Clifford Stoll
>
>
> On Wed, Sep 2, 2015 at 3:11 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
> > Dear all,
> >
> > would appreciate a piece of help with a simple question: I am reading in
> R
> > a file that is formatted as a matrix (an example is shown below, although
> > it is more complex, a matrix of 1000 * 1000 ):
> >
> > the names of the columns are 0, 1, 4, 8, etc
> > the names of the rows are 0, 1, 4, 8, etc
> >
> >0 20 40
> > 0  0   0   0
> > 20  0   0   0
> > 40  0   0   0
> >
> > shall I use the command :
> >
> > y <- read.table("file",row.names=1, header=T)
> >
> > the results is :
> >
> >> y[1:3,1:3]
> >X0 X20 X40
> > 0   0   0   0
> > 20  0   0   0
> > 40  0   0   0
> >
> > The question is : why R adds an X to the names of the columns eg X0,
> > X2, X4, when it shall be only 0, 2, 4 ? thanks !
> >
> > -- bogdan

[R] reading files with name columns and row columns

2015-09-02 Thread Bogdan Tanasa

Dear all,

I would appreciate some help with a simple question: I am reading into R
a file that is formatted as a matrix (an example is shown below, although
the real one is larger, a matrix of 1000 * 1000 ):

the names of the columns are 0, 1, 4, 8, etc
the names of the rows are 0, 1, 4, 8, etc

   0 20 40
0  0   0   0
20  0   0   0
40  0   0   0

if I use the command :

y <- read.table("file", row.names=1, header=TRUE)

the result is :

> y[1:3,1:3]
   X0 X20 X40
0   0   0   0
20  0   0   0
40  0   0   0

The question is : why does R add an X to the names of the columns (e.g. X0,
X20, X40), when they should be only 0, 20, 40 ? thanks !

-- bogdan
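
For readers landing on this question: the X prefix comes from `make.names()`, which `read.table()` applies to the header when `check.names = TRUE` (the default). A minimal sketch with an inline copy of the small example matrix (the text string stands in for the real file):

```r
# Inline stand-in for the file: 3 column names, then rows with a leading row name
txt <- "0 20 40\n0 0 0 0\n20 0 0 0\n40 0 0 0\n"

# Default behaviour: column names are made syntactically valid, hence "X0" etc.
y_default <- read.table(text = txt, header = TRUE, row.names = 1)
print(colnames(y_default))

# check.names = FALSE keeps the names exactly as they appear in the file
y <- as.matrix(read.table(text = txt, header = TRUE, row.names = 1,
                          check.names = FALSE))
print(colnames(y))
print(y["20", "40"])   # index by the original character names
```

Keeping the names as character strings means indexing must use quotes, as in `y["20", "40"]`, since `y[20, 40]` would be a positional lookup.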

Re: [R] sorting a dataframe

2015-06-21 Thread Bogdan Tanasa

Thank you all again. It works with :

library(gtools)
dat[mixedorder(A),]

considering :

A = c("A1","A10","A11","A2")
B = c(1,2,3,4)
dat = data.frame(A,B)


On Sun, Jun 21, 2015 at 10:52 AM, Bert Gunter bgunter.4...@gmail.com
wrote:

 Diego:

 Nonsense! Look at the results of your code -- you have failed to order
 the results as had been requested by the OP. It's also unnecessarily
 complicated. The following suffices (where I have used regular
 expressions rather than substring() to get the numeric part of the strings
 -- **assuming** that the strings always consist of letters followed by
 numeric digits).

  A = c("A1","A10","A11","A2")
  B = c(1,2,3,4)
  dat = data.frame(A,B)

  numbs <- as.numeric(gsub("[^[:digit:]]+","",dat$A))
  newdat <- dat[order(numbs),]
  newdat
      A B
  1  A1 1
  4  A2 4
  2 A10 2
  3 A11 3

 Cheers,
 Bert



 Bert Gunter

 Data is not information. Information is not knowledge. And knowledge
 is certainly not wisdom.
-- Clifford Stoll


 On Sat, Jun 20, 2015 at 6:29 PM, Diego Miro d.miro1...@gmail.com wrote:
  Bogdan,
 
  Follow my suggestion.
 
  letter <- substring(A, 1, 1)
  number <- substring(A, 2, nchar(A))
  new.data <- paste0(letter, formatC(as.numeric(number), width = 2, flag =
  "0"))
  On 20/06/2015 21:21, Bogdan Tanasa tan...@gmail.com wrote:
 
  Dear all,
 
  I am looking for a suggestion please regarding sorting a dataframe with
  alphanumerical components :
 
  let's assume we have :
 
  A = c("A1","A10","A11","A2")
  B = c(1,2,3,4)
 
  C = data.frame(A,B)
 
  how could I sort C data.frame in such a way that we have at the end :
 
  C$A in the order : A1, A2, A10, A11. thank you very much,
 
  -- bogdan
 

[R] sorting a dataframe

2015-06-20 Thread Bogdan Tanasa

Dear all,

I am looking for a suggestion please regarding sorting a dataframe with
alphanumerical components :

let's assume we have :

A = c("A1","A10","A11","A2")
B = c(1,2,3,4)

C = data.frame(A,B)

how could I sort C data.frame in such a way that we have at the end :

C$A in the order : A1, A2, A10, A11. thank you very much,

-- bogdan
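
A base-R sketch of this kind of "natural" ordering, assuming every label is a run of letters followed by a run of digits. The extra B-prefixed labels are added only to show that the prefix is sorted first; they are not part of the original example.

```r
A <- c("A1", "A10", "B2", "A11", "A2", "B10")
B <- seq_along(A)
C <- data.frame(A, B, stringsAsFactors = FALSE)

# Split each label into its letter prefix and its numeric suffix
prefix <- sub("[0-9]+$", "", C$A)
number <- as.numeric(sub("^[^0-9]+", "", C$A))

# Order by prefix first, then by the number as a number (not as text)
sorted <- C[order(prefix, number), ]
print(sorted)
```

Labels that do not follow the letters-then-digits pattern would need a more general routine, such as a dedicated mixed-sorting function from a contributed package.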

Re: [R] sorting a dataframe

2015-06-20 Thread Bogdan Tanasa

thank you all, it is working fine. happy weekend ;) !

On Sat, Jun 20, 2015 at 6:15 PM, David Winsemius dwinsem...@comcast.net
wrote:


 On Jun 20, 2015, at 5:18 PM, Bogdan Tanasa wrote:

  Dear all,
 
  I am looking for a suggestion please regarding sorting a dataframe with
  alphanumerical components :
 
  let's assume we have :
 
  A = c("A1","A10","A11","A2")
  B = c(1,2,3,4)
 
  C = data.frame(A,B)
 
  how could I sort C data.frame in such a way that we have at the end :
 
  C$A in the order : A1, A2, A10, A11. thank you very much,

 Do a search on `mixedorder` and `mixedsort`. They are function names in
 pkg:gtools

 --

 David Winsemius
 Alameda, CA, USA



Re: [R] display a matrix in colors

2015-06-12 Thread Bogdan Tanasa

Thanks, Jim. Yes, a good idea; I will look at it when I find more time.
Typically all these little projects I am doing or asking about are done in
a big rush. I also found some good tutorials about heatmap.2:

http://www.inside-r.org/packages/cran/gplots/docs/heatmap.2

http://sebastianraschka.com/Articles/heatmaps_in_r.html
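
For an unclustered display with hand-picked colour ranges, base R `image()` may already be enough. A minimal sketch, assuming the matrix holds only the values 0, 10, 20, 30 and 100; the example matrix, the break points and the colours are all invented for illustration.

```r
# Small hypothetical matrix with the values mentioned in the thread
m <- matrix(c( 0,  10,  20,
              30, 100,   0,
              10,  30, 100), nrow = 3, byrow = TRUE)

# One colour per value class; breaks sit between the allowed values
breaks <- c(-5, 5, 15, 25, 65, 105)
cols   <- c("white", "yellow", "orange", "red", "darkred")

# image() draws z[i, j] at (x[i], y[j]); transpose and flip the rows
# so that the picture has the same orientation as the printed matrix
image(x = 1:ncol(m), y = 1:nrow(m), z = t(m[nrow(m):1, ]),
      breaks = breaks, col = cols, axes = FALSE,
      xlab = "column", ylab = "row")
box()
```

No clustering or reordering is applied, so rows and columns stay in the order of the input matrix.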

On Fri, Jun 12, 2015 at 8:41 PM, Jim Lemon drjimle...@gmail.com wrote:

 Hi Bogdan,
 Have a look at color2D.matplot in the plotrix package - it's a bit
 different from image and heatmap.

 Jim


 On Sat, Jun 13, 2015 at 10:39 AM, Bogdan Tanasa tan...@gmail.com wrote:
  Hi David, thanks, yes,
 
  heatmap provides clustered heatmap, and I am looking for an unclustered
  display of a matrix, and only to set up the color ranges. thanks for your
  help !
 
  -- bogdan
 
  On Fri, Jun 12, 2015 at 5:36 PM, David Winsemius dwinsem...@comcast.net
 
  wrote:
 
 
 
  On Jun 12, 2015, at 5:28 PM, Bogdan Tanasa wrote:
 
   Dear all,
  
   please could you advise about the most convenient functions or
 libraries
  to
   use in order to display a matrix as a heatmap/a color matrix ?
  
   the matrix contains the values of 0, 10, 20, 30 or 100. thank you !
 
  What? You must have looked at `?heatmap` and the links on its help page,
  so what really is the question?
 
  
   -- bogdan
  
 
  David Winsemius
  Alameda, CA, USA
 
 
 