Re: [R] ggplot2 error bars and convex hulls

2020-04-21 Thread Ivan Calandra
Thanks Rui for these 2 possibilities; I'll have a look.

Anyone with pointers for the error bars issue?

Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

On 21/04/2020 19:06, Rui Barradas wrote:
> Hello,
>
> As for convex hulls, there is an example of how to construct a
> stat_hull in
>
> vignette("extending-ggplot2", package = "ggplot2")
>
> There is also a geom_hull in a GitHub package:
>
> devtools::install_github("cmartin/ggConvexHull")
>
>
> Hope this helps,
>
> Rui Barradas
>
> At 17:02 on 21/04/20, Ivan Calandra wrote:
>> Dear useRs,
>>
>> I would like to have horizontal and vertical error bars extending from
>> the means on two continuous variables.
>>
>> This would be the "manual" way of doing it, computing the mean and sd
>> (or whatever stats) beforehand and then calling geom_errorbar() and
>> geom_errorbarh() with appropriate coordinates:
>> https://stackoverflow.com/questions/12570816/ggplot-scatter-plot-of-two-groups-with-superimposed-means-with-x-and-y-error-bar
>>
>>
>> But I am a bit surprised that there is no "built-in" way of doing it
>> with ggplot2. I mean not having to compute mean and sd beforehand and
>> not having to call both geom_errorbar() and geom_errorbarh() with a new
>> set of aesthetics.
>>
>> In the same idea, I am looking at convex hulls and I was also expecting
>> to have a built-in way to do this in ggplot2. But I have only found this
>> "manual" way:
>> https://stats.stackexchange.com/questions/22805/how-to-draw-neat-polygons-around-scatterplot-regions-in-ggplot2
>>
>>
>> Thank you in advance for any pointer.
>> Ivan
>>
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subtracting Data Frame With a Different Number of Rows

2020-04-21 Thread William Michels via R-help
Hi Phillip,

You have two choices here: 1. Manually enter the missing rows into
your individual.df using rbind(), and cbind() the overall.df and
individual.df dataframes together (assuming the rows line up
properly), or 2. Use merge() to perform an SQL-like "Left Join", and
copy values from the "overall" columns to fill in missing values in
the "indiv" columns (imputation). Below is code starting from .tsv
files showing the second (merge) method. Note: I've only included the
first 4 rows of data after the merge command (there are 24 rows
total):

> overall <- read.delim("overall.R", sep="\t")
> indiv <- read.delim("individual.R", sep="\t")
> merge(overall, indiv, all.x=TRUE, by.x=c("RunnerCode", "Outs"),
+       by.y=c("RunnerCode", "Outs"))

   RunnerCode Outs X.x MeanRuns.x X.y MeanRuns.y
1  BasesEmpty    0   1  0.5137615   1  0.4262295
2  BasesEmpty    1   9  0.3963801   8  0.5238095
3  BasesEmpty    2  17  0.4191011  15  0.3469388
4 BasesLoaded    0   8  3.2173913  NA         NA
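
A minimal sketch of the two remaining steps (imputation, then subtraction). It is not part of the original reply: the data frames below hold only a few rows copied from the tables in this thread, and the names m, missing, Diff and the suffixes .overall and .indiv are made up for illustration.

```r
# Toy versions of the two data frames (a few rows from the thread)
overall <- data.frame(
  RunnerCode = c("Bases Empty", "Bases Loaded", "Bases Empty"),
  Outs       = c(0L, 0L, 1L),
  MeanRuns   = c(0.5137615, 3.2173913, 0.3963801)
)
indiv <- data.frame(
  RunnerCode = c("Bases Empty", "Bases Empty"),
  Outs       = c(0L, 1L),
  MeanRuns   = c(0.4262295, 0.5238095)
)

# Left join: keep every row of 'overall'; situations the player never
# faced get NA in the individual column
m <- merge(overall, indiv, by = c("RunnerCode", "Outs"),
           all.x = TRUE, suffixes = c(".overall", ".indiv"))

# Impute: where the player has no value, copy the team-wide mean
missing <- is.na(m$MeanRuns.indiv)
m$MeanRuns.indiv[missing] <- m$MeanRuns.overall[missing]

# Subtract the second (individual) from the first (overall)
m$Diff <- m$MeanRuns.overall - m$MeanRuns.indiv
m
```

With this kind of imputation the difference for the filled-in rows is zero by construction, which may or may not be what the analysis wants.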


HTH, Bill.

W. Michels, Ph.D.


On Tue, Apr 21, 2020 at 1:47 PM Phillip Heinrich wrote:
>
> I have two small data frames of baseball data.  The first one is the mean
> number of runs that will score in each half inning for the 2018 Arizona
> Diamondbacks.  The second data frame is the same information but for only
> one player.  As you will see the individual player did not come up to bat
> any time during the season:
> with the bases loaded and no outs
> runners on first and third with one out
>
> Overall
>
>    RunnerCode         Outs  MeanRuns
> 1  Bases Empty           0  0.5137615
> 2  Runner:1st            0  0.8967391
> 3  Runner:2nd            0  1.3018868
> 4  Runners:1st & 2nd     0  1.6551724
> 5  Runner:3rd            0  1.9545455
> 6  Runners:1st & 3rd     0  2.0571429
> 7  Runners:2nd & 3rd     0  2.1578947
> 8  Bases Loaded          0  3.2173913
> 9  Bases Empty           1  0.3963801
> 10 Runner:1st            1  0.6952596
> 11 Runner:2nd            1  0.9580838
> 12 Runners:1st & 2nd     1  1.4397163
> 13 Runner:3rd            1  1.5352113
> 14 Runners:1st & 3rd     1  1.5882353
> 15 Runners:2nd & 3rd     1  1.9215686
> 16 Bases Loaded          1  1.9193548
> 17 Bases Empty           2  0.4191011
> 18 Runner:1st            2  0.5531915
> 19 Runner:2nd            2  0.8777293
> 20 Runners:1st & 2nd     2  0.9553073
> 21 Runner:3rd            2  1.2783505
> 22 Runners:1st & 3rd     2  1.5851064
> 23 Runners:2nd & 3rd     2  1.2794118
> 24 Bases Loaded          2  1.388235
>
> Individual Player
>
>    RunnerCode         Outs  MeanRuns
> 1  Bases Empty           0  0.4262295
> 2  Runner:1st            0  1.320
> 3  Runner:2nd            0  1.2857143
> 4  Runners:1st & 2nd     0  0.5714286
> 5  Runner:3rd            0  2.000
> 6  Runners:1st & 3rd     0  3.500
> 7  Runners:2nd & 3rd     0  1.000
> 8  Bases Empty           1  0.5238095
> 9  Runner:1st            1  0.6578947
> 10 Runner:2nd            1  0.375
> 11 Runners:1st & 2nd     1  1.4285714
> 12 Runner:3rd            1  1.4285714
> 13 Runners:2nd & 3rd     1  0.667
> 14 Bases Loaded          1  3.000
> 15 Bases Empty           2  0.3469388
> 16 Runner:1st            2  0.1363636
> 17 Runner:2nd            2  0.7142857
> 18 Runners:1st & 2nd     2  1.667
> 19 Runner:3rd            2  1.250
> 20 Runners:1st & 3rd     2  2.1428571
> 21 Runners:2nd & 3rd     2  1.500
> 22 Bases Loaded          2  2.200
>
> RunnerCode is a factor
> Outs are integers
> MeanRuns is numerical data
>
> I would like to subtract the second from the first as a way to evaluate the
> player's ability to produce runs. As part of this analysis I would like to
> input the mean number of runs from the overall data frame into the two
> missing cells for the individual player: Bases Loaded no outs and 1st and 3rd
> one out.
>
> Can anyone give me some advice?
>

Re: [R] NA command in a 'for' loop

2020-04-21 Thread Helen Sawaya
Thank you all. Your suggestions worked. As you said, the problem appeared to 
have been the commas that were part of the data frame.

Thanks again 🙂

From: Rui Barradas 
Sent: Tuesday, April 21, 2020 9:38 PM
To: Helen Sawaya 
Cc: Jim Lemon; Michael Dewey; r-help@R-project.org
Subject: Re: [R] NA command in a 'for' loop

Hello,

Much better, you have "," at the end of your data elements so nothing is
working.

The following 3 instructions

1. remove those commas,
2. create a logical vector trying to guess which columns are numeric
3. coerce those columns to numeric.


d[] <- lapply(d, function(x){sub(",$", "", x)})
not_num <- sapply(d, function(x) all(is.na(as.numeric(as.character(x)))))
d[!not_num] <- lapply(d[!not_num], function(x) as.numeric(as.character(x)))



Then, if you want just d$V13 == 0 to become NA, this will do it.


is.na(d[["V13"]]) <- d[["V13"]] == 0


If you want to do this to all numeric columns, try


d[!not_num] <- lapply(d[!not_num], function(x){
   is.na(x) <- x == 0
   x
})


Hope this helps,

Rui Barradas


At 18:11 on 21/04/20, Helen Sawaya wrote:
> Thank you for your patience.
>
> This is the output of dput(head(d, 10))
>
> structure(list(V1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L), .Label = "9.9761E+11,", class = "factor"), V2 = structure(c(1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "threat,", class = "factor"),
>  V3 = structure(c(1L, 28L, 37L, 48L, 55L, 63L, 73L, 88L, 2L,
>  20L), .Label = c("1,", "10,", "100,", "101,", "102,", "104,",
>  "107,", "108,", "109,", "110,", "111,", "112,", "113,", "114,",
>  "115,", "116,", "117,", "118,", "119,", "12,", "13,", "14,",
>  "15,", "16,", "17,", "18,", "19,", "2,", "20,", "21,", "22,",
>  "23,", "24,", "27,", "28,", "29,", "3,", "30,", "31,", "32,",
>  "33,", "34,", "35,", "36,", "37,", "38,", "39,", "4,", "42,",
>  "44,", "46,", "47,", "48,", "49,", "5,", "50,", "52,", "53,",
>  "54,", "55,", "57,", "59,", "6,", "60,", "61,", "62,", "63,",
>  "64,", "65,", "66,", "68,", "69,", "7,", "71,", "74,", "75,",
>  "76,", "78,", "81,", "82,", "83,", "84,", "85,", "86,", "87,",
>  "88,", "89,", "9,", "90,", "91,", "92,", "94,", "95,", "96,",
>  "97,", "98,"), class = "factor"), V4 = structure(c(1L, 2L,
>  1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L), .Label = c("1,", "2,"), class = 
> "factor"),
>  V5 = structure(c(2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L), .Label = 
> c("1,",
>  "2,"), class = "factor"), V6 = structure(c(2L, 1L, 2L, 2L,
>  1L, 2L, 2L, 1L, 2L, 2L), .Label = c("1,", "2,"), class = "factor"),
>  V7 = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L), .Label = 
> c("1,",
>  "2,"), class = "factor"), V8 = structure(c(41L, 92L, 63L,
>  36L, 2L, 81L, 12L, 14L, 23L, 33L), .Label = c("abduction,",
>  "abortion,", "abuse,", "accident,", "addicted,", "agony,",
>  "anger,", "angry,", "anguish,", "assault,", "bankrupt,",
>  "bullet,", "burial,", "cancer,", "cemetery,", "coffin,",
>  "corpse,", "crash,", "crisis,", "cruel,", "death,", "defeated,",
>  "depressed,", "deserted,", "despair,", "destroy,", "disaster,",
>  "disloyal,", "distress,", "dreadful,", "drown,", "dull,",
>  "dump,", "emaciated,", "failure,", "fatigue,", "fault,",
>  "feeble,", "fever,", "filth,", "forlorn,", "germs,", "gloomy,",
>  "hardship,", "hell,", "helpless,", "horror,", "hostage,",
>  "hostile,", "hurt,", "idiot,", "infest,", "injury,", "irritable,",
>  "jail,", "killer,", "lonely,", "malaria,", "messy,", "misery,",
>  "mistake,", "morbid,", "murder,", "mutilate,", "pain,", "panic,",
>  "poison,", "prison,", "pus,", "rape,", "rat,", "rejected,",
>  "sad,", "scum,", "shame,", "sick,", "slap,", "snake,", "spider,",
>  "suicide,", "surgery,", "terrible,", "tormented,", "trash,",
>  "trauma,", "ugly,", "ulcer,", "unease,", "unhappy,", "useless,",
>  "victim,", "wasp,", "weep,", "worm,", "wound,"), class = "factor"),
>  V9 = structure(c(24L, 90L, 73L, 10L, 92L, 33L, 84L, 96L,
>  70L, 57L), .Label = c("alley,", "ankle,", "appliance,", "audience,",
>  "bandage,", "bathroom,", "bookcase,", "border,", "branch,",
>  "cabinet,", "category,", "clean,", "cliff,", "cold,", "consider,",
>  "consoled,", "context,", "country,", "crop,", "dentist,",
>  "detail,", "dinner,", "doctor,", "dynamic,", "easygoing,",
>  "elbow,", "energetic,", "farm,", "faucet,", "flat,", "flowing,",
>  "fork,", "freezer,", "glass,", "grass,", "guess,", "humble,",
>  "icebox,", "industry,", "invisible,", "jug,", "lighting,",
>  "lion,", "listen,", "little,", "machine,", "metal,", "month,",
>  "mushroom,", "napkin,", "news,", "noisy,", "north,", "nudge,",
>  "number,", "numerous,", "obey,", "odd,", "oval,", "plant,",
>  "possible,", "pot,", "public,", "puzzled,", "quarter,", "rational,",
>  "ready,", "reflect,", "reliable,", "repentant,", "sand,",
>  "sc

Re: [R] NA command in a 'for' loop

2020-04-21 Thread Jim Lemon
Hi Helen,
From your last post, I think the best strategy is to make sure that the
operations you are performing are giving you the results you want. If so,
then we can tackle the multiple input files. As I don't have the library
you are using, I cannot access the function "get_tlbs", so please replace:

# load whatever library you are using here
with
library(<name>)
where <name> is the name of the library

Then run the following script and tell us if you get your expected output

d<-read.table(
text="2.90546E+11,threat,1,2,1,2,1,death,stove,NA,NA,205,0,394
2.90546E+11,threat,2,2,2,1,1,emaciated,shortened,NA,NA,205,0,502
2.90546E+11,threat,3,1,1,1,2,mutilate,consider,NA,NA,205,1,468
2.90546E+11,threat,6,1,2,2,1,weep,shop,NA,NA,203,1,345
2.90546E+11,threat,9,2,1,2,2,tormented,easygoing,NA,NA,205,1,373
2.90546E+11,threat,10,1,2,2,2,snake,table,NA,NA,205,1,343
2.90546E+11,threat,11,2,2,1,1,crisis,faucet,NA,NA,203,1,437
2.90546E+11,threat,12,1,1,1,1,victim,utensil,NA,NA,203,1,343
2.90546E+11,threat,14,1,2,2,1,depressed,repentant,NA,NA,203,1,441
2.90546E+11,threat,15,2,2,1,2,scum,shoe,NA,NA,205,1,475",
header=FALSE,sep=",",stringsAsFactors=FALSE)
# look at d, is it what you expect?
d
# let d2 be the rows of d where V13 is non-zero
d2<-d[d$V13!=0,]
# look at d2, is it what you expect?
d2
congruent <-(d2$V4 == 1)
# look at congruent, is it what you expect?
congruent
# load whatever library you are using here
x<-get_tlbs(d2$V14,congruent,prior_weights=NULL,method="weighted",
 fill_gaps = FALSE)
# look at x, is it what you expect?
x
write.table(x,file="test_output.txt",quote=FALSE,row.names=FALSE)
# open "test_output.txt" in a text editor. Is it what you want?

Jim

On Wed, Apr 22, 2020 at 3:11 AM Helen Sawaya wrote:

> Thank you for your patience.

You're welcome.



[R] Subtracting Data Frame With a Different Number of Rows

2020-04-21 Thread Phillip Heinrich
I have two small data frames of baseball data.  The first one is the mean 
number of runs that will score in each half inning for the 2018 Arizona 
Diamondbacks.  The second data frame is the same information but for only 
one player.  As you will see the individual player did not come up to bat 
any time during the season:

   with the bases loaded and no outs
   runners on first and third with one out

Overall

   RunnerCode         Outs  MeanRuns
1  Bases Empty           0  0.5137615
2  Runner:1st            0  0.8967391
3  Runner:2nd            0  1.3018868
4  Runners:1st & 2nd     0  1.6551724
5  Runner:3rd            0  1.9545455
6  Runners:1st & 3rd     0  2.0571429
7  Runners:2nd & 3rd     0  2.1578947
8  Bases Loaded          0  3.2173913
9  Bases Empty           1  0.3963801
10 Runner:1st            1  0.6952596
11 Runner:2nd            1  0.9580838
12 Runners:1st & 2nd     1  1.4397163
13 Runner:3rd            1  1.5352113
14 Runners:1st & 3rd     1  1.5882353
15 Runners:2nd & 3rd     1  1.9215686
16 Bases Loaded          1  1.9193548
17 Bases Empty           2  0.4191011
18 Runner:1st            2  0.5531915
19 Runner:2nd            2  0.8777293
20 Runners:1st & 2nd     2  0.9553073
21 Runner:3rd            2  1.2783505
22 Runners:1st & 3rd     2  1.5851064
23 Runners:2nd & 3rd     2  1.2794118
24 Bases Loaded          2  1.388235

Individual Player

   RunnerCode         Outs  MeanRuns
1  Bases Empty           0  0.4262295
2  Runner:1st            0  1.320
3  Runner:2nd            0  1.2857143
4  Runners:1st & 2nd     0  0.5714286
5  Runner:3rd            0  2.000
6  Runners:1st & 3rd     0  3.500
7  Runners:2nd & 3rd     0  1.000
8  Bases Empty           1  0.5238095
9  Runner:1st            1  0.6578947
10 Runner:2nd            1  0.375
11 Runners:1st & 2nd     1  1.4285714
12 Runner:3rd            1  1.4285714
13 Runners:2nd & 3rd     1  0.667
14 Bases Loaded          1  3.000
15 Bases Empty           2  0.3469388
16 Runner:1st            2  0.1363636
17 Runner:2nd            2  0.7142857
18 Runners:1st & 2nd     2  1.667
19 Runner:3rd            2  1.250
20 Runners:1st & 3rd     2  2.1428571
21 Runners:2nd & 3rd     2  1.500
22 Bases Loaded          2  2.200

RunnerCode is a factor
Outs are integers
MeanRuns is numerical data

I would like to subtract the second from the first as a way to evaluate the
player's ability to produce runs. As part of this analysis I would like to
input the mean number of runs from the overall data frame into the two
missing cells for the individual player: Bases Loaded no outs and 1st and 3rd
one out.


Can anyone give me some advice?



Re: [R] NA command in a 'for' loop

2020-04-21 Thread Rui Barradas

Hello,

Much better, you have "," at the end of your data elements so nothing is 
working.


The following 3 instructions

1. remove those commas,
2. create a logical vector trying to guess which columns are numeric
3. coerce those columns to numeric.


d[] <- lapply(d, function(x){sub(",$", "", x)})
not_num <- sapply(d, function(x) all(is.na(as.numeric(as.character(x)))))
d[!not_num] <- lapply(d[!not_num], function(x) as.numeric(as.character(x)))



Then, if you want just d$V13 == 0 to become NA, this will do it.


is.na(d[["V13"]]) <- d[["V13"]] == 0


If you want to do this to all numeric columns, try


d[!not_num] <- lapply(d[!not_num], function(x){
  is.na(x) <- x == 0
  x
})
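
The steps above can be tried on a small stand-in for d. This is only a sketch: the three-column data frame below is hypothetical (it merely imitates the trailing commas seen in the dput output), and suppressWarnings() is an addition to silence the expected "NAs introduced by coercion" messages.

```r
# Hypothetical stand-in for 'd': every field carries a trailing comma
d <- data.frame(V1  = c("threat,", "threat,", "threat,"),
                V13 = c("1,", "0,", "1,"),
                V14 = c("394,", "502,", "468,"),
                stringsAsFactors = TRUE)

# 1. remove the trailing commas (sub() works on the factor labels)
d[] <- lapply(d, function(x) sub(",$", "", x))

# 2. flag the columns that cannot be read as numbers
not_num <- suppressWarnings(
  sapply(d, function(x) all(is.na(as.numeric(as.character(x)))))
)

# 3. coerce the remaining columns to numeric
d[!not_num] <- lapply(d[!not_num], function(x) as.numeric(as.character(x)))

# zeros in V13 become NA, everything else is left alone
is.na(d[["V13"]]) <- d[["V13"]] == 0
str(d)
```

Step 2 is a guess, as stated above: a column that mixes numbers and text is treated as numeric and its text entries turn into NA.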


Hope this helps,

Rui Barradas


At 18:11 on 21/04/20, Helen Sawaya wrote:

Thank you for your patience.

This is the output of dput(head(d, 10))

structure(list(V1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L), .Label = "9.9761E+11,", class = "factor"), V2 = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "threat,", class = "factor"),
 V3 = structure(c(1L, 28L, 37L, 48L, 55L, 63L, 73L, 88L, 2L,
 20L), .Label = c("1,", "10,", "100,", "101,", "102,", "104,",
 "107,", "108,", "109,", "110,", "111,", "112,", "113,", "114,",
 "115,", "116,", "117,", "118,", "119,", "12,", "13,", "14,",
 "15,", "16,", "17,", "18,", "19,", "2,", "20,", "21,", "22,",
 "23,", "24,", "27,", "28,", "29,", "3,", "30,", "31,", "32,",
 "33,", "34,", "35,", "36,", "37,", "38,", "39,", "4,", "42,",
 "44,", "46,", "47,", "48,", "49,", "5,", "50,", "52,", "53,",
 "54,", "55,", "57,", "59,", "6,", "60,", "61,", "62,", "63,",
 "64,", "65,", "66,", "68,", "69,", "7,", "71,", "74,", "75,",
 "76,", "78,", "81,", "82,", "83,", "84,", "85,", "86,", "87,",
 "88,", "89,", "9,", "90,", "91,", "92,", "94,", "95,", "96,",
 "97,", "98,"), class = "factor"), V4 = structure(c(1L, 2L,
 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L), .Label = c("1,", "2,"), class = "factor"),
 V5 = structure(c(2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L), .Label = c("1,",
 "2,"), class = "factor"), V6 = structure(c(2L, 1L, 2L, 2L,
 1L, 2L, 2L, 1L, 2L, 2L), .Label = c("1,", "2,"), class = "factor"),
 V7 = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L), .Label = c("1,",
 "2,"), class = "factor"), V8 = structure(c(41L, 92L, 63L,
 36L, 2L, 81L, 12L, 14L, 23L, 33L), .Label = c("abduction,",
 "abortion,", "abuse,", "accident,", "addicted,", "agony,",
 "anger,", "angry,", "anguish,", "assault,", "bankrupt,",
 "bullet,", "burial,", "cancer,", "cemetery,", "coffin,",
 "corpse,", "crash,", "crisis,", "cruel,", "death,", "defeated,",
 "depressed,", "deserted,", "despair,", "destroy,", "disaster,",
 "disloyal,", "distress,", "dreadful,", "drown,", "dull,",
 "dump,", "emaciated,", "failure,", "fatigue,", "fault,",
 "feeble,", "fever,", "filth,", "forlorn,", "germs,", "gloomy,",
 "hardship,", "hell,", "helpless,", "horror,", "hostage,",
 "hostile,", "hurt,", "idiot,", "infest,", "injury,", "irritable,",
 "jail,", "killer,", "lonely,", "malaria,", "messy,", "misery,",
 "mistake,", "morbid,", "murder,", "mutilate,", "pain,", "panic,",
 "poison,", "prison,", "pus,", "rape,", "rat,", "rejected,",
 "sad,", "scum,", "shame,", "sick,", "slap,", "snake,", "spider,",
 "suicide,", "surgery,", "terrible,", "tormented,", "trash,",
 "trauma,", "ugly,", "ulcer,", "unease,", "unhappy,", "useless,",
 "victim,", "wasp,", "weep,", "worm,", "wound,"), class = "factor"),
 V9 = structure(c(24L, 90L, 73L, 10L, 92L, 33L, 84L, 96L,
 70L, 57L), .Label = c("alley,", "ankle,", "appliance,", "audience,",
 "bandage,", "bathroom,", "bookcase,", "border,", "branch,",
 "cabinet,", "category,", "clean,", "cliff,", "cold,", "consider,",
 "consoled,", "context,", "country,", "crop,", "dentist,",
 "detail,", "dinner,", "doctor,", "dynamic,", "easygoing,",
 "elbow,", "energetic,", "farm,", "faucet,", "flat,", "flowing,",
 "fork,", "freezer,", "glass,", "grass,", "guess,", "humble,",
 "icebox,", "industry,", "invisible,", "jug,", "lighting,",
 "lion,", "listen,", "little,", "machine,", "metal,", "month,",
 "mushroom,", "napkin,", "news,", "noisy,", "north,", "nudge,",
 "number,", "numerous,", "obey,", "odd,", "oval,", "plant,",
 "possible,", "pot,", "public,", "puzzled,", "quarter,", "rational,",
 "ready,", "reflect,", "reliable,", "repentant,", "sand,",
 "school,", "secret,", "series,", "shark,", "shoe,", "shop,",
 "shortened,", "skyline,", "stable,", "storm,", "stove,",
 "table,", "theory,", "tower,", "truck,", "upgrade,", "upright,",
 "utensil,", "vest,", "vision,", "volcano,", "walk,", "watchful,",
 "window,", "winter,"), class = "factor"), V10 = structure(c(1L,
 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "NA,", class = "factor"),
 V11 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "

Re: [R] NA command in a 'for' loop

2020-04-21 Thread William Dunlap via R-help
Read the files with read.csv(filename) or read.table(sep=",", filename) so
the commas don't become part of the R data.frame.
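
A small illustration of the difference, as a sketch only. The two-line sample text is hypothetical (comma plus space between fields, no header row, as the dput output suggests); header = FALSE and strip.white = TRUE are assumptions about the real files, not something stated in the thread.

```r
# Hypothetical sample in the same shape as the files discussed here
txt <- "2.90546E+11, threat, 1, 0, 394
2.90546E+11, threat, 2, 1, 502"

# Default separator is whitespace, so the commas stay glued to the values
bad <- read.table(text = txt, header = FALSE)

# Comma as separator: the commas are consumed while reading
good <- read.csv(text = txt, header = FALSE, strip.white = TRUE)

str(bad)
str(good)
```

Note that read.csv() assumes a header row unless told otherwise, so header = FALSE matters for files like these.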

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Tue, Apr 21, 2020 at 10:17 AM Helen Sawaya wrote:

> Thank you for your patience.
>
> This is the output of dput(head(d, 10))
>
> structure(list(V1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L), .Label = "9.9761E+11,", class = "factor"), V2 = structure(c(1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "threat,", class =
> "factor"),
> V3 = structure(c(1L, 28L, 37L, 48L, 55L, 63L, 73L, 88L, 2L,
> 20L), .Label = c("1,", "10,", "100,", "101,", "102,", "104,",
> "107,", "108,", "109,", "110,", "111,", "112,", "113,", "114,",
> "115,", "116,", "117,", "118,", "119,", "12,", "13,", "14,",
> "15,", "16,", "17,", "18,", "19,", "2,", "20,", "21,", "22,",
> "23,", "24,", "27,", "28,", "29,", "3,", "30,", "31,", "32,",
> "33,", "34,", "35,", "36,", "37,", "38,", "39,", "4,", "42,",
> "44,", "46,", "47,", "48,", "49,", "5,", "50,", "52,", "53,",
> "54,", "55,", "57,", "59,", "6,", "60,", "61,", "62,", "63,",
> "64,", "65,", "66,", "68,", "69,", "7,", "71,", "74,", "75,",
> "76,", "78,", "81,", "82,", "83,", "84,", "85,", "86,", "87,",
> "88,", "89,", "9,", "90,", "91,", "92,", "94,", "95,", "96,",
> "97,", "98,"), class = "factor"), V4 = structure(c(1L, 2L,
> 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L), .Label = c("1,", "2,"), class =
> "factor"),
> V5 = structure(c(2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L), .Label =
> c("1,",
> "2,"), class = "factor"), V6 = structure(c(2L, 1L, 2L, 2L,
> 1L, 2L, 2L, 1L, 2L, 2L), .Label = c("1,", "2,"), class = "factor"),
> V7 = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L), .Label =
> c("1,",
> "2,"), class = "factor"), V8 = structure(c(41L, 92L, 63L,
> 36L, 2L, 81L, 12L, 14L, 23L, 33L), .Label = c("abduction,",
> "abortion,", "abuse,", "accident,", "addicted,", "agony,",
> "anger,", "angry,", "anguish,", "assault,", "bankrupt,",
> "bullet,", "burial,", "cancer,", "cemetery,", "coffin,",
> "corpse,", "crash,", "crisis,", "cruel,", "death,", "defeated,",
> "depressed,", "deserted,", "despair,", "destroy,", "disaster,",
> "disloyal,", "distress,", "dreadful,", "drown,", "dull,",
> "dump,", "emaciated,", "failure,", "fatigue,", "fault,",
> "feeble,", "fever,", "filth,", "forlorn,", "germs,", "gloomy,",
> "hardship,", "hell,", "helpless,", "horror,", "hostage,",
> "hostile,", "hurt,", "idiot,", "infest,", "injury,", "irritable,",
> "jail,", "killer,", "lonely,", "malaria,", "messy,", "misery,",
> "mistake,", "morbid,", "murder,", "mutilate,", "pain,", "panic,",
> "poison,", "prison,", "pus,", "rape,", "rat,", "rejected,",
> "sad,", "scum,", "shame,", "sick,", "slap,", "snake,", "spider,",
> "suicide,", "surgery,", "terrible,", "tormented,", "trash,",
> "trauma,", "ugly,", "ulcer,", "unease,", "unhappy,", "useless,",
> "victim,", "wasp,", "weep,", "worm,", "wound,"), class = "factor"),
> V9 = structure(c(24L, 90L, 73L, 10L, 92L, 33L, 84L, 96L,
> 70L, 57L), .Label = c("alley,", "ankle,", "appliance,", "audience,",
> "bandage,", "bathroom,", "bookcase,", "border,", "branch,",
> "cabinet,", "category,", "clean,", "cliff,", "cold,", "consider,",
> "consoled,", "context,", "country,", "crop,", "dentist,",
> "detail,", "dinner,", "doctor,", "dynamic,", "easygoing,",
> "elbow,", "energetic,", "farm,", "faucet,", "flat,", "flowing,",
> "fork,", "freezer,", "glass,", "grass,", "guess,", "humble,",
> "icebox,", "industry,", "invisible,", "jug,", "lighting,",
> "lion,", "listen,", "little,", "machine,", "metal,", "month,",
> "mushroom,", "napkin,", "news,", "noisy,", "north,", "nudge,",
> "number,", "numerous,", "obey,", "odd,", "oval,", "plant,",
> "possible,", "pot,", "public,", "puzzled,", "quarter,", "rational,",
> "ready,", "reflect,", "reliable,", "repentant,", "sand,",
> "school,", "secret,", "series,", "shark,", "shoe,", "shop,",
> "shortened,", "skyline,", "stable,", "storm,", "stove,",
> "table,", "theory,", "tower,", "truck,", "upgrade,", "upright,",
> "utensil,", "vest,", "vision,", "volcano,", "walk,", "watchful,",
> "window,", "winter,"), class = "factor"), V10 = structure(c(1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "NA,", class =
> "factor"),
> V11 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label =
> "NA,", class = "factor"),
> V12 = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L), .Label =
> c("203,",
> "205,"), class = "factor"), V13 = structure(c(1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1,", class = "factor"),
> V14 = c(4063L, 4914L, 1508L, 1819L, 1228L, 992L, 1898L, 1174L,
> 1294L, 1417L)), row.names = c(NA, 10L), class = "data.frame")
>
> When I use the following:
>
> all.files <- list.files(".")
> txt.files <- gre

Re: [R] Can I use R for comertial projects for free?

2020-04-21 Thread Jeff Newmiller
The "contamination" of other code by GPL is not absolute... it _is_ possible to 
use GPL code without releasing your code similarly, and blithely suggesting 
otherwise perpetuates myths about GPL.

That said, it is very tricky to do so while presenting a clean user experience, 
and doing so is likely to raise objections on the part of consumers of your 
commercial product unless you do it right. The appropriate response here is to 
recommend consultation with a lawyer familiar with these issues.

Informally, I would recommend against doing this from a user experience 
perspective... but not because of contamination.

On April 21, 2020 9:12:51 AM PDT, Duncan Murdoch wrote:
>On 21/04/2020 11:11 a.m., dmitry sergey wrote:
>> Hi,
>> 
>> I would kindly like to know: can I use R for commercial projects for
>> free? I am going to compute statistics in my commercial project with R
>> and wanted to know whether that is legal or not.
>
>If you are distributing R as part of your project, then you will need to
>license your project in a compatible way, e.g. GPL, and distribute its
>full source.  Nothing stopping you from doing that commercially.
>
>Duncan Murdoch
>
>> 
>> Thanks in advance.
>> 
>> Best Regards,
>> Dmitry
>> 

-- 
Sent from my phone. Please excuse my brevity.



Re: [R] NA command in a 'for' loop

2020-04-21 Thread Helen Sawaya
Thank you for your patience.

This is the output of dput(head(d, 10))

structure(list(V1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = "9.9761E+11,", class = "factor"), V2 = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "threat,", class = "factor"), 
V3 = structure(c(1L, 28L, 37L, 48L, 55L, 63L, 73L, 88L, 2L, 
20L), .Label = c("1,", "10,", "100,", "101,", "102,", "104,", 
"107,", "108,", "109,", "110,", "111,", "112,", "113,", "114,", 
"115,", "116,", "117,", "118,", "119,", "12,", "13,", "14,", 
"15,", "16,", "17,", "18,", "19,", "2,", "20,", "21,", "22,", 
"23,", "24,", "27,", "28,", "29,", "3,", "30,", "31,", "32,", 
"33,", "34,", "35,", "36,", "37,", "38,", "39,", "4,", "42,", 
"44,", "46,", "47,", "48,", "49,", "5,", "50,", "52,", "53,", 
"54,", "55,", "57,", "59,", "6,", "60,", "61,", "62,", "63,", 
"64,", "65,", "66,", "68,", "69,", "7,", "71,", "74,", "75,", 
"76,", "78,", "81,", "82,", "83,", "84,", "85,", "86,", "87,", 
"88,", "89,", "9,", "90,", "91,", "92,", "94,", "95,", "96,", 
"97,", "98,"), class = "factor"), V4 = structure(c(1L, 2L, 
1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L), .Label = c("1,", "2,"), class = "factor"), 
V5 = structure(c(2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L), .Label = c("1,", 
"2,"), class = "factor"), V6 = structure(c(2L, 1L, 2L, 2L, 
1L, 2L, 2L, 1L, 2L, 2L), .Label = c("1,", "2,"), class = "factor"), 
V7 = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L), .Label = c("1,", 
"2,"), class = "factor"), V8 = structure(c(41L, 92L, 63L, 
36L, 2L, 81L, 12L, 14L, 23L, 33L), .Label = c("abduction,", 
"abortion,", "abuse,", "accident,", "addicted,", "agony,", 
"anger,", "angry,", "anguish,", "assault,", "bankrupt,", 
"bullet,", "burial,", "cancer,", "cemetery,", "coffin,", 
"corpse,", "crash,", "crisis,", "cruel,", "death,", "defeated,", 
"depressed,", "deserted,", "despair,", "destroy,", "disaster,", 
"disloyal,", "distress,", "dreadful,", "drown,", "dull,", 
"dump,", "emaciated,", "failure,", "fatigue,", "fault,", 
"feeble,", "fever,", "filth,", "forlorn,", "germs,", "gloomy,", 
"hardship,", "hell,", "helpless,", "horror,", "hostage,", 
"hostile,", "hurt,", "idiot,", "infest,", "injury,", "irritable,", 
"jail,", "killer,", "lonely,", "malaria,", "messy,", "misery,", 
"mistake,", "morbid,", "murder,", "mutilate,", "pain,", "panic,", 
"poison,", "prison,", "pus,", "rape,", "rat,", "rejected,", 
"sad,", "scum,", "shame,", "sick,", "slap,", "snake,", "spider,", 
"suicide,", "surgery,", "terrible,", "tormented,", "trash,", 
"trauma,", "ugly,", "ulcer,", "unease,", "unhappy,", "useless,", 
"victim,", "wasp,", "weep,", "worm,", "wound,"), class = "factor"), 
V9 = structure(c(24L, 90L, 73L, 10L, 92L, 33L, 84L, 96L, 
70L, 57L), .Label = c("alley,", "ankle,", "appliance,", "audience,", 
"bandage,", "bathroom,", "bookcase,", "border,", "branch,", 
"cabinet,", "category,", "clean,", "cliff,", "cold,", "consider,", 
"consoled,", "context,", "country,", "crop,", "dentist,", 
"detail,", "dinner,", "doctor,", "dynamic,", "easygoing,", 
"elbow,", "energetic,", "farm,", "faucet,", "flat,", "flowing,", 
"fork,", "freezer,", "glass,", "grass,", "guess,", "humble,", 
"icebox,", "industry,", "invisible,", "jug,", "lighting,", 
"lion,", "listen,", "little,", "machine,", "metal,", "month,", 
"mushroom,", "napkin,", "news,", "noisy,", "north,", "nudge,", 
"number,", "numerous,", "obey,", "odd,", "oval,", "plant,", 
"possible,", "pot,", "public,", "puzzled,", "quarter,", "rational,", 
"ready,", "reflect,", "reliable,", "repentant,", "sand,", 
"school,", "secret,", "series,", "shark,", "shoe,", "shop,", 
"shortened,", "skyline,", "stable,", "storm,", "stove,", 
"table,", "theory,", "tower,", "truck,", "upgrade,", "upright,", 
"utensil,", "vest,", "vision,", "volcano,", "walk,", "watchful,", 
"window,", "winter,"), class = "factor"), V10 = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "NA,", class = "factor"), 
V11 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "NA,", 
class = "factor"), 
V12 = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L), .Label = 
c("203,", 
"205,"), class = "factor"), V13 = structure(c(1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1,", class = "factor"), 
V14 = c(4063L, 4914L, 1508L, 1819L, 1228L, 992L, 1898L, 1174L, 
1294L, 1417L)), row.names = c(NA, 10L), class = "data.frame")

When I use the following:

all.files <- list.files(".")
txt.files <- grep("threat.txt", all.files, value=TRUE)

for(i in txt.files) {
  d <- read.table(i, header=FALSE)
  d[] <- lapply(d, function(x) {is.na(x) <- x == 0; x})
  write.table(d, paste0(i, "trial.txt"), quote=FALSE, row.names=FALSE)
}

I get this (an example of one of the output files with zeros in V13):

V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V
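
The factor levels in the dput() output above end in commas ("203,", "1,", "NA,"). That suggests, as an assumption since the input files are not shown, that the files are comma-separated but were split on whitespace, in which case a value reads as "0," and never equals 0. A minimal sketch of that effect, using a made-up two-line file:

```r
# Hypothetical stand-in for one of the "*threat.txt" files
tmp <- tempfile(fileext = ".txt")
writeLines(c("sad, alley, 203, 0, 4063",
             "scum, ankle, 205, 1, 0"), tmp)

# Default whitespace splitting keeps the commas: "0," is not equal to 0
d_ws <- read.table(tmp, header = FALSE, stringsAsFactors = FALSE)

# Splitting on commas gives clean columns, so the zero test can match
d <- read.table(tmp, header = FALSE, sep = ",", strip.white = TRUE,
                stringsAsFactors = FALSE)
d[] <- lapply(d, function(x) { is.na(x) <- x == 0; x })
print(d)
```

If the real files are whitespace-separated after all, this does not apply and the cause lies elsewhere.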

Re: [R] ggplot2 error bars and convex hulls

2020-04-21 Thread Rui Barradas

Hello,

As for convex hulls, there is an example of how to construct a stat_hull in

vignette("extending-ggplot2", package = "ggplot2")

There is also a geom_hull in a GitHub package:

devtools::install_github("cmartin/ggConvexHull")


Hope this helps,

Rui Barradas

At 17:02 on 21/04/20, Ivan Calandra wrote:

Dear useRs,

I would like to have horizontal and vertical error bars extending from
the means on two continuous variables.

This would be the "manual" way of doing it, computing the mean and sd
(or whatever stats) beforehand and then calling geom_errorbar() and
geom_errorbarh() with appropriate coordinates:
https://stackoverflow.com/questions/12570816/ggplot-scatter-plot-of-two-groups-with-superimposed-means-with-x-and-y-error-bar

But I am a bit surprised that there is no "built-in" way of doing it
with ggplot2. I mean not having to compute mean and sd beforehand and
not having to call both geom_errorbar() and geom_errorbarh() with a new
set of aesthetics.

In the same idea, I am looking at convex hulls and I was also expecting
to have a built-in way to do this in ggplot2. But I have only found this
"manual" way:
https://stats.stackexchange.com/questions/22805/how-to-draw-neat-polygons-around-scatterplot-regions-in-ggplot2

Thank you in advance for any pointer.
Ivan
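
For reference, the "manual" route described in this thread can be sketched as follows. This is only an illustration on made-up data, not a built-in ggplot2 feature: the statistics and hulls are computed in base R (aggregate() and chull()), and the column names mx, my, sx, sy are invented for the example. The plotting step runs only if ggplot2 is installed.

```r
# Toy data (made up for illustration): two groups, two continuous variables
set.seed(1)
d <- data.frame(
  g = rep(c("a", "b"), each = 20),
  x = rnorm(40, mean = rep(c(0, 3), each = 20)),
  y = rnorm(40, mean = rep(c(0, 2), each = 20))
)

# Per-group mean and sd of x and y
m <- aggregate(cbind(x, y) ~ g, data = d, FUN = mean)
s <- aggregate(cbind(x, y) ~ g, data = d, FUN = sd)
stats <- data.frame(g = m$g, mx = m$x, my = m$y, sx = s$x, sy = s$y)

# Per-group convex hull; chull() returns the indices of the hull vertices
hulls <- do.call(rbind,
                 lapply(split(d, d$g), function(p) p[chull(p$x, p$y), ]))

# Build the plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(d, aes(x, y, colour = g)) +
    geom_polygon(data = hulls, aes(fill = g), alpha = 0.2) +
    geom_point(alpha = 0.5) +
    geom_point(data = stats, aes(mx, my), size = 3) +
    # vertical bars: mean of y +/- sd of y, placed at the mean of x
    geom_errorbar(data = stats, inherit.aes = FALSE,
                  aes(x = mx, ymin = my - sy, ymax = my + sy, colour = g)) +
    # horizontal bars: mean of x +/- sd of x, placed at the mean of y
    geom_errorbarh(data = stats, inherit.aes = FALSE,
                   aes(y = my, xmin = mx - sx, xmax = mx + sx, colour = g))
}
```

Printing p draws the figure. Swapping sd for another statistic only changes the FUN argument of the second aggregate() call.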



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can I use R for comertial projects for free?

2020-04-21 Thread Jeff Newmiller
If you comply with the relevant licenses... sure... open source can be 
compatible with commercial activity. But your description of your use case is 
way too deficient for anyone to even comment on. Since this is not a legal 
advice forum, go ask your question of a lawyer familiar with open source 
software licenses.

On April 21, 2020 8:11:32 AM PDT, dmitry sergey  wrote:
>Hi,
>
>I would like to know whether I can use R for commercial projects for free.
>I am going to compute statistics in my commercial project with R and
>wanted to know whether that is legal.
>
>Thanks in advance.
>
>Best Regards,
>Dmitry
>
>   [[alternative HTML version deleted]]
>

-- 
Sent from my phone. Please excuse my brevity.



Re: [R] Can I use R for comertial projects for free?

2020-04-21 Thread Duncan Murdoch

On 21/04/2020 11:11 a.m., dmitry sergey wrote:

Hi,

I would like to know whether I can use R for commercial projects for free.
I am going to compute statistics in my commercial project with R and
wanted to know whether that is legal.


If you are distributing R as part of your project, then you will need to 
license your project in a compatible way, e.g. GPL, and distribute its 
full source.  Nothing stopping you from doing that commercially.


Duncan Murdoch



Thanks in advance.

Best Regards,
Dmitry






[R] ggplot2 error bars and convex hulls

2020-04-21 Thread Ivan Calandra
Dear useRs,

I would like to have horizontal and vertical error bars extending from
the means on two continuous variables.

This would be the "manual" way of doing it, computing the mean and sd
(or whatever stats) beforehand and then calling geom_errorbar() and
geom_errorbarh() with appropriate coordinates:
https://stackoverflow.com/questions/12570816/ggplot-scatter-plot-of-two-groups-with-superimposed-means-with-x-and-y-error-bar

But I am a bit surprised that there is no "built-in" way of doing it
with ggplot2. I mean not having to compute mean and sd beforehand and
not having to call both geom_errorbar() and geom_errorbarh() with a new
set of aesthetics.

In the same idea, I am looking at convex hulls and I was also expecting
to have a built-in way to do this in ggplot2. But I have only found this
"manual" way:
https://stats.stackexchange.com/questions/22805/how-to-draw-neat-polygons-around-scatterplot-regions-in-ggplot2

Thank you in advance for any pointer.
Ivan

-- 

Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra



[R] Can I use R for comertial projects for free?

2020-04-21 Thread dmitry sergey
Hi,

I would like to know whether I can use R for commercial projects for free.
I am going to compute statistics in my commercial project with R and
wanted to know whether that is legal.

Thanks in advance.

Best Regards,
Dmitry



Re: [R] how to merge two files while preserving the number of rows of one file in merged one?

2020-04-21 Thread Jeff Newmiller
Read about the all.x and all.y arguments to ?merge.

On April 21, 2020 7:53:33 AM PDT, Ana Marija  
wrote:
>Hello,
>
>> head(a)
>   ID_1 pheno
>1 0 B
>2 fam1000_G1000 0
>3 fam1001_G1001 0
>4 fam1003_G1003 1
>5 fam1005_G1005 0
>6 fam1009_G1009 0
>> head(b)
>   ID_1  ID_2 missing
>1 0 0   0
>2 fam1000_G1000 fam1000_G1000   0
>3 fam1001_G1001 fam1001_G1001   0
>4 fam1003_G1003 fam1003_G1003   0
>5 fam1005_G1005 fam1005_G1005   0
>6 fam1009_G1009 fam1009_G1009   0
>> dim(b)
>[1] 1602    3
>> dim(a)
>[1] 1652    2
>> m=merge(a,b,by="ID_1")
>> dim(m)
>[1] 1499    4
>> head(m)
>          ID_1 pheno         ID_2 missing
>1            0     B            0       0
>2 fam0110_G110     1 fam0110_G110       0
>3 fam0117_G117     1 fam0117_G117       0
>4 fam0124_G124       fam0124_G124       0
>
>I would like my merged file (m) to have the same number of rows as
>(b), that is 1602. Can you please let me know how I would do that?
>
>Thanks
>Ana
>

-- 
Sent from my phone. Please excuse my brevity.



Re: [R] how to merge two files while preserving the number of rows of one file in merged one?

2020-04-21 Thread Ana Marija
This solved it:
 m <- merge(a, b, by = "ID_1", all.y = TRUE)
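
To illustrate why all.y changes the row count, here is a minimal sketch on made-up data frames (the IDs and values are invented; only the column names follow the thread). By default merge() keeps only IDs present in both inputs; all.y = TRUE keeps every row of the second input and fills the missing columns with NA.

```r
# Toy stand-ins for a and b (values are made up)
a <- data.frame(ID_1 = c("id1", "id2", "id3"), pheno = c(0, 1, 0))
b <- data.frame(ID_1 = c("id2", "id3", "id4", "id5"), missing = 0)

inner <- merge(a, b, by = "ID_1")                # only IDs found in both
right <- merge(a, b, by = "ID_1", all.y = TRUE)  # every row of b is kept

print(nrow(inner))
print(nrow(right))
print(right)
```

The row count matches nrow(b) only when ID_1 is unique in a; duplicated IDs in a would produce more rows than b has.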


On Tue, Apr 21, 2020 at 9:53 AM Ana Marija  wrote:
>
> Hello,
>
> > head(a)
>ID_1 pheno
> 1 0 B
> 2 fam1000_G1000 0
> 3 fam1001_G1001 0
> 4 fam1003_G1003 1
> 5 fam1005_G1005 0
> 6 fam1009_G1009 0
> > head(b)
>ID_1  ID_2 missing
> 1 0 0   0
> 2 fam1000_G1000 fam1000_G1000   0
> 3 fam1001_G1001 fam1001_G1001   0
> 4 fam1003_G1003 fam1003_G1003   0
> 5 fam1005_G1005 fam1005_G1005   0
> 6 fam1009_G1009 fam1009_G1009   0
> > dim(b)
> [1] 1602    3
> > dim(a)
> [1] 1652    2
> > m=merge(a,b,by="ID_1")
> > dim(m)
> [1] 1499    4
> > head(m)
>           ID_1 pheno         ID_2 missing
> 1            0     B            0       0
> 2 fam0110_G110     1 fam0110_G110       0
> 3 fam0117_G117     1 fam0117_G117       0
> 4 fam0124_G124       fam0124_G124       0
>
> I would like my merged file (m) to have the same number of rows as
> (b), that is 1602. Can you please let me know how I would do that?
>
> Thanks
> Ana



[R] how to merge two files while preserving the number of rows of one file in merged one?

2020-04-21 Thread Ana Marija
Hello,

> head(a)
           ID_1 pheno
1             0     B
2 fam1000_G1000     0
3 fam1001_G1001     0
4 fam1003_G1003     1
5 fam1005_G1005     0
6 fam1009_G1009     0
> head(b)
           ID_1          ID_2 missing
1             0             0       0
2 fam1000_G1000 fam1000_G1000       0
3 fam1001_G1001 fam1001_G1001       0
4 fam1003_G1003 fam1003_G1003       0
5 fam1005_G1005 fam1005_G1005       0
6 fam1009_G1009 fam1009_G1009       0
> dim(b)
[1] 1602    3
> dim(a)
[1] 1652    2
> m=merge(a,b,by="ID_1")
> dim(m)
[1] 1499    4
> head(m)
          ID_1 pheno         ID_2 missing
1            0     B            0       0
2 fam0110_G110     1 fam0110_G110       0
3 fam0117_G117     1 fam0117_G117       0
4 fam0124_G124       fam0124_G124       0

I would like my merged file (m) to have the same number of rows as
(b), that is 1602. Can you please let me know how I would do that?

Thanks
Ana



Re: [R] Survival analysis

2020-04-21 Thread Göran Broström

Dear dr Medic,

On 2020-04-17 at 23:03, Medic wrote:

On 2020-04-17 20:06, Medic wrote:

I can't understand how to do a survival analysis (?Surv ()) when some
event occurred before the start of observation (left censored). If I
understand correctly, there are two methods. I chose a method with: 1)
time from the start of treatment to the event and 2) the indicator of
the event. I did (in my data) the event indicator so:
1 - event, 2 - event before the start of observation, 0 - no event


I have no experience of left censoring beyond the text book.  Is your
left censored data the SAME event or a different event?

YES, THE SAME!


---
library(survival)
left_censor_data <- read.table("left.csv", header = TRUE, sep = ";")
#sep = ";" it's right!
dput(left_censor_data, file="left_censor_data") #file attached
left_censor_data
'data.frame':   11 obs. of  2 variables:
$ timee : int  5 151 33 37 75 14 7 9 1 45 ...
$ eventt: int  2 0 0 0 0 0 0 2 0 1 ...
# 1 = event, 2 = event before the start of observation, 0 = no event


So if I read this data correctly the first observation is left censored.
What does the time "5" refer to?  Is that 5 days BEFORE observation the
event happened?

YES, EXACTLY!

My text book understanding of left censored data was that your
censored points would
have time 0.

I TRIED TO SET TIME 0 NOW (for censored points), AND RECEIVED THE SAME
WARNING (AND THE CURVE TURNED OUT WRONG)


sur <- Surv(time = left_censor_data$timee,  event =
left_censor_data$eventt, type = "left")
   WARNING message:
   In Surv(time = left_censor_data$timee, event =
left_censor_data$eventt,  :
   Invalid status value, converted to NA

#Why such a WARNING message?


Because the choice "type = 'left'" is incompatible with an event 
variable taking three values, see the help page for survival::Surv. My 
guess from your (vague) description is that you want "type = 
'counting'", as indicated below.



#Then everything turns out wrong


Is the censoring type you want LEFT TRUNCATION rather than LEFT?
If they are also right censored I think R Surv calls these Counting.

I SAY ABOUT LEFT CENSORING (NOT ABOUT LEFT TRUNCATION)!
(COUNTING? I DO NOT UNDERSTAND THIS.)


Then you should read an elementary text on survival analysis or consult 
a local statistician.


G,



THANKS! I HOPE SOMEONE EXPLAIN TO ME
1) HOW TO COMPILE THE DATA and
2) WRITE A CODE
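
As a hedged sketch only: ?Surv documents that the status indicator for type = "left" has two values (0/1, TRUE/FALSE or 1/2), so a three-valued 0/1/2 indicator triggers the "Invalid status value" warning. One documented coding that can hold left-censored, right-censored and exactly observed times in a single object is type = "interval2", where NA in the lower bound marks left censoring and NA in the upper bound marks right censoring. Whether this matches the study design discussed here is an assumption; in particular it treats the recorded time of a "2" as the time by which the event is known to have already happened. The five data rows below are taken from the str() output above.

```r
# Subset of the posted data: 1 = event, 2 = event before start, 0 = no event
timee  <- c(5, 151, 33, 9, 45)
eventt <- c(2, 0, 0, 2, 1)

# interval2 coding: (NA, t] = left-censored, [t, NA) = right-censored,
# [t, t] = event observed exactly at t
lower <- ifelse(eventt == 2, NA, timee)
upper <- ifelse(eventt == 0, NA, timee)

# survival is a recommended package, but guard the call anyway
if (requireNamespace("survival", quietly = TRUE)) {
  sur <- survival::Surv(time = lower, time2 = upper, type = "interval2")
  print(sur)
}
```

An object built this way can be used as the response in functions that accept interval-censored data, such as survreg().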






Re: [R] Web-scraping newbie - dynamic table into R?

2020-04-21 Thread Ivan Krylov
On Sun, 19 Apr 2020 at 22:34, Julio Farach  wrote:

> But, I'm seeking the last 10 draws shown on the "Winning Numbers," or
> 4th tab.

The "Network" tab in browser developer tools (usually accessible by
pressing F12) demonstrates that the "Winning Numbers" are fetched in
JSON format by means of an XHR from
.

The server checks the User-Agent: header and returns a 403 error to
clients that don't look like browsers, which probably means that the
website ToS forbids programmatic access.

-- 
Best regards,
Ivan



Re: [R] Web-scraping newbie - dynamic table into R?

2020-04-21 Thread John Kane
Hi Julio,

I am just working on my first cup of tea of the morning so I am not
functioning all that well but I finally noticed that we have dropped the
R-help list.  I have put it back as a recipient as there are a lot of
people that know about 99%+ more than I do about the topic.

I'll keep poking around and see what I can find.

On Sun, 19 Apr 2020 at 22:34, Julio Farach  wrote:

> John,
>
> I again thank you for the reply and continued support.  After a few hours,
> I arrived at the point you describe below; namely extracting elements, but
> from a different tab than the Last 10 Draws, or Winning Numbers tab.
>
> On the website, there are 5 tabs.  The elements you describe below are
> from the 3rd tab, "Odds & Prizes."  Instead of results, that tab describes
> the general odds of the Keno game.  But, I'm seeking the last 10 draws
> shown on the "Winning Numbers," or 4th tab.  I've played around with a CSS
> Selector tool, but I'm unable to extract any details (e.g., a draw number
> or Keno number) from the 4th tab.  I could extract elements of other tabs,
> like you did below, from the 3rd tab.
>
> Please let me know if you learn more or if you have other ideas for me to
> consider.
>
> Regards,
> Julio
>
> On Sun, Apr 19, 2020 at 7:00 PM John Kane  wrote:
>
>> I am a complete newbie too but try this
>> library(rvest)
>>Kenopage <- "
>> https://www.galottery.com/en-us/games/draw-games/keno.html#tab-winningNumbers
>> "
>>
>> Keno <- read_html(Kenopage)
>>
>> tt  <-  html_table(Keno, fill= TRUE)
>>
>> This should give you a list with 10 elements, each of which should be a
>> data.frame
>> Example
>>
>> ken1  <-  tt[[1]]
>> str(ken1)
>>
>> > str(ken1)
>> 'data.frame': 12 obs. of  4 variables:
>>  $ Numbers Matched : chr  "10" "9" "8" "7" ...
>>  $ Base Keno! Prize: chr  "$100,000*" "$5,000" "$500" "$50" ...
>>  $ + Bulls-Eye Prize   : chr  "$200,000*" "$20,000" "$1,500" "$100"
>> ...
>>  $ Keno! w/ Bulls-Eye Prize: chr  "$300,000" "$25,000" "$2,000" "$150" ...
>> >
>>
>> I figured this out a little while ago and just manually stepped through
>> the data.frames to get what I wanted. Brute force and stupidity but it
>> worked
>>
>> Someday I may figure out how to use things like SelectorGadget!
>>
>>
>>
>>
>> On Sun, 19 Apr 2020 at 17:46, Julio Farach  wrote:
>>
>>> John - I corrected my email below for typos.
>>>
>>> On Sun, Apr 19, 2020 at 5:42 PM Julio Farach  wrote:
>>>
 John,

 Yes, while I can execute the line of code that I provided, I am still
 unable to capture the table shown in the browser.  The last 10 draws are
 shown in a table if you view the page:

 https://www.galottery.com/en-us/games/draw-games/keno.html#tab-winningNumbers


 But, despite using CSS and XPath combinations of
 >html_nodes(x, CSS or XPath)
 I am unable to copy that table into R.

 One commenter on another forum received an error and suggested that
 perhaps bots lack permission to access the page.  But, I've used the
 Robotstxt package to ensure that bots are indeed permitted.

 Any thoughts?

 Regards,
 Julio

 On Sun, Apr 19, 2020 at 4:38 PM John Kane  wrote:

> Keno <- read_html(Kenopage) ?
>
> Or Am I misunderstanding the problem?
>
> On Sun, 19 Apr 2020 at 15:10, Julio Farach  wrote:
>
>> How do I scrape the last 10 Keno draws from the Georgia lottery into
>> R?
>>
>>
>> I'm trying to pull the last 10 draws of a Keno lottery game into R.
>> I've
>> read several tutorials on how to scrape websites using the rvest
>> package,
>> Chrome's Inspect Element, and CSS or XPath, but I'm likely stuck
>> because
>> the table I seek is dynamically generated using Javascript.
>>
>>
>>
>> I started with:
>>
>> >install.packages("rvest")
>>
>> >   library(rvest)
>>
>> >Kenopage <- "
>>
>> https://www.galottery.com/en-us/games/draw-games/keno.html#tab-winningNumbers
>> "
>>
>> > Keno <- Read.hmtl(Kenopage)
>>
>> From there, I've been unable to progress, despite hours spent on
>> combinations of CSS and XPath calls with "html_nodes."
>>
>> Failed example: DrawNumber <- Keno %>% rvest::html_nodes("body") %>%
>> xml2::xml_find_all("//span[contains(@class,'Draw Number')]") %>%
>> rvest::html_text()
>>
>>
>>
>> Someone mentioned using the V8 package in R, but it's new to me.
>>
>> How do I get started?
>>
>> --
>>
>> Julio Farach
>> https://www.linkedin.com/in/farach
>> cell phone:  804/363-2161
>> email:  jfar...@gmail.com
>>
>>>

Re: [R] Help to download data from multiple URLs with API key

2020-04-21 Thread Eric Berger
Hi Bhaskar,
Why not just create a function that does the repetitive work, such as

doOne <- function( suffix ) {
   base_url <- "abcd" # This remains constant
   b <-  "api_key"# the api key - this remains constant
   c <-  paste("series_id=",suffix,sep="")
   full_url = paste0(base_url, b, c)
   d3 <- lapply(fromJSON(file=full_url)[[2]], function(x) c(x["data"]))
   d3 <- do.call(rbind, d3)
   b <- as.data.frame(unlist(d3))
   write.csv(b)
}

Then,
suffixes <- ... (whatever)
for ( s in suffixes )
doOne( s )

You might need to also think about the filenames that you want to use in
the write.csv() command in the function doOne.
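
One way to handle the filenames is sketched below. It is an illustration under assumptions, not a drop-in replacement: fetch_series() is a hypothetical stand-in for the fromJSON() download step, so the example runs offline, and the "series_<suffix>.csv" naming scheme is invented.

```r
# Hypothetical stand-in for the download step (no network access needed)
fetch_series <- function(suffix) {
  data.frame(series = suffix, value = 1:3)
}

# Save one series to its own file, named after the suffix
save_one <- function(suffix, out_dir) {
  out_file <- file.path(out_dir, paste0("series_", suffix, ".csv"))
  write.csv(fetch_series(suffix), out_file, row.names = FALSE)
  out_file
}

out_dir  <- tempdir()
suffixes <- c("1", "2", "3")
files <- vapply(suffixes, save_one, character(1), out_dir = out_dir)
print(basename(files))
```

Returning the path from save_one() makes it easy to check afterwards which files were written.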

HTH,
Eric


On Tue, Apr 21, 2020 at 9:30 AM Bhaskar Mitra 
wrote:

> Hello Everyone,
>
> I am trying to download data from multiple websites using API key.
> The  code to download from one URL is given below.
>
> I have a list of multiple URLs where the suffix URL 'c' keeps changing.
>
> I would appreciate any help on how I can modify the code below so that it
> reads multiple URLs and saves the data from each URL as a separate
> csv file.
>
> thanks,
> bhaskar
>
>
> #---
> library(rjson)
> setwd(Input)
>
> base_url <- "abcd" # This remains constant
>
> b <-  "api_key"# the api key - this remains constant
>
> c <-  "series_id=1"# Only this suffix URL changes.  I have a list
> # of multiple such URLs with different series ids.
>
>
> full_url = paste0(base_url,
>   b,
>   c)
>
>
> d3 <- lapply(fromJSON(file=full_url)[[2]], function(x) c(x["data"]))
> d3 <- do.call(rbind, d3)
>
> b <- as.data.frame(unlist(d3))
> write.csv(b)
>
> #---
>
>
