Re: [R] facet_wrap(nrow) ignored

2020-09-10 Thread Ulrik Stervbo via R-help


Dear Ivan,

I don't think it is possible to force a number of rows - but I'm 
honestly just guessing.


What you can do is to add an empty plot. Here I use cowplot, but 
gridExtra should also work well.


I add a column to the initial data.frame that gives the row in which each 
plot should appear, split the data by that column, and loop over the pieces.


In the first variant, I add an unused factor level to grp, which creates an 
empty facet. I personally think this looks a little confusing, so in the 
second variant, I add a number of empty plots instead.


HTH
Ulrik

```
library(ggplot2)

mydf <- data.frame(
  grp = rep(letters[1:6], each = 15),
  cat = rep(1:3, 30),
  var = rnorm(90),
  # Row of the final layout in which each group should appear
  row_num = rep(c(1, 1, 2, 3, 4, 5), each = 15)
)

s_mydf <- split(mydf, mydf$row_num)

plots_mydf <- lapply(s_mydf, function(x){
  # Ensure no unused factor levels
  x$grp <- factor(x$grp)
  if(nlevels(x$grp) == 1){
    # Add an unused level so that facet_wrap(drop = FALSE) draws an empty facet
    x$grp <- factor(x$grp, levels = c(levels(x$grp), ""))
  }
  ggplot(data = x, aes(x = cat, y = var)) + geom_point() +
    facet_wrap(~grp, drop = FALSE)
})

cowplot::plot_grid(plotlist = plots_mydf, nrow = 5)

# Maybe more elegant output
plots_mydf <- lapply(s_mydf, function(x, ncol = 2){
  # Ensure no unused factor levels
  x$grp <- factor(x$grp)
  x <- split(x, x$grp)

  p <- lapply(x, function(x){
    ggplot(data = x, aes(x = cat, y = var)) + geom_point() +
      facet_wrap(~grp)
  })

  if(length(p) < ncol){
    # Pad the row with empty plots
    pe <- rep(list(ggplot() + theme_void()), ncol - length(p))
    p <- c(p, pe)
  }
  cowplot::plot_grid(plotlist = p, ncol = ncol)
})

cowplot::plot_grid(plotlist = plots_mydf, ncol = 1)

# Or if you prefer not to split the plots on the same row
plots_mydf <- lapply(s_mydf, function(x, ncol = 2){

  p <- list(ggplot(data = x, aes(x = cat, y = var)) + geom_point() +
    facet_wrap(~grp))

  if(length(unique(x$grp)) < ncol){
    pe <- rep(list(ggplot() + theme_void()), ncol - length(p))
    p <- c(p, pe)
  }else{
    ncol <- 1
  }
  cowplot::plot_grid(plotlist = p, ncol = ncol)
})

cowplot::plot_grid(plotlist = plots_mydf, ncol = 1)

```

On 2020-09-09 17:30, Ivan Calandra wrote:

Dear useRs,

I have an issue with the argument nrow of ggplot2::facet_wrap().

Let's consider some sample data:
mydf <- data.frame(grp = rep(letters[1:6], each = 15), cat = rep(1:3,
30), var = rnorm(90))

And let's try to plot with 5 rows:
library(ggplot2)
ggplot(data = mydf, aes(x = cat, y = var)) + geom_point() +
facet_wrap(~grp, nrow = 5)
It plots 2 rows and 3 columns rather than 5 rows and 2 columns as 
wanted.


These plots are as expected:
ggplot(data = mydf, aes(x = cat, y = var)) + geom_point() +
facet_wrap(~grp, nrow = 2)
ggplot(data = mydf, aes(x = cat, y = var)) + geom_point() +
facet_wrap(~grp, nrow = 6)

My guess is that 5 rows is not ideal for 6 facets (5 facets in 1st
column and only 1 facet for 2nd column) so it overrides the value of
nrow. In the case of 2 or 6 rows, the facets are well distributed in the
layout.
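
The guess above can be checked with a little arithmetic. This is only a sketch in base R, assuming that the facets fill the grid row by row and that the number of columns is derived from the requested nrow; the helper `rows_filled()` is hypothetical and is not part of ggplot2, and the thread does not confirm that ggplot2 works this way.

```r
# Hypothetical helper (not part of ggplot2): how many rows end up occupied
# when n facets are laid out row by row for a requested number of rows
rows_filled <- function(n, nrow) {
  ncol <- ceiling(n / nrow)   # columns needed to fit n facets in nrow rows
  ceiling(n / ncol)           # rows actually occupied when filling by row
}

# Rows occupied for 6 facets when 2, 5 and 6 rows are requested
sapply(c(2, 5, 6), rows_filled, n = 6)
```

Under these assumptions a request for 2 or 6 rows fills every row, whereas a request for 5 rows leaves rows with nothing to draw.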

The reason why I need 5 rows with 6 facets is that this facet plot is
part of a patchwork and I would like to have the same number of rows for
all facet plots of the patchwork (so that they all align well).

Is there a way to force the number of rows in the facet_wrap()?

Thank you in advance.
Best,
Ivan

--


--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



Re: [R] readxl question

2020-08-27 Thread Ulrik Stervbo via R-help

I clearly didn't read well enough. As Petr pointed out, there is also 
the col_names argument.


```
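# Solution 4a
# Like solution 4 (same files and ranges), but the cell address is passed as
# col_names so that the cell value is read as data and not used as a name

map_dfr(files, function(cur_file, ranges){
  map_dfc(ranges, function(cur_range){
    read_excel(cur_file, sheet = 1, col_names = cur_range, range = cur_range)
  })
}, ranges = ranges, .id = "filename")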

```

On 2020-08-27 17:33, Ulrik Stervbo via R-help wrote:


Re: [R] readxl question

2020-08-27 Thread Ulrik Stervbo via R-help

Hi Thomas,

I am not familiar with the use of the range argument, but it seems to me 
that the cell value becomes the column name. This might be fine, but you 
might get into trouble if you have repeated cell values, since 
as.data.frame() will alter duplicated names to make them unique.

I am also not sure about what you want, but this seems to capture your 
example (reading the same cells in a number of files):

```
library(readxl)

# Create test set
path <- readxl_example("geometry.xls")

read_xls(path) # See the content

example_file1 <- tempfile(fileext = ".xls")
example_file2 <- tempfile(fileext = ".xls")

file.copy(path, example_file1, overwrite = TRUE)
file.copy(path, example_file2, overwrite = TRUE)

# Solve the problem using loops
files <- c(example_file1, example_file2)
ranges <- c("B4", "C5", "D6")

fr <- lapply(ranges, function(cur_range, files){
  x <- lapply(files, read_xls, sheet = 1, range = cur_range)
  t(as.data.frame(x))
}, files = files)

# Loop over fr and save content if needed
```

Here are a couple of variations on the theme, where the cell content is 
accessed after reading the file. This will not work well if the data in 
the excel files does not start at A1, but if you can adjust for this it 
should work just fine.

```
# Solution 2

# Read the whole excel file, and access just the column - row
# This will give really unexpected results if the data does not start in
# cell A1 (geometry.xls is such a file). Also, it does not work with ranges
# spanning more than a single cell
# Note: with the default col_names = TRUE the first sheet row becomes the
# header, so data frame row i holds sheet row i + 1
files <- rep(readxl_example("datasets.xlsx"), 3)
ranges <- c("B4", "C5", "D6")

# Loop over the files to avoid re-reading
fr <- lapply(files, function(cur_file, ranges){
  df <- read_excel(cur_file, sheet = 1)
  x <- lapply(ranges, function(cur_range, df){
    cr <- cellranger::as.cell_addr(cur_range, strict = FALSE)
    df[cr$row, cr$col][[1]]
  }, df = df)
  as.data.frame(setNames(x, ranges))
}, ranges = ranges)

# Solution 3
# Like solution 2 but using purrr

library(purrr)

files <- rep(readxl_example("datasets.xlsx"), 3)
ranges <- c("B4", "C5", "D6")

map_dfr(files, function(cur_file, ranges){
  # Read each file only once
  df <- read_excel(cur_file, sheet = 1)
  map_dfc(ranges, function(cur_range, df){
    cr <- cellranger::as.cell_addr(cur_range, strict = FALSE)
    setNames(df[cr$row, cr$col], cur_range)
  }, df = df)
}, ranges = ranges)

# Solution 4
# Like solution 3, but with the addition of the file name and producing a
# single data.frame at the end

library(purrr)

path <- readxl_example("datasets.xls")
example_file1 <- tempfile(fileext = "_1.xls")
example_file2 <- tempfile(fileext = "_2.xls")
example_file3 <- tempfile(fileext = "_3.xls")

file.copy(path, example_file1, overwrite = TRUE)
file.copy(path, example_file2, overwrite = TRUE)
file.copy(path, example_file3, overwrite = TRUE)

files <- c(example_file1, example_file2, example_file3)

# Name the file paths with the file names. We can then make use of the .id
# argument to map_dfr()
files <- setNames(files, basename(files))
ranges <- c("B4", "C5", "D6")

map_dfr(files, function(cur_file, ranges){
  # Read each file only once
  df <- read_excel(cur_file, sheet = 1)
  map_dfc(ranges, function(cur_range, df){
    cr <- cellranger::as.cell_addr(cur_range, strict = FALSE)
    setNames(df[cr$row, cr$col], cur_range)
  }, df = df)
}, ranges = ranges, .id = "filename")
```
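
The solutions above index a data frame by the row and column of a cell address. As a rough illustration of that mapping, here is a base R sketch that needs neither readxl nor cellranger. The helper `cell_to_index()` is hypothetical, and the sketch assumes single-cell addresses, a sheet that starts at A1, and a header in the first sheet row.

```r
# Hypothetical helper: convert a single-cell address such as "B4" into
# sheet row and column numbers using base R only
cell_to_index <- function(address) {
  col_letters <- toupper(gsub("[^A-Za-z]", "", address))
  row_number  <- as.integer(gsub("[^0-9]", "", address))
  # Column letters are a base-26 number with A = 1, ..., Z = 26
  digits <- utf8ToInt(col_letters) - utf8ToInt("A") + 1L
  col_number <- sum(digits * 26^rev(seq_along(digits) - 1))
  list(row = row_number, col = as.integer(col_number))
}

addr <- cell_to_index("B4")

# With a header in sheet row 1, sheet row n is data frame row n - 1
data_row <- addr$row - 1L
```

If the header offset matters for your files, the data frame row to look up is `data_row` and not `addr$row`.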

HTH
Ulrik

On 2020-08-26 15:38, PIKAL Petr wrote:

Hi

As the OP has only about 250 files, and read_excel() cannot take several
ranges at once, reading those values separately and concatenating them
together in one step seems to be the most efficient way. One could probably
design such a function, but the time spent writing a function that performs
the task only once is probably greater than the time needed for 250*3 reads.

I do see inefficiency in writing each column into a separate text file and
copying it back to an Excel file.

Cheers
Petr

-Original Message-
From: Upton, Stephen (Steve) (CIV) 
Sent: Wednesday, August 26, 2020 2:44 PM
To: PIKAL Petr ; Thomas Subia 

Cc: r-help@r-project.org
Subject: RE: [R] readxl question

From your example, it appears you are reading in the same excel file for
each function to get a value. I would look at creating a function that
extracts what you need from each file all at once, rather than separate
reads.

Stephen C. Upton
SEED (Simulation Experiments & Efficient Designs) Center for Data Farming

SEED Center website: https://harvest.nps.edu

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of PIKAL Petr

Sent: Wednesday, August 26, 2020 3:50 AM
To: Thomas Subia 
Cc: r-help@r-project.org
Subject: Re: [R] readxl question

NPS WARNING: *external sender* verify before acting.

Hi

Are you sure that your command read values from respective cells?

I tried it and got empty data frame with names
> WO <- lapply(files, read_excel, sheet=1, range=("B3"))
> as.data.frame(WO)
[1] ano TP303   X96
[4] X0  X3.7518 X26.7
<0 rows> (or

Re: [R] Binomial PCA Using pcr()

2020-08-19 Thread Ulrik Stervbo via R-help


Hi Prasad,

I think this might be a problem with the package, and you can try to 
contact the package author.


The error seems to arise because pcr() cannot find the 
'negative-binomial' distribution:


```
library(qualityTools)
x <- rnbinom(500, mu = 4, size = 100)
pcr(x, distribution = "negative-binomial")
```

When I look in the code of pcr(), there is some testing against the 
words 'negative binomial' (note the missing -), although the 
documentation clearly lists 'negative-binomial' as a possible 
distribution.
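
To illustrate how such a spelling mismatch can produce a "could not be found" error, here is a purely hypothetical sketch. The function `find_distribution()` is made up for this example and is NOT the code of pcr(); it only shows an exact string comparison failing on the hyphen.

```r
# Hypothetical stand-in for a function that looks up a distribution by name;
# this is NOT the code of qualityTools::pcr()
find_distribution <- function(distribution) {
  known <- c("normal", "negative binomial")   # spelled with a space
  if (!distribution %in% known) {
    stop("distribution could not be found!")
  }
  distribution
}

# The hyphenated spelling fails the exact comparison ...
res_hyphen <- tryCatch(find_distribution("negative-binomial"),
                       error = function(e) conditionMessage(e))

# ... unless the name is normalised first
res_space <- find_distribution(gsub("-", " ", "negative-binomial", fixed = TRUE))
```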


Unfortunately changing 'negative-binomial' to 'negative binomial' does 
not help, as


```
pcr(x, distribution = "negative binomial")
```

throws the error "object '.confintnbinom' not found" and a lot of 
warnings.


Best,
Ulrik


On 2020-08-12 12:50, Prasad DN wrote:

Hi All,

I am very new to R and need guidance.

I need help doing a process capability analysis for my data set (6 months of
data), given in the format below:

Date   |   Opportunities  |  Defectives | DefectivesInPercent

I searched and found that pcr() from the qualityTools package can be used for
this purpose.  The USL is 2% defectives.

MyData = read.csv(file.choose())   # select the CSV file that has data in the above-mentioned format
x <- MyData$DefectivesInPercent

pcr(x, distribution = "negative-binomial", usl=0.02)

I get error message as:
Error in pcr(x, distribution = "negative-binomial", usl = 0.02) :
  y distribution could not be found!

Please advise, how to proceed?

Regards,
Prasad DN


Re: [R] Help with read.csv.sql()

2020-07-29 Thread Ulrik Stervbo via R-help

True, but the question was also how to control for formats and naming columns 
while loading the file.

The only way I know how to do this (sans work on my part) is through the 
functions in readr. So, 50% on topic :-)

Best,
Ulrik

On 29 Jul 2020, at 17:59, Rasmus Liland wrote:
>Dear Ulrik,
>
>On 2020-07-29 17:14 +0200, Ulrik Stervbo via R-help wrote:
>> library(readr)
>> read_csv(
>
>This thread was about
>sqldf::read.csv.sql ...
>
>What is the purpose of bringing up
>readr::read_csv?  I am unfamiliar with
>it, so it might be a good one.
>
>Best,
>Rasmus

Re: [R] Help with read.csv.sql()

2020-07-29 Thread Ulrik Stervbo via R-help


You might achieve this using readr:

```
library(readr)

lines <- "Id, Date, Time, Quality, Lat, Long
STM05-1, 2005/02/28, 17:35, Good, -35.562, 177.158
STM05-1, 2005/02/28, 19:44, Good, -35.487, 177.129
STM05-1, 2005/02/28, 23:01, Unknown, -35.399, 177.064
STM05-1, 2005/03/01, 07:28, Unknown, -34.978, 177.268
STM05-1, 2005/03/01, 18:06, Poor, -34.799, 177.027
STM05-1, 2005/03/01, 18:47, Poor, -34.85, 177.059
STM05-2, 2005/02/28, 12:49, Good, -35.928, 177.328
STM05-2, 2005/02/28, 21:23, Poor, -35.926, 177.314"

read_csv(lines)

# Rename the columns and set their types while reading
read_csv(
  lines,
  skip = 1, # Ignore the header row
  col_names = c("myId", "myDate", "myTime", "myQuality", "myLat", "myLong"),
  col_types = cols(
    myDate = col_date(format = ""),
    myTime = col_time(format = ""),
    myLat = col_number(),
    myLong = col_number(),
    .default = col_character()
  )
)

# Keep the original names but read only some of the columns
read_csv(
  lines,
  col_types = cols_only(
    Id = col_character(),
    Date = col_date(format = ""),
    Time = col_time(format = "")
  )
)

# Rename the columns and read only some of them
read_csv(
  lines,
  skip = 1, # Ignore the header row
  col_names = c("myId", "myDate", "myTime", "myQuality", "myLat", "myLong"),
  col_types = cols_only(
    myId = col_character(),
    myDate = col_date(format = ""),
    myTime = col_time(format = "")
  )
)
```

HTH
Ulrik

On 2020-07-20 02:07, H wrote:

On 07/18/2020 01:38 PM, William Michels wrote:

Do either of the postings/threads below help?

https://r.789695.n4.nabble.com/read-csv-sql-to-select-from-a-large-csv-file-td4650565.html#a4651534
https://r.789695.n4.nabble.com/using-sqldf-s-read-csv-sql-to-read-a-file-with-quot-NA-quot-for-missing-td4642327.html

Otherwise you can try reading through the FAQ on Github:

https://github.com/ggrothendieck/sqldf

HTH, Bill.

W. Michels, Ph.D.



On Sat, Jul 18, 2020 at 9:59 AM H  wrote:

On 07/18/2020 11:54 AM, Rui Barradas wrote:

Hello,

I don't believe that what you are asking for is possible but like 
Bert suggested, you can do it after reading in the data.
You could write a convenience function to read the data, then change 
what you need to change.

Then the function would return this final object.
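
A minimal sketch of such a convenience function, in base R. Everything here is an assumption made for illustration: the name `read_and_tidy()`, the example data, the new column names and the month/day/year date format. read.csv() stands in for read.csv.sql() so that the sketch runs without sqldf.

```r
# Hypothetical wrapper: read the data, then rename columns and convert dates.
# read.csv() stands in for read.csv.sql() to keep the sketch self-contained.
read_and_tidy <- function(..., new_names, date_col, date_format) {
  df <- read.csv(..., stringsAsFactors = FALSE)
  names(df) <- new_names
  df[[date_col]] <- as.Date(df[[date_col]], format = date_format)
  df
}

# Made-up example data with a date in month/day/year order
csv_text <- "Some Id,Visit Date,Score
a1,07/17/2020,3.5
b2,07/18/2020,4.1"

tidy <- read_and_tidy(
  text = csv_text,
  new_names = c("id", "visit_date", "score"),
  date_col = "visit_date",
  date_format = "%m/%d/%Y"
)
str(tidy)
```

The same wrapper idea should carry over to read.csv.sql() by swapping the reading call, since the renaming and the date conversion happen after the data are in memory.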

Rui Barradas

Às 16:43 de 18/07/2020, H escreveu:


On 07/17/2020 09:49 PM, Bert Gunter wrote:
Is there some reason that you can't make the changes to the data 
frame (column names, as.Date(), ...) *after* you have read all 
your data in?


Do all your csv files use the same names and date formats?


Bert Gunter

"The trouble with having an open mind is that people keep coming 
along and sticking things into it."

-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Jul 17, 2020 at 6:28 PM H wrote:


 I have created a dataframe with columns that are characters, 
integers and numeric and with column names assigned by me. I am 
using read.csv.sql() to read portions of a number of large csv 
files into this dataframe, each csv file having a header row with 
column names.


 The problem I am having is that the csv files have header 
rows with column names that are slightly different from the column 
names I have assigned in the dataframe and it seems that when I 
read the csv data into the dataframe, the column names from the 
csv file replace the column names I chose when creating the 
dataframe.


 I have been unable to figure out if it is possible to assign 
column names of my choosing in the read.csv.sql() function? I have 
tried various variations but none seem to work. I tried colClasses 
= c() but that did not work, I tried field.types = c(...) but 
could not get that to work either.


 It seems that the above should be feasible but I am missing 
something? Does anyone know?


 A secondary issue is that the csv files have a column with a 
date in mm/dd/ format that I would like to make into a Date 
type column in my dataframe. Again, I have been unable to find a 
way - if at all possible - to force a conversion into a Date 
format when importing into the dataframe. The best I have so far 
is to import it as a character column and then use as.Date() to later 
force the conversion of the dataframe column.


 Is it possible to do this when importing using 
read.csv.sql()?




Yes, the files use the same column names and date format (at least 
as far as I know now.) I agree I could do it as you suggest above 
but from a purist perspective I would rather do it when importing 
the data using read.csv.sql(), particularly if column names and/or 
date format might change, or be different between different files. 
I am indeed selecting rows from a large number of

Re: [R] Dataframe with different lengths

2020-07-29 Thread Ulrik Stervbo via R-help


Hi Pedro,

I see you use dplyr and ggplot2. Are you looking for something like 
this:


```
library(ggplot2)
library(dplyr)

test_data <- data.frame(
  year = c(rep("2018", 10), rep("2019", 8), rep("2020", 6)),
  value = sample(c(1:100), 24)
)

# Cumulative sum and running position within each year
test_data <- test_data %>%
  group_by(year) %>%
  mutate(cumsum_value = cumsum(value),
         x_pos = 1:n())

ggplot(test_data) +
  aes(x = x_pos, y = cumsum_value, colour = year) +
  geom_point()
```

Best,
Ulrik

On 2020-07-22 13:16, Pedro páramo wrote:

Hi all,

I am trying to draw a plot with cumsum values, but each "line" has a
different length

library(dplyr)
library(tibble)
library(lubridate)
library(PerformanceAnalytics)
library(quantmod)
library(ggplot2)

getSymbols('TSLA')

I want to create the variables:

a<-cumsum(dailyReturn(TSLA, subset = c('2019')) )
b<-cumsum(dailyReturn(TSLA, subset = c('2020')) )
c<-cumsum(dailyReturn(TSLA, subset = c('2018')) )

Each value, on a,b,c has two columns date, and values.

I want to plot the three lines in one plot, with the x-axis as long as the
longest of a, b and c (in this case a, which has 252 values), and draw the
other two lines against the same axis. Perhaps I should use x <- 1:252 for
the axis, but I have not managed to do this so far.
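
One way to put series of different lengths on a common 1:252 axis is to pad the shorter ones with NA. The following is only a sketch in base R with simulated numbers instead of the TSLA returns; apart from the 252 values mentioned for 2019, the series lengths are made up.

```r
# Simulated stand-ins for the three yearly cumulative return series
set.seed(1)
a <- cumsum(rnorm(252, sd = 0.02))      # 2019, 252 values as in the question
b <- cumsum(rnorm(140, sd = 0.02))      # 2020, length is an assumption
c_ret <- cumsum(rnorm(251, sd = 0.02))  # 2018, length is an assumption

series <- list("2019" = a, "2020" = b, "2018" = c_ret)
n_max <- max(lengths(series))

# Pad every series with NA up to the length of the longest one
padded <- sapply(series, function(x) {
  length(x) <- n_max
  x
})

# One line per column; NA values are simply not drawn
matplot(seq_len(n_max), padded, type = "l", lty = 1,
        xlab = "Trading day", ylab = "Cumulative return")
legend("topleft", legend = colnames(padded), col = 1:3, lty = 1)
```

Assigning to `length(x)` extends a vector with NA, so all three series fit into one matrix with a column per year.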


Re: [R] Arranging ggplot2 objects with ggplotGrob()

2020-07-29 Thread Ulrik Stervbo via R-help


Then this should work:

```
library(ggplot2)
library(cowplot)

p1 <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point()
p2 <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width * 1000)) +
  geom_point()

plot_grid(p1, p2, ncol = 1, align = "hv", rel_heights = c(2, 1), axis = "t")

# Remove the x axis of the upper plot
p1 <- p1 + theme(
  axis.text.x = element_blank(),
  axis.title.x = element_blank(),
  axis.ticks.x = element_blank()
)

plot_grid(p1, p2, ncol = 1, align = "hv", rel_heights = c(2, 1), axis = "t")

# You can play around with ggplot2 plot.margin to further reduce the space
p1 <- p1 + theme(
  plot.margin = margin(b = -6)
)

p2 <- p2 + theme(
  plot.margin = margin(t = -6)
)

plot_grid(p1, p2, ncol = 1, align = "hv", rel_heights = c(2, 1), axis = "t")

```

Best,
Ulrik


Re: [R] Arranging ggplot2 objects with ggplotGrob()

2020-07-28 Thread Ulrik Stervbo via R-help


Would this work:

```
library(ggplot2)
library(cowplot)

p1 <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point()
p2 <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width * 1000)) +
  geom_point()

plot_grid(p1, p2, ncol = 1, align = "hv", rel_heights = c(2, 1))
```

Best,
Ulrik

On 2020-07-24 21:58, Bert Gunter wrote:

?grid.frame, etc. should be straightforward for this I would think.
But of course you have to resort to the underlying grid framework rather
than the ggplot2 interface.
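
A sketch of what working at the grid/gtable level might look like for the question in this thread. It assumes that each plot has exactly one panel and uses stand-in plots built from iris, since the original plots are not shown; the 3:1 ratio is the one mentioned in the question.

```r
library(ggplot2)
library(gtable)
library(grid)

# Stand-in plots; the thread does not show the original ones
p_top <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) + geom_point()
p_bottom <- ggplot(iris, aes(Petal.Length, Petal.Width * 1000)) + geom_point()

g_top <- ggplotGrob(p_top)
g_bottom <- ggplotGrob(p_bottom)

# Stack the two plots and align their widths, as in the original question
g <- rbind(g_top, g_bottom, size = "first")
g$widths <- unit.pmax(g_top$widths, g_bottom$widths)

# Locate the layout rows that hold the panels and give them relative heights
panel_rows <- g$layout$t[grepl("panel", g$layout$name)]
g$heights[panel_rows] <- unit(c(3, 1), "null")

grid.newpage()
grid.draw(g)
```

The "null" units share the remaining vertical space in proportion to their values, so the upper panel is drawn three times as tall as the lower one while the column alignment is kept.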

Bert Gunter

"The trouble with having an open mind is that people keep coming along 
and

sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Jul 24, 2020 at 12:11 PM H  wrote:


On 07/24/2020 02:50 PM, H wrote:
> On 07/24/2020 02:03 PM, Jeff Newmiller wrote:
>> The set of people interested in helping when you supply a minimal
>> reproducible example is rather larger than the set of people willing to
>> read the documentation for you (hint) and guess what aspect of alignment
>> you are having trouble with.
>>
>> On July 24, 2020 10:46:57 AM PDT, H  wrote:
>>> On 07/24/2020 01:14 PM, John Kane wrote:
 Well, I am not looking for help debugging my code but for
>>> information to better understand arranging plots vertically. The code
>>> above aligns them horizontally as expected.
 Sigh, we know the code works but we do not know what the plots are
>>> and we cannot play around with them to see if we can help you if we
>>> have nothing to work with.
 On Fri, 24 Jul 2020 at 12:12, H >> > wrote:
 On 07/24/2020 05:29 AM, Erich Subscriptions wrote:
 > Hav a look at the packages cowplot and patchwork
 >
 >> On 24.07.2020, at 02:36, H >> > wrote:
 >>
 >> I am trying to arrange two plots vertically, ie plot 2 below
>>> plot 1, where I want the plots to align columnwise but have a height
>>> ratio of eg 3:1.
 >>
 >> My attempts so far after consulting various webpages is that
>>> the following code aligns them columnwise correctly but I have, so far,
>>> failed in setting the relative heights...
 >>
 >> g2<-ggplotGrob(s)
 >> g3<-ggplotGrob(v)
 >> g<-rbind(g2, g3, size = "first")
 >> g$widths<-unit.pmax(g2$widths, g3$widths)
 >>
 >> what would the appropriate statement for the relative heights
>>> to add here be?
 >>
 >> grid.newpage()
 >> grid.draw(g)
 >>
 >> Thank you!
 >>
 >> __
 >> R-help@r-project.org  mailing
>>> list -- To UNSUBSCRIBE and more, see
 >> https://stat.ethz.ch/mailman/listinfo/r-help
 >> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
 >> and provide commented, minimal, self-contained, reproducible
>>> code.
 So this is not possible without using one of those two packages?
>>> I got the impression I should be able to use grid.arrange to do so but
>>> was not able to get it to work without disturbing the width alignment
>>> above...
 __
 R-help@r-project.org  mailing list
>>> -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible
>>> code.
 --
 John Kane
 Kingston ON Canada
>>> No need to play around with anything. I am simply looking for
>>> assistance on how to use eg arrangeGrob to not only align two plots
>>> columnwise but also adjust their heights relative to each other rather
>>> than 1:1.
>>>
>>> Can arrangeGrob() be used for that?
>>>
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
> Look at
> https://cran.r-project.org/web/packages/egg/vignettes/Ecosystem.html
> where there are two mpg charts, one above the other. What would I need to
> add to:
>
> library(gtable)
> g2 <- ggplotGrob(p2)
> g3 <- ggplotGrob(p3)
> g <- rbind(g2, g3, size = "first")
> g$widths <- unit.pmax(g2$widths, g3$widths)
> grid.newpage()
> grid.draw(g)
>
> to make the second chart 1/2 the size of the top one?
>
The following code aligns the two plot areas of the two charts perfectly
but they are the same height whereas I want to make the bottom one 1/2 as
tall as the top one:

g2<-ggplotGrob(s)
g3<-ggplotGrob(v)
g<-rbind(g2, g3, size = "first")
g$widths<-unit.pmax(g2$widths,

Re: [R] Filtering using multiple rows in dplyr

2018-05-31 Thread Ulrik Stervbo via R-help


Hi Sumitrajit,

dplyr has a function for this - it's called filter.

For each group you can count the number of SNR > 3 (you can use sum on 
true/false). You can filter on the results directly or add a column as 
you plan. The latter might make your intention more clear.
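
A minimal sketch of the "add a column" variant, including the "three consecutive L2" condition from the question. The toy data and the helper has_run() are made up for illustration; only the column names subject, freq, L2, SNR and clean come from the original post, and SNR is assumed to contain no missing values.

```r
library(dplyr)

# TRUE if the logical vector x contains at least n consecutive TRUE values
has_run <- function(x, n = 3) {
  r <- rle(x)
  any(r$values & r$lengths >= n)
}

# Toy data standing in for 'h'; only the columns used here
h <- data.frame(
  subject = rep(c("S1", "S2"), each = 5),
  freq    = 2,
  L2      = rep(seq(0, 8, by = 2), 2),
  SNR     = c(5, 4, 6, 1, 2,
              5, 1, 4, 1, 6)
)

h_flagged <- h %>%
  group_by(subject, freq) %>%
  arrange(L2, .by_group = TRUE) %>%          # "consecutive" refers to L2 order
  mutate(clean = if (has_run(SNR > 3)) "Y" else "N") %>%
  ungroup()

# Keep only the groups that qualified
h_clean <- filter(h_flagged, clean == "Y")
```

If only the number of SNR > 3 per group matters, and not whether they are consecutive, sum(SNR > 3) >= 3 inside mutate() or filter() would be enough.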


HTH
Ulrik

On 2018-05-30 18:18, Sumitrajit Dhar wrote:


Hi Folks,

I have just started using dplyr and could use some help getting
unstuck. It could well be that dplyr is not the package to be using,
but let me just pose the question and seek your advice.

Here is my basic data frame.

head(h)
   subject ageGrp ear hearingGrp sex freq L2       Ldp     Phidp        NF       SNR
1 HALAF032      A   L          A   F    2  0 -23.54459  55.56005 -43.08282 19.538232
2 HALAF032      A   L          A   F    2  2 -32.64881  86.22040 -23.31558 -9.333224
3 HALAF032      A   L          A   F    2  4 -18.91058  42.12168 -35.60250 16.691919
4 HALAF032      A   L          A   F    2  6 -23.85937 297.94499 -20.70452 -3.154846
5 HALAF032      A   L          A   F    2  8 -14.45381 181.75329 -24.17094  9.717128
6 HALAF032      A   L          A   F    2 10 -20.42384  67.12998 -35.77357 15.349728

'subject' and 'freq' together make a set of data and I am interested
in how the last four columns vary as a function of L2. So I grouped by
'subject' and 'freq' and can look at basic summaries.

h_byFunc <- h %>% group_by(subject, freq)


h_byFunc %>% summarize(l = mean(Ldp), s = sd(Ldp) )


# A tibble: 1,175 x 4
# Groups:   subject [?]
   subject   freq       l     s
 1 HALAF032     2 -13.8    8.39
 2 HALAF032     4 -15.8   11.0
 3 HALAF032     8 -23.4    6.51
 4 HALAF033     2 -14.2    9.64
 5 HALAF033     4 -12.3    8.92
 6 HALAF033     8  -6.55  12.3
 7 HALAF036     2 -14.9   12.6
 8 HALAF036     4 -16.7   11.2
 9 HALAF036     8 -21.7    6.56
10 HALAF039     2   0.242 12.4
# ... with 1,165 more rows

What  I would like to do is filter some groups out based on various
criteria. For example, if SNR > 3 in three consecutive L2 within a
group, that group qualifies and I would add a column, say "clean" and
assign it a value "Y." Is there a way to do this in dplyr or should I
be looking at a different way.

Thanks in advance for your help.

Regards,
Sumit


Re: [R] removing part of a string

2018-05-21 Thread Ulrik Stervbo

I would use

sub("\\(.*\\)", "()", s)

It is essentially the same as Rui's suggestion, but I find the purpose to
be more clear. It might also be a little more efficient.
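
A quick check on the two strings from the question. The third string ss3 is a hypothetical extra case that I added to show where the greedy .* would remove more than intended.

```r
ss1 <- "z:f(5, a=3, b=4, c='1:4', d=2)"
ss2 <- "f(5, a=3, b=4, c=\"1:4\", d=2)*z"

res1 <- sub("\\(.*\\)", "()", ss1)
res2 <- sub("\\(.*\\)", "()", ss2)
res1  # "z:f()"
res2  # "f()*z"

# .* is greedy: it matches from the first "(" to the last ")"
ss3 <- "f(1)+g(2)"                         # hypothetical input with two calls
greedy   <- sub("\\(.*\\)", "()", ss3)     # "f()"
per_call <- gsub("\\([^()]*\\)", "()", ss3) # "f()+g()"
```

For strings with a single pair of parentheses, as in the question, both patterns give the same result.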

HTH
Ulrik

On Mon, 21 May 2018, 15:38 Rui Barradas,  wrote:

> Hello,
>
> Try this.
>
>
> ss1 <- "z:f(5, a=3, b=4, c='1:4', d=2)"
> ss2 <- "f(5, a=3, b=4, c=\"1:4\", d=2)*z"
>
> fun <- function(s) sub("(\\().*(\\))", "\\1\\2", s)
>
> fun(ss1)
> #[1] "z:f()"
>
> fun(ss2)
> #[1] "f()*z"
>
>
> Hope this helps,
>
> Rui Barradas
>
> On 5/21/2018 2:33 PM, Vito M. R. Muggeo wrote:
> > dear all,
> > I am stuck on the following problem. Give a string like
> >
> > ss1<- "z:f(5, a=3, b=4, c='1:4', d=2)"
> >
> > or
> >
> > ss2<- "f(5, a=3, b=4, c=\"1:4\", d=2)*z"
> >
> > I would like to remove all entries within parentheses.. Namely, I aim to
> > obtain respectively
> >
> > "z:f()" or "f()*z"
> >
> > I played with sub() and gsub() but without success..
> > Thank you very much for your time,
> >
> > best,
> > vito
> >
> >
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] questions on subscores.

2018-05-14 Thread Ulrik Stervbo

I have no idea, but Google pointed me to this
https://cran.r-project.org/web/packages/subscore/index.html

Hth
Ulrik

"Hyunju Kim"  schrieb am Di., 15. Mai 2018, 07:21:

>
>
>
> Hello,
>
>
>
> I want to compute  Wainer et al's augmented subscore(2001) using IRT but I
> can't find any packages or relevant code.
>
> There is a 'subscore' package in R, but I think it only gives functions to
> calculate augmented subscores using CTT(classical test theory).
>
> Is there other package or code to compute subscores using IRT? If it is
> not, is there any plan to develop it in short period?
>
>
>
>
>
>
>
> Best regards,
>
> Hyunju Kim
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Quandl data download error

2018-05-14 Thread Ulrik Stervbo

Hi Christofer,

It works for me. Perhaps you need to update a package?

Best wishes,
Ulrik

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
[1] LC_CTYPE=de_DE.UTF-8   LC_NUMERIC=C
[3] LC_TIME=de_DE.UTF-8LC_COLLATE=de_DE.UTF-8
[5] LC_MONETARY=de_DE.UTF-8LC_MESSAGES=de_DE.UTF-8
[7] LC_PAPER=de_DE.UTF-8   LC_NAME=C
[9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] Quandl_2.8.0 xts_0.10-2   zoo_1.8-1

loaded via a namespace (and not attached):
[1] httr_1.3.1  compiler_3.4.4  R6_2.2.2tools_3.4.4
[5] curl_3.2grid_3.4.4  jsonlite_1.5lattice_0.20-35


On Mon, 14 May 2018 at 13:53 Christofer Bogaso 
wrote:

> Hi,
>
> I use Quandl package to download data from Quandl https://www.quandl.com
>
> Today when I tried to download data from there, I received below error :
>
> > Quandl('LME/PR_CO')
> Error in curl::curl_fetch_memory(url, handle = handle) :
>gnutls_handshake() failed: An unexpected TLS packet was received.
>
> I am using Quandl_2.8.0 in below platform
>
> R version 3.4.4 (2018-03-15)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 16.04.3 LTS
>
> Any idea why I am getting above error suddenly?
>
> Thanks for your help
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] using apply

2018-05-02 Thread Ulrik Stervbo

Hi Neha,

Perhaps merge() from base or the join functions from dplyr are what you are
looking for. data.table could also be interesting.

Hth
Ulrik

On Wed, 2 May 2018, 21:28 Neha Aggarwal,  wrote:

>  Hi
>
> I have 3 dataframes, a,b,c with 0/1 values...i have to check a condition
> for dataframe a and b and then input the rows ids to datframe c . In the if
> condition, I AND the 2 rows of from a and b and then see if the result is
> equal to one of them.
> I have done this using a for loop, however, it takes a long time to execute
> with larger dataset..Can you help me do it using apply function so that i
> can do it faster?
>
> a
>   V1.x V2.x V3.x V1.y V2.y V3.y
> 1110101
> 2101101
> 3111101
> 4000101
>
> b
>   V1 V2 V3 V4 V5 V6
> 1  1  0  1  1  0  0
> 2  1  0  1  0  0  0
>
> c
>  xy
> 1  21
> 2  22
> 3  31
> 4  32
>
> for(i in 1:nrow(a)){
>   for(j in 1:nrow(b)){
> if(all((a[i,][j,])==b[j,]))
> { c[nrow(c)+1, ]<-c(paste(i,j)
>   }
>   }
> }
>
>
> Thanks,
> Neha
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Hacked

2018-04-17 Thread Ulrik Stervbo

I asked the moderators about it. This is the reply:

"Other moderators have looked into this a bit and may be able to shed more
light on it. This is a "new" tactic where the spammers appear to reply to
the r-help post. They are not, however, going through the r-help server.

It also seems that this does not happen to everyone.

I am not sure how you can automatically block the spammers.

Sorry I cannot be of more help."

--Ulrik

Jeff Newmiller  schrieb am Di., 17. Apr. 2018,
14:59:

> Likely a spammer has joined the mailing list and is auto-replying to posts
> made to the list. Unlikely that the list itself has been "hacked". Agree
> that it is obnoxious.
>
> On April 17, 2018 5:01:10 AM PDT, Neotropical bat risk assessments <
> neotropical.b...@gmail.com> wrote:
> >Hi all,
> >
> >Site has been hacked?
> >Bad SPAM arriving
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
> --
> Sent from my phone. Please excuse my brevity.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Help with R-Calling forth csv.

2018-04-16 Thread Ulrik Stervbo

There are plenty of options for reading csv files. For a built-in solution
look at ?read.csv, or look at read_csv from the package readr.

If the measurements are ordered in columns rather than in rows, reading the
data can be very slow.
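
A minimal sketch of reading a csv and fitting one candidate distribution with fitdistr from MASS. The file is generated on the fly so the example is self-contained; the column name pressure_kPa and the simulated Weibull data are assumptions, not the original poster's data.

```r
library(MASS)  # recommended package shipped with R; provides fitdistr()

# Write a small example file (stand-in for the real csv)
set.seed(1)
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(pressure_kPa = rweibull(200, shape = 2, scale = 50)),
  csv_path, row.names = FALSE
)

# Read it back; for a real file, replace csv_path with its path
pressure <- read.csv(csv_path)
str(pressure)

# Maximum-likelihood fit of a Weibull distribution
fit <- fitdistr(pressure$pressure_kPa, densfun = "weibull")
fit$estimate

# Histogram of the data with the fitted density on top
hist(pressure$pressure_kPa, freq = FALSE,
     main = "Pressure with fitted Weibull", xlab = "Pressure [kPa]")
curve(dweibull(x, shape = fit$estimate["shape"], scale = fit$estimate["scale"]),
      add = TRUE)
```

Other candidates can be fitted the same way by changing densfun, and the fits can then be compared, for example with AIC(fit).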

HTH
Ulrik

Mohammad Areida  schrieb am Mo., 16. Apr. 2018, 13:25:

> Hi, I'm working on R trying to find a distribution that fits data from a
> csv file. The csv contains data on pressure exerted by a certain vehicle in
> terms of pressure [kPa] and I have around 3000 data points.
>
> I want to call forth this csv and by using (fitdistr) or if you could
> recommend a function to use, get a plot of my csv and the distributions I
> can compare it to (Weibull, chi, beta, etc). Now im a complete amateur with
> the program R, and I can't write a code to call forth the csv with every
> data point. I’ve been stuck trying to write a code that works for over a
> week and have not gotten it to work. I've come to the conclusion that this
> is the only program capable enough to help me plot the distributions to my
> data, any help is greatly appreciated!
>
> Kind regards
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] regex for "[2440810] / www.tinyurl.com/hgaco4fha3"

2018-02-20 Thread Ulrik Stervbo

Hi Omar,

You are almost there! Your first substitution looks for 'www' at the
start of the line followed by any single character. Your string does not
start with 'www', so nothing is replaced, and your second substitution
then removes everything from the first '.' it finds (which is the one
right after www).

What you want to do is
x <- "[2440810] / www.tinyurl.com/hgaco4fha3"

y <- sub('www\\.', '', x) # Note the escape of '.'
y <- sub('\\..*', '', y)
y

Alternatively, all in one (if all addresses are .com)
gsub("(www\\.|\\.com.*)", "", x)

And the same using stringr
library(stringr)
x %>% str_replace_all("(www\\.|\\.com.*)", "")

HTH
Ulrik

On Wed, 21 Feb 2018 at 06:20 Omar André Gonzáles Díaz <
oma.gonza...@gmail.com> wrote:

> Hi, I need help for cleaning this:
>
> "[2440810] / www.tinyurl.com/hgaco4fha3"
>
> My desired output is:
>
> "[2440810] / tinyurl".
>
> My attemps:
>
> stringa <- "[2440810] / www.tinyurl.com/hgaco4fha3"
>
> b <- sub('^www.', '', stringa) #wanted  to get rid of "www." part. Until
> first dot.
>
> b <- sub('[.].*', '', b) #clean from ".com" until the end.
>
> b #returns ""[2440810] / www"
>
> Thank you.
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Help with regular expressions

2018-02-12 Thread Ulrik Stervbo

I think I would replace every , with . and subsequently turn each leading .
back into , using ^\\.

x <- gsub(",", ".", x)
gsub("^\\.", ",", x)

It's not so elegant, but it is easier to understand than backreferences and
complex regex.
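
A small sketch of both routes. The two-step version assumes the entries are separate elements of a character vector, each starting with its separating comma; the shortened example strings are made up. For one long string, a single substitution restricted to digits inside parentheses seems safer, even though it uses backreferences.

```r
# Two-step approach on a vector of entries (assumed layout)
entries <- c(",THETA1", ",THETA2", ",SIGMA(1,1)", ",SIGMA(2,1)")
step1 <- gsub(",", ".", entries)      # every comma becomes a period
step2 <- sub("^\\.", ",", step1)      # the leading period becomes a comma again
step2

# One long string: only touch a comma between digits inside parentheses
x <- " ITERATION ,THETA1 ,THETA2 ,SIGMA(1,1) ,SIGMA(2,1) ,SIGMA(2,2)"
fixed <- gsub("\\(([0-9]+),([0-9]+)\\)", "(\\1.\\2)", x)
fixed
```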

Best,
Ulrik

On Tue, 13 Feb 2018, 03:38 Boris Steipe,  wrote:

> You can either use positive lookahead/lookbehind - but support for that is
> a bit flaky. Or write a proper regex, and use
> backreferences to keep what you need.
>
> R > x <- "abc 1,1 ,1 1, x,y 2,3 "
>
> R > gsub("(\\d),(\\d)", "\\1.\\2", x, perl = TRUE)
> [1] "abc 1.1 ,1 1, x,y 2.3 "
>
>
> B.
>
>
>
> > On Feb 12, 2018, at 9:34 PM, Jim Lemon  wrote:
> >
> > Hi Dennis,
> > How about:
> >
> >
> > # define the two values to search for
> > x<-2
> > y<-3
> > # create your search string and replacement string
> > repstring<-paste(x,y,sep=",")
> > newstring<-paste(x,y,sep=".")
> > # this is the string that you want to change
> > thetastring<-"SIGMA(2,3)"
> > sub(repstring,newstring,thetastring)
> > [1] "SIGMA(2.3)"
> >
> > Use gsub if you want to change multiple values
> >
> > Jim
> >
> > On Tue, Feb 13, 2018 at 1:22 PM, Dennis Fisher 
> wrote:
> >> R 3.4.2
> >> OS X
> >>
> >> Colleagues
> >>
> >> I would appreciate some help with regular expressions.
> >>
> >> I have string that looks like:
> >>" ITERATION  ,THETA1 ,THETA2
>  ,THETA3 ,THETA4 ,THETA5
>  ,THETA6 ,THETA7 ,SIGMA(1,1)
>  ,SIGMA(2,1) ,SIGMA(2,2)”
> >>
> >> In the entries that contain:
> >>(X,Y)   # for example, SIGMA(1,1)
> >> I would like to replace the comma with a period, e.g., SIGMA(1.1) but
> NOT the other commas
> >>
> >> The end-result would be:
> >>" ITERATION  ,THETA1 ,THETA2
>  ,THETA3 ,THETA4 ,THETA5
>  ,THETA6 ,THETA7 ,SIGMA(1.1)
>  ,SIGMA(2.1) ,SIGMA(2.2)”
> >>
> >> Can someone provide the regular expression code to accomplish this?
> >> Thanks.
> >>
> >> Dennis
> >>
> >> Dennis Fisher MD
> >> P < (The "P Less Than" Company)
> >> Phone / Fax: 1-866-PLessThan (1-866-753-7784)
> >> www.PLessThan.com
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Newbie wants to compare 2 huge RDSs row by row.

2018-01-28 Thread Ulrik Stervbo

The anti_join from the package dplyr might also be handy.

install.packages("dplyr")
library(dplyr)
anti_join(x1, x2)

You can get help on the different functions with ?function_name, so
?anti_join will bring you help - and examples - on the anti_join
function.

It might be worth testing your approach on a small subset of the data. That
makes it easier for you to follow what happens and evaluate the outcome.
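
A minimal sketch on the small example from Bill's mail below. anti_join returns the non-matching rows rather than their positions, so I add a row number first; the column name row is my own choice and both data frames are assumed to have the same row order.

```r
library(dplyr)

x1 <- data.frame(X = 1:5, Y = rep(c("A", "B"), c(3, 2)))
x2 <- data.frame(X = c(1, 2, -3, -4, 5), Y = rep(c("A", "B"), c(2, 3)))

# Remember the original position of every record
x1_id <- mutate(x1, row = row_number())
x2_id <- mutate(x2, row = row_number())

# Rows of x1 without an identical row at the same position in x2
mismatch <- anti_join(x1_id, x2_id, by = c("row", "X", "Y"))
mismatch$row
```

Note that this compares values exactly; small numeric differences will be reported as mismatches.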

HTH
Ulrik

Marsh Hardy ARA/RISK <mha...@ara.com> schrieb am So., 28. Jan. 2018, 04:14:

> Cool, looks like that'd do it, almost as if converting an entire record to
> a character string and comparing strings.
>
>   --  M. B. Hardy, statistician
> work: Applied Research Associates, S. E. Div.
>   8537 Six Forks Rd., # 6000 / Raleigh, NC 27615
> <https://maps.google.com/?q=8537+Six+Forks+Rd.,+%23+6000+/+Raleigh,+NC+27615=gmail=g>
> -2963
>   (919) 582-3329, fax: 582-3301
> home: 1020 W. South St. / Raleigh, NC 27603
> <https://maps.google.com/?q=1020+W.+South+St.+/+Raleigh,+NC+27603=gmail=g>
> -2162
>   (919) 834-1245
> 
> From: William Dunlap [wdun...@tibco.com]
> Sent: Saturday, January 27, 2018 4:57 PM
> To: Marsh Hardy ARA/RISK
> Cc: Ulrik Stervbo; Eric Berger; r-help@r-project.org
> Subject: Re: [R] Newbie wants to compare 2 huge RDSs row by row.
>
> If your two objects have class "data.frame" (look at class(objectName))
> and they
> both have the same number of columns and the same order of columns and the
> column types match closely enough (use all.equal(x1, x2) for that), then
> you can try
>  which( rowSums( x1 != x2 ) > 0)
> E.g.,
> > x1 <- data.frame(X=1:5, Y=rep(c("A","B"),c(3,2)))
> > x2 <- data.frame(X=c(1,2,-3,-4,5), Y=rep(c("A","B"),c(2,3)))
> > x1
>   X Y
> 1 1 A
> 2 2 A
> 3 3 A
> 4 4 B
> 5 5 B
> > x2
>X Y
> 1  1 A
> 2  2 A
> 3 -3 B
> 4 -4 B
> 5  5 B
> > which( rowSums( x1 != x2 ) > 0)
> [1] 3 4
>
> If you want to allow small numeric differences but exactly character
> matches
> you will have to get a bit fancier.  Splitting the data.frames into
> character and
> numeric parts and comparing each works well.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com<http://tibco.com>
>
> On Sat, Jan 27, 2018 at 1:18 PM, Marsh Hardy ARA/RISK <mha...@ara.com
> <mailto:mha...@ara.com>> wrote:
> Hi Guys, I apologize for my rank & utter newness at R.
>
> I used summary() and found about 95 variables, both character and numeric,
> all with "Length:368842" I assume is the # of records.
>
> I'd like to know the record number (row #?) of any record where the data
> doesn't match in the 2 files of what should be the same output.
>
> Thanks in advance, M.
>
> //
> 
> From: Ulrik Stervbo [ulrik.ster...@gmail.com ulrik.ster...@gmail.com>]
> Sent: Saturday, January 27, 2018 10:00 AM
> To: Eric Berger
> Cc: Marsh Hardy ARA/RISK; r-help@r-project.org<mailto:r-help@r-project.org
> >
> Subject: Re: [R] Newbie wants to compare 2 huge RDSs row by row.
>
> Also, it will be easier to provide helpful information if you'd describe
> what in your data you want to compare and what you hope to get out of the
> comparison.
>
> Best wishes,
> Ulrik
>
> Eric Berger <ericjber...@gmail.com<mailto:ericjber...@gmail.com> ericjber...@gmail.com<mailto:ericjber...@gmail.com>>> schrieb am Sa., 27.
> Jan. 2018, 08:18:
> Hi Marsh,
> An RDS is not a data structure such as a data.frame. It can be anything.
> For example if I want to save my objects a, b, c I could do:
> > saveRDS( list(a,b,c,), file="tmp.RDS")
> Then read them back later with
> > myList <- readRDS( "tmp.RDS" )
>
> Do you have additional information about your "RDSs" ?
>
> Eric
>
>
> On Sat, Jan 27, 2018 at 6:54 AM, Marsh Hardy ARA/RISK <mha...@ara.com
> <mailto:mha...@ara.com><mailto:mha...@ara.com<mailto:mha...@ara.com>>>
> wrote:
>
> > Each RDS is 40 MBs. What's a slick code to compare them row by row, IDing
> > row numbers with mismatches?
> >
> > Thanks in advance.
> >
> > //
> >
> > __
> > R-help@r-project.org<mailto:R-help@r-project.org> R-help@r-project.org<mailto:R-help@r-project.org>> mailing list -- To
> UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> > posting-guide.html
> > and provide commented

Re: [R] Newbie wants to compare 2 huge RDSs row by row.

2018-01-27 Thread Ulrik Stervbo

Also, it will be easier to provide helpful information if you'd describe
what in your data you want to compare and what you hope to get out of the
comparison.

Best wishes,
Ulrik

Eric Berger  schrieb am Sa., 27. Jan. 2018, 08:18:

> Hi Marsh,
> An RDS is not a data structure such as a data.frame. It can be anything.
> For example if I want to save my objects a, b, c I could do:
> > saveRDS( list(a,b,c), file="tmp.RDS")
> Then read them back later with
> > myList <- readRDS( "tmp.RDS" )
>
> Do you have additional information about your "RDSs" ?
>
> Eric
>
>
> On Sat, Jan 27, 2018 at 6:54 AM, Marsh Hardy ARA/RISK 
> wrote:
>
> > Each RDS is 40 MBs. What's a slick code to compare them row by row, IDing
> > row numbers with mismatches?
> >
> > Thanks in advance.
> >
> > //
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> > posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Newbie - Scrape Data From PDFs?

2018-01-23 Thread Ulrik Stervbo

I think I would use pdftk to extract the form data. All subsequent
manipulation in R.

HTH
Ulrik

Eric Berger  schrieb am Mi., 24. Jan. 2018, 08:11:

> Hi Scott,
> I have never done this myself but I read something recently on the
> r-help distribution that was related.
> I just did a quick search and found a few hits that might work for you.
>
> 1.
> https://medium.com/@CharlesBordet/how-to-extract-and-clean-data-from-pdf-files-in-r-da11964e252e
> 2. http://bxhorn.com/2016/extract-data-tables-from-pdf-files-in-r/
> 3.
> https://www.rdocumentation.org/packages/textreadr/versions/0.7.0/topics/read_pdf
>
> HTH,
> Eric
>
> On Wed, Jan 24, 2018 at 3:58 AM, Scott Clausen 
> wrote:
> > Hello,
> >
> > I’m new to R and am using it with RStudio to learn the language. I’m
> doing so as I have quite a lot of traffic data I would like to explore. My
> problem is that all the data is located on a number of PDFs. Can someone
> point me to info on gathering data from other sources? I’ve been to the R
> FAQ and didn’t see anything and would appreciate your thoughts.
> >
> >  I am quite sure now that often, very often, in matters concerning
> religion and politics a man's reasoning powers are not above the monkey's.
> >
> > -- Mark Twain
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] file creating

2017-12-10 Thread Ulrik Stervbo

You could loop over the file names, read each excel file and store the
individual data frames in a list using lapply.

I prefer to read excel files with the package readxl.

The code could be along the lines of

library(readxl)
my_files <- c("file1", "file2")

lapply(my_files, read_excel)
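
A slightly fuller sketch that also keeps the result and names each data frame after its file. The helper read_all() is made up for illustration. The demonstration uses small csv files so it runs without any Excel files at hand; for the files in the question the reader would be readxl::read_excel, as shown in the comment.

```r
# Read every path with `reader` and name the results after the files
read_all <- function(paths, reader) {
  out <- lapply(paths, reader)
  names(out) <- tools::file_path_sans_ext(basename(paths))
  out
}

# For the Excel files in the question this would be something like
#   my_files <- paste0("data", 1:10, ".xls")
#   my_data  <- read_all(my_files, readxl::read_excel)
#   my_data$data3   # the third data frame

# Self-contained demonstration with two small csv files
tmp_dir <- tempfile()
dir.create(tmp_dir)
demo_files <- file.path(tmp_dir, paste0("data", 1:2, ".csv"))
for (f in demo_files) write.csv(head(iris, 3), f, row.names = FALSE)

demo_data <- read_all(demo_files, read.csv)
names(demo_data)
```

Keeping the data frames in one named list is usually easier to work with than ten separate objects.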

HTH
Ulrik

Partha Sinha  schrieb am Mo., 11. Dez. 2017, 08:13:

> I am using R(3.4.3), Win 7(extreme edition) 32 bit,
> I have 10 excel files data1.xls, data2.xls .. till data10.xls.
> I want to create 10 dataframes . How to do ?
> regards
> Parth
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Scatterplot of many variables against a single variable

2017-11-27 Thread Ulrik Stervbo

ggplot and facets might be useful.
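
A minimal sketch of what I mean, using mtcars as stand-in data since I cannot see yours: mpg plays the role of the single variable, and disp, hp and wt are the "many" variables. The data are put into long format first, so that each variable gets its own facet.

```r
library(ggplot2)

vars <- c("disp", "hp", "wt")

# Long format: one row per observation and variable
long <- data.frame(
  mpg      = rep(mtcars$mpg, times = length(vars)),
  variable = rep(vars, each = nrow(mtcars)),
  value    = unlist(mtcars[vars], use.names = FALSE)
)

p <- ggplot(long, aes(x = mpg, y = value)) +
  geom_point() +
  facet_wrap(~variable, scales = "free_y")
p
```

scales = "free_y" gives every facet its own y axis, which helps when the variables are on very different scales.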

Ulrik

Ismail SEZEN  schrieb am Mo., 27. Nov. 2017, 14:06:

>
> > On 27 Nov 2017, at 13:59, Engin YILMAZ  wrote:
> >
> > Dear Berger and Jim
> >
> > Can you see my eviews example in the annex? (scattersample.jpg)
> >
> > Sincerely
> > Engin
>
> Please, use an image hosting service (i.e. https://imgbb.com/) to share
> images in the list and share the link in the email.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] fill histogram in ggplot

2017-11-07 Thread Ulrik Stervbo

Hi Elahe,

You pass 'probable' to the fill aesthetic along the lines of:

ggplot(hist) +
  aes(x = mms, fill = probable) +
  geom_histogram(binwidth = 1)

The NAs might give you three and not two colours.

I'm guessing you want distinct colours. In this case 'probable' should be a
factor and not an integer.
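
A small sketch of the conversion, using the first few values from your data. The name hist_df and the labels are my own choice.

```r
library(ggplot2)

hist_df <- data.frame(
  probable = c(1L, 0L, 1L, NA, 0L, 1L),
  mms      = c(18L, 30L, 20L, 29L, 28L, 22L)
)

# Convert to a factor so ggplot2 uses a discrete colour scale
hist_df$probable <- factor(hist_df$probable, levels = c(0, 1),
                           labels = c("not probable", "probable"))

p <- ggplot(hist_df, aes(x = mms, fill = probable)) +
  geom_histogram(binwidth = 1, colour = "black")
p
```

The NA values stay NA in the factor and are drawn in grey; filter them out first if you only want the two groups.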

HTH
Ulrik

Elahe chalabi via R-help  schrieb am Di., 7. Nov.
2017, 18:35:

> Hi all,
>
> I have the following data and I have a histogram for mms like
>
>  ggplot(hist,aes(x=hist$mms))+
>  geom_histogram(binwidth=1,fill="white",color="black")and then I want to
> fill the color of histogram by probable=1 and probable=0, could anyone help
> me in this?
>
> My data:
> structure(list(probable = c(1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L,
> 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, NA, 0L, 1L, NA, NA, 0L, 1L,
> NA, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L,
> 1L, 0L, NA, 1L, NA, NA, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L,
> 1L, NA, 0L, 0L, 1L, 0L, 1L, NA, 1L, 0L, 0L, 0L, 0L, 0L, 1L, NA,
> 0L, 1L, 0L, 1L, NA, 0L, 1L, NA, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L,
> 1L, 1L, 0L, 0L, 1L, 0L, NA, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 1L,
> 0L, NA, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, NA, 0L, NA, 1L, 0L, 1L,
> 1L, 1L, 1L, NA, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, NA,
> 1L, NA, NA, 0L, 1L, 0L, 0L, 0L, NA, 1L, NA, NA, 1L, 0L, 1L, 1L,
> 0L, 1L, 0L, NA, 1L, 1L, 1L, NA, 0L, 0L, 1L, NA, NA, 1L, 0L, 0L,
> NA, 1L, 1L, 1L, 0L, 0L, NA, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, NA,
> 0L, 0L, 0L, 0L, 0L, NA, 0L, NA, 0L, 1L, NA, 1L, 1L, 1L, 1L, 1L,
> 0L, 1L, 1L, NA, 0L, 1L, NA, 1L, 1L, 1L, 1L, 0L, 1L, NA, 1L, 0L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, NA, 1L, 1L,
> 1L, 1L, NA, NA, 1L, NA, 1L, 1L, 1L, NA, NA, NA, 1L, NA, 1L, NA,
> NA, NA, 1L, 1L, 1L, NA, NA, NA, 1L, 1L, NA, 1L, NA, 1L, 1L, 1L,
> NA, 1L, 1L, NA, NA, NA, NA, NA, 1L, 1L, 1L, 1L, NA, 1L, NA, NA,
> NA, NA, 0L, 0L, 0L, 0L, 0L, NA, 1L, 1L, 0L, 0L, 1L, NA, NA, NA,
> NA, 1L, NA, NA, 1L, NA, 1L, 1L, NA, 1L, 1L, NA, NA, 1L, 1L, NA,
> NA, 1L, NA, 1L, NA, 1L, NA, NA, NA, NA, NA, NA, 1L, NA, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, NA, NA, 1L, NA, NA, NA, 1L, NA,
> 1L, 1L, 1L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA,
> NA, 1L, 1L, NA, 1L, 1L, 1L, NA, NA, NA, 1L, 1L, NA, NA, 1L, 1L,
> 1L, 1L, NA, NA, NA, NA, 1L, 1L, 1L, 1L, NA, NA, 1L, NA, 1L, NA,
> NA, NA, NA, NA, 1L, NA, NA, NA, 1L, NA, 1L, 1L, 1L, NA, NA, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L, NA, 1L, 1L, 1L, 1L, 0L, NA,
> 1L, 1L, NA, 1L, 1L, NA, NA, NA, NA, 1L, 1L, 0L, 1L, 0L, 0L, 1L,
> 1L, NA, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, NA,
> 1L, 1L, 1L, NA, 0L, 1L, NA, 0L, 0L, 0L, 1L, 1L, NA, 1L, 1L, NA,
> 0L, 0L, 1L, NA, 0L, 0L, 1L, 0L, NA, 1L, 0L, 0L, 1L, 1L, 1L, 1L,
> 1L, NA, NA, 0L, 1L, 1L, NA, 1L, NA, NA, 0L, NA, 1L, NA, NA, 1L,
> NA, 1L, 1L, NA, 1L), mms = c(18L, 30L, 20L, 23L, 28L, 22L, 30L,
> 20L, 14L, 30L, 16L, 28L, 28L, 29L, 11L, 28L, 27L, 16L, 29L, 27L,
> 21L, 29L, 26L, 29L, 23L, 28L, 24L, 30L, 30L, 18L, 30L, 23L, 19L,
> 18L, 26L, 30L, 13L, 29L, 30L, 30L, 27L, 26L, 28L, 26L, 25L, 29L,
> 26L, 23L, 29L, 23L, 28L, 29L, 30L, 25L, 17L, 30L, 14L, 24L, 19L,
> 30L, 30L, 27L, 29L, 18L, 22L, 8L, 30L, 28L, 30L, 30L, 30L, 30L,
> 29L, 30L, 17L, 30L, 10L, 24L, 30L, 28L, 30L, 30L, 24L, 29L, 27L,
> 30L, 30L, 29L, 30L, 30L, 23L, 30L, 27L, 10L, 29L, 17L, 28L, 30L,
> 19L, 30L, 24L, 15L, 30L, 29L, 20L, 28L, 27L, 10L, 29L, 30L, 29L,
> 20L, 28L, 25L, 27L, 25L, 29L, 22L, 17L, 28L, 19L, 20L, 16L, 25L,
> 25L, 13L, 30L, 28L, 30L, 30L, 29L, 16L, 20L, 25L, 17L, 17L, 29L,
> 18L, 18L, 16L, 29L, 19L, 26L, 30L, 30L, 16L, 18L, 14L, 28L, 12L,
> 26L, 29L, 25L, 28L, 12L, 30L, 29L, 16L, 22L, 25L, 15L, 30L, 28L,
> 22L, 25L, 27L, 17L, 29L, 30L, 28L, 17L, 23L, 21L, 28L, 30L, 18L,
> 21L, 29L, 26L, 22L, 23L, 26L, 20L, 29L, 15L, 29L, 30L, 28L, 29L,
> 29L, 25L, 27L, 23L, 30L, 14L, 27L, 18L, 16L, 19L, 13L, 18L, 28L,
> 13L, 24L, 30L, 30L, 20L, 25L, 29L, 11L, 20L, 16L, 30L, 16L, 30L,
> 20L, 30L, 20L, 22L, 19L, 23L, 16L, 19L, 13L, 13L, 25L, 19L, 9L,
> 14L, 25L, 23L, 24L, 20L, 24L, 18L, 18L, 19L, 10L, 20L, 11L, 17L,
> 19L, 22L, 6L, 18L, 25L, 17L, 25L, 22L, 29L, 22L, 16L, 8L, 10L,
> 27L, 10L, 18L, 19L, 22L, 28L, 19L, 18L, 25L, 18L, 3L, 10L, 16L,
> 10L, 29L, 22L, 29L, 12L, 15L, 24L, 17L, 6L, 5L, 14L, 27L, 13L,
> 28L, 16L, 20L, 29L, 27L, 30L, 30L, 29L, 30L, 28L, 11L, 30L, 29L,
> 6L, 30L, 16L, 9L, 23L, 10L, 23L, 17L, 24L, 9L, 9L, 18L, 18L,
> 20L, 5L, 27L, 22L, 22L, 8L, 14L, 26L, 24L, 29L, 21L, 23L, 22L,
> 23L, 25L, 6L, 20L, 20L, 2L, 8L, 19L, 20L, 5L, 17L, 20L, 15L,
> 5L, 5L, 20L, 21L, 29L, 25L, 13L, 10L, 21L, 25L, 26L, 13L, 10L,
> 7L, 11L, 18L, 19L, 14L, 27L, 22L, 18L, 7L, 22L, 11L, 12L, 16L,
> 15L, 27L, 27L, 17L, 13L, 16L, 25L, 7L, 19L, 10L, 29L, 20L, 25L,
> 25L, 24L, 15L, 20L, 9L, 5L, 17L, 28L, 24L, 7L, 5L, 13L, 30L,
> 12L, 11L, 15L, 26L, 27L, 19L, 10L, 26L, 29L, 17L, 29L, 18L, 13L,
> 17L, 14L, 14L, 9L, 19L, 7L, 4L, 20L, 14L, 22L, 27L, 16L, 18L,
> 10L, 20L, 12L, 21L, 22L, 5L, 12L, 14L, 13L, 11L, 12L, 20L, 6L,
> 28L, 9L,

Re: [R] Help in R

2017-11-05 Thread Ulrik Stervbo

And

head(test_df$Movie, 10)

For function completeness :-)

Rui Barradas <ruipbarra...@sapo.pt> wrote on Sun, 5 Nov 2017, 20:56:

> Hello,
>
> Also
>
> tail(test_df$Movie, 10)
>
> Hope this helps,
>
> Rui Barradas
>
> On 05-11-2017 19:18, Ulrik Stervbo wrote:
> > R can have a bit of a learning curve... There are several ways to achieve
> > your goal - depending on what you want:
> >
> > test_df <- data.frame(Movie = letters, some.value = rnorm(26))
> >
> > test_df$Movie[1:10]
> >
> > test_df$Movie[sample(c(1:26), 10)]
> >
> > test_df[sample(c(1:26), 10), ]
> >
> > Do read a tutorial or two on R - "Introduction to R" as suggested by
> David
> > or something else - so you can explain the code above to yourself.
> >
> > HTH
> > Ulrik
> >
> > On Sun, 5 Nov 2017 at 19:38 David Winsemius <dwinsem...@comcast.net>
> wrote:
> >
> >>
> >>> On Nov 5, 2017, at 9:28 AM, Ahsan Zahir via R-help <
> r-help@r-project.org>
> >> wrote:
> >>>
> >>>
> >>> Hey,
> >>>
> >>> I am a beginner in R.
> >>>
> >>> How do I read last 10 values from column- Movie, from a dataset?
> >>
> >> Some questions are so simple that they strongly suggest no prior effort
> at
> >> self-learning. In such cases the usual recommendation given at Rhelp is
> >> that you read an introductory text. Many of us used the "Introduction
> to R"
> >> that is shipped with every copy of R:
> >>
> >> https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf
> >>
> >>
> >>>
> >>> Pls help.
> >>>
> >>> Sent from my iPhone
> >>>
> >>> __
> >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>
> >> David Winsemius
> >> Alameda, CA, USA
> >>
> >> 'Any technology distinguishable from magic is insufficiently advanced.'
> >>   -Gehm's Corollary to Clarke's Third Law
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>


Re: [R] Help in R

2017-11-05 Thread Ulrik Stervbo

R can have a bit of a learning curve... There are several ways to achieve
your goal - depending on what you want:

test_df <- data.frame(Movie = letters, some.value = rnorm(26))

test_df$Movie[1:10]

test_df$Movie[sample(c(1:26), 10)]

test_df[sample(c(1:26), 10), ]

Do read a tutorial or two on R - "Introduction to R" as suggested by David
or something else - so you can explain the code above to yourself.
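
Since the question asked for the last 10 values, here is a minimal sketch of that case with the same kind of toy data. stringsAsFactors = FALSE is an assumption added here so the result is a plain character vector in every R version:

```r
# Toy data frame; stringsAsFactors = FALSE keeps Movie as character
test_df <- data.frame(Movie = letters, some.value = rnorm(26),
                      stringsAsFactors = FALSE)

# Last 10 values of the Movie column
last_ten <- tail(test_df$Movie, 10)

# The same selection written with explicit row indices
n <- nrow(test_df)
last_ten_by_index <- test_df$Movie[(n - 9):n]
```

Both approaches give the letters "q" to "z" here; tail() is less error-prone when the data has fewer than 10 rows.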

HTH
Ulrik

On Sun, 5 Nov 2017 at 19:38 David Winsemius  wrote:

>
> > On Nov 5, 2017, at 9:28 AM, Ahsan Zahir via R-help 
> wrote:
> >
> >
> > Hey,
> >
> > I am a beginner in R.
> >
> > How do I read last 10 values from column- Movie, from a dataset?
>
> Some questions are so simple that they strongly suggest no prior effort at
> self-learning. In such cases the usual recommendation given at Rhelp is
> that you read an introductory text. Many of us used the "Introduction to R"
> that is shipped with every copy of R:
>
> https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf
>
>
> >
> > Pls help.
> >
> > Sent from my iPhone
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>
> 'Any technology distinguishable from magic is insufficiently advanced.'
>  -Gehm's Corollary to Clarke's Third Law
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Creating Tag

2017-11-01 Thread Ulrik Stervbo

Hi Hemant,

It sounds like grep from base or str_detect from the stringr package is
what you want.
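
A minimal sketch of the tagging idea with base R's grepl(). The five menu names are taken from the list in the question; the short categories vector and the column names menu_name and flag are assumptions for illustration:

```r
# A few menu items and category strings (a small illustrative subset)
menu <- data.frame(
  menu_name = c("tuna sand ww", "provolone", "12'' chicken cheese steak",
                "cookie", "roast beef"),
  stringsAsFactors = FALSE
)
categories <- c("chick", "tuna", "beef", "steak")

# Start with every item untagged, then only test items still flagged 0
menu$flag <- 0L
for (category in categories) {
  # fixed = TRUE treats the category as a literal string, not a regex
  hit <- menu$flag == 0L & grepl(category, menu$menu_name, fixed = TRUE)
  menu$flag[hit] <- 1L
}
```

Very short category strings such as "ch" would match many unrelated items, so the category list probably needs some care before it is used this way.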

Best,
Ulrik

Hemant Sain wrote on Wed, 1 Nov 2017, 08:31:

> i want to tag categories to its menuname.
> i have a csv containing menu item name and in other csv i have a column
> containing some strings,
> i want to pick that strings from categories and look into  menu items if
> any menu item containing that string i want to create a new column next to
> menu item name flagged as 1 otherwise 0
> and the only condition is once a menu item flagged as 1 i don't need to
> consider that menu item again to tag further in case of redundant strings
> in categories only i want to search which are flagged as 0.
> please help me with the R script.
>
> *Menu Name*
> 9\ bobbie"
> 9\ chz steak"
> 9\ tuna"
> provolone
> 20\ bobbie"
> bottled soda 20oz
> cran-slam ww
> american
> small chips
> medium drink
> 9\ meatball"
> capriotti's water
> 20'' chicken cheese steak
> 9\ veg turkey"
> medium chips
> 9\ capastrami"
> 12\ bobbie"
> 12'' chicken cheese steak
> cookie
> 12\ chz steak"
> 9\ cole turkey"
> kid grilled cheese white
> 12\ italian"
> 12\ meatball"
> 12\ capastrami"
> turkey sand w
> 20\ slaw be jo"
> swiss
> 12\ cole turkey"
> large drink
> 9\ ham"
> 9'' chicken cheese steak
> 9\ slaw be jo"
> turkey sand ww
> stuffing
> 12\ turkey"
> 9\ italian"
> 12\ slaw be jo"
> 9\ grld italian"
> 12\ veg burger w/chz"
> extra american
> black salad
> 9\ turkey"
> 20\ turkey"
> 20\ capastrami"
> ham sand w
> 12\ mushroom"
> 12\ grld italia"
> italian salad
> tuna sand ww
> 9\ roast beef"
> 20\ chz steak"
> 20\ mushroom"
> 9\ veg chzstk"
> ham
> genoa
> 12\ veg turkey"
> 12\ veggie cole turkey"
> 9\ mushroom"
> cap's creation
> mushrooms
> salad chicken
> 20\ cole turkey"
> 1 pack chicken
> kr  veg burger w/chz
> 12\ roast beef"
> kid turkey n cheese white
> 20\ italian"
> 12\ ham"
> 9\ employee sub"
> roast beef kr
> 9\ veggie cole turkey"
> 12\ sausage"
> tea
> turkey sand kr
> salad turkey
> tuna sand kr
> brownie
> slice american cheese
> 1 oz pastrami
> 9\ cheese"
> 12\ italian up"
> 12\ capastrami up"
> 1 pack steak
> delaware's finest small
> the sampler sm
> side ranch dressing
> 12\ veg turkey up"
> 20\ roast beef"
> roast beef w
> 1oz turkey
> 12\ tuna"
> 20\ veg turkey"
> 12\ veg chzstk"
> 9\ sausage"
> kid ham n cheese white
> side italian dressing
> salad provolone
> 20\ grld italia"
> sample item 6
> sample item 9
> turkey
> 12\ slaw be jo up"
> 12\ meatball up"
> 1 oz roast beef
> ham sand ww
> delaware's finest large
> side cole slaw large
> large chips
> 20\ meatball"
> 12'' chick cheese stk up
> 12\ chz steak up"
> 12\ grld italia up"
> cran-slam w
> 12\ bobbie up"
> 20\ cheese"
> slice provolone
> the sampler lg
> meatball bar
> slice ham
> wise large chips
> small side
> sm soup
> 12\ tuna up"
> 12\ cole turkey up"
> prosciutini
> 20\ veggie cole turkey"
> soup
> roast beef
> 20\ italian up"
> 20'' chick cheese stk up
> 20\ chz steak up"
> 20\ bobbie up"
> 20\ veg turkey up"
> slice swiss
> 20\ capastrami up"
> sample item 7
> 12\ ham up"
> salad swiss
> 12\ veggie cheese stk up"
> california omelet
> orange juice
> exteme bac boy
> dinner salad
> chef salad
> 12\ turkey up"
> the big cheese
> combo it
> fries cmb
> sm pepsi
> km cheese burger
> #NAME?
> #NAME?
> kid grilled cheese wheat
> kids ham cheese white box
> calif. blt
> bacon turkey melt
> coffee
> 1-pc pancake
> fries
> tuna sand
> biscuits & gravy
> s-1/3 patty
> x avocado
> x chez
> s-bacon
> #NAME?
> xtra egg
> lg bev upcharge
> sm ice tea
> small soup
> roast beef ww
> salad tuna
> med pepsi
> 20\ veg chzstk up"
> day nm egg san
> s-chkn brest
> bell pepper
> fruit
> 1 slice veggie turkey
> s-toast
> x sausage
> 1 pack sausage
> chicken salad
> lg pepsi
> x dressing
> large side
> 9\ firecracker turkey"
> 20\ sausage up"
> 20\ turkey up"
> 20\ veg chzstk"
> lg ice tea
> 12\ roast beef up"
> sample item 8
> catering cap creation sal
> the turkey lover sm
> little italy sm
> cookie tray
> sd chk avo san
> sm fry / 2 pc zucch
> kid turkey n cheese wheat
> junior chz burger
> extra italian meat
> chili chz fries
> sm fry / 2 pc o ring
> extra turkey
>
>
>
>
>
>
>
>
> *CATEGORIES*
> non-veg
> chix
> salmon
> ch
> chkn
> brisket
> brskt
> bacon
> bcn
> chse
> mahi
> dog
> shk
> clam
> parmesan
> asiago
> prosciutto
> prosciutti
> salami
> Angus
> hicken
> chk
> chick
> wings
> prk
> pork
> ham
> bacon
> ribs
> fish
> shrimp
> tuna
> beef
> steak
> stk
> meatball
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regular expression help

2017-10-09 Thread Ulrik Stervbo

Hi Duncan,

Why not split on / and take the correct elements? It is not as elegant as
a regex, but it could do the trick.
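
A minimal sketch of the splitting approach in base R. The function name extract_third is hypothetical; it keeps the third "/"-separated field of each word and returns an empty string for words with fewer than three fields:

```r
extract_third <- function(line) {
  # Split the line into words on single spaces
  words <- strsplit(line, " ", fixed = TRUE)[[1]]
  # Split every word on "/" and keep the third piece, if there is one
  pieces <- strsplit(words, "/", fixed = TRUE)
  third <- vapply(
    pieces,
    function(p) if (length(p) >= 3) p[3] else "",
    character(1)
  )
  # Re-join with spaces so the word positions are preserved
  paste(third, collapse = " ")
}

extract_third("f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587")
extract_third("f 1067 28680 24462")
```

For the two examples in the question this gives " 587 587 587 587" and a string of three spaces.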

Best,
Ulrik

On Mon, 9 Oct 2017 at 17:03 Duncan Murdoch  wrote:

> I have a file containing "words" like
>
>
> a
>
> a/b
>
> a/b/c
>
> where there may be multiple words on a line (separated by spaces).  The
> a, b, and c strings can contain non-space, non-slash characters. I'd
> like to use gsub() to extract the c strings (which should be empty if
> there are none).
>
> A real example is
>
> "f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587"
>
> which I'd like to transform to
>
> " 587 587 587 587"
>
> Another real example is
>
> "f 1067 28680 24462"
>
> which should transform to "   ".
>
> I've tried a few different regexprs, but am unable to find a way to say
> "transform words by deleting everything up to and including the 2nd
> slash" when there might be zero, one or two slashes.  Any suggestions?
>
> Duncan Murdoch
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] example of geom_contour() with function argument

2017-10-09 Thread Ulrik Stervbo

Hi BFD,

?geom_contour() *does* have helpful examples. Your Google-fu is weak:
Searching for geom_contour brought me
http://ggplot2.tidyverse.org/reference/geom_contour.html as the first
result.
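
A minimal sketch of contouring a function with geom_contour(): evaluate the function on a grid and map the result to z. The density used here (a product of two independent t densities with 13 degrees of freedom) is an assumption that only needs base R; a true multivariate t density such as dmvt from mvtnorm could be evaluated on the same grid instead:

```r
library(ggplot2)

# Assumed two-dimensional density: product of independent t densities
dens_fun <- function(x, y) dt(x, df = 13) * dt(y, df = 13)

# Evaluate the density on a regular grid covering the plotting region
grid <- expand.grid(
  X1 = seq(-4, 4, length.out = 101),
  X2 = seq(-4, 4, length.out = 101)
)
grid$density <- dens_fun(grid$X1, grid$X2)

# Toy scatter data to draw underneath the contours
set.seed(1234)
xx <- data.frame(X1 = rt(100, df = 13), X2 = rt(100, df = 13))

p <- ggplot(xx, aes(x = X1, y = X2)) +
  geom_point() +
  geom_contour(data = grid, aes(z = density))
```

The contour layer inherits x and y from the plot and only adds z, so the grid columns must use the same names as the scatter data.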

HTH
Ulrik

On Mon, 9 Oct 2017 at 08:04 Big Floppy Dog  wrote:

> Can someone please point me to an example with geom_contour() that uses a
> function? The help does not have an example of a function, and also  I did
> not find anything from online searches.
>
> TIA,
> BFD
>
>
>
> ---
>
> How about geom_contour()?
>
> On Sun, 8 Oct 2017, 20:52 Ranjan Maitra wrote:
>
> > Hi,
> >
> > I am no expert on ggplot2 and I do not know the answer to your question.
> I
> > looked around a bit but could not find an answer right away. But one
> > possibility could be, if a direct approach is not possible, to draw
> > ellipses corresponding to the confidence regions of the multivariate t
> > density and use geom_polygon to draw this successively?
> >
> > I will wait for a couple of days to see if there is a better answer
> posted
> > and then write some code, unless you get to it first.
> >
> > Thanks,
> > Ranjan
> >
> >
> > On Sun, 8 Oct 2017 09:30:30 -0500 Big Floppy Dog  >
> > wrote:
> >
> > > Note: I have posted this on SO also but while the question has been
> > > upvoted, there has been no answer yet.
> > >
> > >
> >
>
> https://stackoverflow.com/questions/46622243/ggplot-plot-2d-probability-density-function-on-top-of-points-on-ggplot
> > >
> > > Apologies for those who have seen it there also but I thought that this
> > > list of experts may have someone who knows the answer.
> > >
> > > I have the following example code:
> > >
> > >
> > >
> > > require(mvtnorm)
> > > require(ggplot2)
> > > set.seed(1234)
> > > xx <- data.frame(rmvt(100, df = c(13, 13)))
> > > ggplot(data = xx,  aes(x = X1, y= X2)) + geom_point() +
> geom_density2d()
> > >
> > >
> > >
> > > It yields a scatterplot of X2 against X1 and a KDE contour plot of the
> > > density (as it should).
> > >
> > > My question is: is it possible to change the contour plot to display
> > > the contours
> > >
> > > of a two-dimensional density function (say dmvt), using ggplot2?
> > >
> > > The remaining figures in my document are in ggplot2 and therefore I
> > > am looking for a ggplot2 solution.
> > >
> > > Thanks in advance!
> > >
> > > BFD
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] how to overlay 2d pdf atop scatter plot using ggplot2

2017-10-08 Thread Ulrik Stervbo

How about geom_contour()?

On Sun, 8 Oct 2017, 20:52 Ranjan Maitra wrote:

> Hi,
>
> I am no expert on ggplot2 and I do not know the answer to your question. I
> looked around a bit but could not find an answer right away. But one
> possibility could be, if a direct approach is not possible, to draw
> ellipses corresponding to the confidence regions of the multivariate t
> density and use geom_polygon to draw this successively?
>
> I will wait for a couple of days to see if there is a better answer posted
> and then write some code, unless you get to it first.
>
> Thanks,
> Ranjan
>
>
> On Sun, 8 Oct 2017 09:30:30 -0500 Big Floppy Dog 
> wrote:
>
> > Note: I have posted this on SO also but while the question has been
> > upvoted, there has been no answer yet.
> >
> >
> https://stackoverflow.com/questions/46622243/ggplot-plot-2d-probability-density-function-on-top-of-points-on-ggplot
> >
> > Apologies for those who have seen it there also but I thought that this
> > list of experts may have someone who knows the answer.
> >
> > I have the following example code:
> >
> >
> >
> > require(mvtnorm)
> > require(ggplot2)
> > set.seed(1234)
> > xx <- data.frame(rmvt(100, df = c(13, 13)))
> > ggplot(data = xx,  aes(x = X1, y= X2)) + geom_point() + geom_density2d()
> >
> >
> >
> > It yields a scatterplot of X2 against X1 and a KDE contour plot of the
> > density (as it should).
> >
> > My question is: is it possible to change the contour plot to display
> > the contours
> >
> > of a two-dimensional density function (say dmvt), using ggplot2?
> >
> > The remaining figures in my document are in ggplot2 and therefore I
> > am looking for a ggplot2 solution.
> >
> > Thanks in advance!
> >
> > BFD
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
> --
> Important Notice: This mailbox is ignored: e-mails are set to be deleted
> on receipt. Please respond to the mailing list if appropriate. For those
> needing to send personal or professional e-mail, please use appropriate
> addresses.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] rename multiple files by file.rename or other functions

2017-09-28 Thread Ulrik Stervbo

Hi John,

I don't know how to do this with R, but on Linux I'd use rename (or maybe
even by hand if it's a one time event). On Windows I believe there is a
tool called Bulk Rename.
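
For what it is worth, base R's file.rename() accepts vectors of old and new names, so a sketch along these lines might work. The demo runs in a temporary directory with empty placeholder files; demo_dir and the regular expression are assumptions based on the file names in the question:

```r
# Create a scratch directory with 50 empty files named like the originals
demo_dir <- tempfile("rename_demo")
dir.create(demo_dir)
invisible(file.create(
  file.path(demo_dir, sprintf("XYZW%02dGenesis_ABC.mp3", 1:50))
))

# Find the files and build the new names, keeping the two-digit number
old_names <- list.files(demo_dir, pattern = "^XYZW[0-9]{2}Genesis_ABC\\.mp3$")
new_names <- sub("^XYZW([0-9]{2})Genesis_ABC\\.mp3$", "01Gen\\1.mp3", old_names)

# file.rename() is vectorised and returns TRUE for each successful rename
renamed <- file.rename(
  file.path(demo_dir, old_names),
  file.path(demo_dir, new_names)
)
```

On real files it is probably wise to inspect old_names and new_names side by side before calling file.rename(), since the rename cannot be undone automatically.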

HTH
Ulrik

On Thu, 28 Sep 2017 at 11:37 John  wrote:

> Hi,
>
>I have 50 files whose names are
>
> XYZW01Genesis_ABC.mp3
> XYZW02Genesis_ABC.mp3
> ...
> XYZW50Genesis_ABC.mp3
>
>As you can tell, the only difference across the files are 01, 02,
> 03,50.
>
>I would like to rename them to
> 01Gen01.mp3
> 01Gen02.mp3
> ...
> 01Gen50.mp3
>
>   If I store them in one folder and write an R code in that folder, how can
> it be done?
>
>Thanks,
>
> John
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] disturbed legend in ggplot2

2017-09-27 Thread Ulrik Stervbo

Hi Troels,

Try to move the size argument out of the aesthetic.
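
A minimal sketch of the change, assuming ggplot2 is installed. The data frame toy has the same columns as DF1 in the question, but its values are made up for illustration:

```r
library(ggplot2)

# Toy data shaped like DF1 (values are illustrative assumptions)
toy <- data.frame(
  P   = rep(seq(0, 0.02, length.out = 10), times = 2),
  BC  = c(seq(0.06, 0.01, length.out = 10), seq(0.03, 0.005, length.out = 10)),
  pH  = c(seq(12.4, 6.3, length.out = 10), seq(12.2, 2.3, length.out = 10)),
  SID = rep(c(25, 15), each = 10)
)

# size is a fixed value, so it is set outside aes() and gets no legend entry.
# Newer ggplot2 releases (3.4.0 and later) prefer linewidth = 3 for lines.
v <- ggplot(toy, aes(x = P, y = BC, group = SID, colour = pH)) +
  geom_line(size = 3)
```

Anything inside aes() is treated as a variable to be mapped and therefore gets a legend; a constant placed there shows up as the stray "3" entry.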

Best wishes,
Ulrik

On Wed, 27 Sep 2017, 08:51 Troels Ring wrote:

> Dear friends - below is a subset of a much larger material showing two
> ways of generating two "lines". The intention is to have the colour
> reflect a variable, pH, but the legend is disturbed. The little part
> marked "3" above the colour scale is unwelcome. Why did it appear? How
> could I avoid it?
>
> I'm on Windows 7, R version 3.4.1 (2017-06-30) -- "Single Candle"
>
> All best wishes
>
> Troels Ring
> Aalborg, Denmark
>
> library(ggplot2)
> DF1 <-
> structure(list(P = c(0, 0.00222, 0.00444,
> 0.00667, 0.00889, 0.0111,
> 0.0133, 0.0156, 0.0178, 0.02,
> 0, 0.00222, 0.00444, 0.00667,
> 0.00889, 0.0111, 0.0133,
> 0.0156, 0.0178, 0.02), pH = c(12.3979595548431,
> 12.3129161148613, 12.2070984076445, 12.0669463736967, 11.8586790792785,
> 11.4437319273717, 7.64497330556925, 6.98905682614207, 6.63520883742788,
> 6.3229313658492, 12.176061323132, 12.0234712172719, 11.7861230637902,
> 11.2219147985144, 7.14240749824074, 6.53119941380901, 5.95522932117427,
> 3.25184520894594, 2.55614400932465, 2.30097494287507), BC =
> c(0.0576574111386315,
> 0.047331331067055, 0.037206026657832, 0.0268607893098731,
> 0.0166183791472022,
> 0.00639593998967551, 0.0033597279094, 0.00854377959176608,
> 0.00987464693654883, 0.00863636089604445, 0.0343718830720469,
> 0.0242985554593397, 0.0140710602077036, 0.00383913993097999,
> 0.00439784065436743, 0.00582135949288444, 0.00336240952299985,
> 0.00129948001017736, 0.00640073762860721, 0.0115158433720248),
>  SID = c(25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 15, 15, 15,
>  15, 15, 15, 15, 15, 15, 15)), .Names = c("P", "pH", "BC",
> "SID"), row.names = c(NA, -20L), class = "data.frame")
>
> df1 <- subset(DF1,SID==25)
> df2 <- subset(DF1,SID==15)
> v <- ggplot()
> v <- v + geom_line(data=df1, aes(x=P, y=BC,col=pH,size=3))
> v1 <- v + geom_line(data=df2, aes(x=P, y=BC,col=pH,size=3))
>
> v <- ggplot()
> v <- v + geom_line(data=DF1, aes(x=P, y=BC,group=SID,col=pH,size=3))
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Load R data files

2017-09-12 Thread Ulrik Stervbo

The object you load has the same name as the object you saved. In this case
that is datahs0csv, and not the name of the file sans .rda.
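
A minimal sketch of this behaviour, using a small made-up data frame and a file in the temporary directory (both are assumptions for illustration):

```r
# A stand-in for the real data
datahs0csv <- data.frame(id = 1:3, score = c(10, 20, 30))

# Save under a file name that differs from the object name
rda_file <- file.path(tempdir(), "datahs0csv2.rda")
save(datahs0csv, file = rda_file)
rm(datahs0csv)

# load() restores the object under its original name and
# returns the names of the restored objects
loaded <- load(rda_file)
loaded
exists("datahs0csv")
exists("datahs0csv2")
```

Printing the value returned by load() is a quick way to see what a given .rda file actually contains.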

On Tue, 12 Sep 2017, 21:26 AbouEl-Makarim Aboueissa <
abouelmakarim1...@gmail.com> wrote:

> Dear All:
>
>
> It was saved, but there was a space somewhere. So it works for me now.
>
> I do have another similar problem.
>
> I saved an R data file
>
>
> save(datahs0csv,file="
> F:\Fall_2017\5-STA574\2-Notes\1-R\1-R_new\chapter4-Entering_Data/
> datahs0csv2.rda")
>
> *The new R data file "*datahs0csv2.rda*" is in the directory.*
>
> I tried to load the file "" to R, but I got an error message. Please see
> below.
>
> >
> *load(file="F:/Fall_2017/5-STA574/2-Notes/1-R/1-R_new/chapter4-Entering_Data/datahs0csv2.rda")*
> >
> It seems for me that the file was loaded to R. But when I typed the data
> name, it says that the not found.
>
> > *datahs0csv2*
>
> *Error: object 'datahs0csv2' not found*
>
>
> with many thanks
> abou
>
> On Tue, Sep 12, 2017 at 2:53 PM, Ulrik Stervbo <ulrik.ster...@gmail.com>
> wrote:
>
>> Hi Abou,
>>
>> You haven't saved the datahs0csv.
>>
>> When you are done manipulating datahs0csv you can use save(datahs0csv,
>> file = 'datahs0csv.rda'). Then you should be able to load the data.
>> HTH
>> Ulrik
>>
>> On Tue, 12 Sep 2017, 20:46 AbouEl-Makarim Aboueissa <
>> abouelmakarim1...@gmail.com> wrote:
>>
>>> Dear All:
>>>
>>> I am trying to load an R data set, but I got the following message.
>>> Please
>>> see below. The file is there.
>>>
>>> setwd("F:/Fall_2017/5-STA574/2-Notes/1-R/1-R_new/chapter4-Entering_Data")
>>>
>>> datahs0csv <- read.table("hs0.csv", header=T, sep=",")
>>> attach(datahs0csv)
>>>
>>> detach(datahs0csv)
>>> rm(list=ls())
>>>
>>> Then I tried to reload the data, but I got this error message. I am not
>>> sure what was wrong.
>>>
>>> *> load("datahs0csv.rda")*
>>>
>>> Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
>>> In addition: Warning message:
>>> In readChar(con, 5L, useBytes = TRUE) :
>>>   cannot open compressed file 'datahs0csv.rda', probable reason 'No such
>>> file or directory'
>>>
>>>
>>> Any help will be appreciated.
>>>
>>>
>>> with thanks
>>> abou
>>>
>>> __
>>> AbouEl-Makarim Aboueissa, PhD
>>> Professor of Statistics
>>> Department of Mathematics and Statistics
>>> University of Southern Maine
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>
>
> --
> __
> AbouEl-Makarim Aboueissa, PhD
> Department of Mathematics and Statistics
> University of Southern Maine
>


Re: [R] Load R data files

2017-09-12 Thread Ulrik Stervbo

Hi Abou,

You haven't saved the datahs0csv.

When you are done manipulating datahs0csv you can use save(datahs0csv, file
= 'datahs0csv.rda'). Then you should be able to load the data.
HTH
Ulrik

On Tue, 12 Sep 2017, 20:46 AbouEl-Makarim Aboueissa <
abouelmakarim1...@gmail.com> wrote:

> Dear All:
>
> I am trying to load an R data set, but I got the following message. Please
> see below. The file is there.
>
> setwd("F:/Fall_2017/5-STA574/2-Notes/1-R/1-R_new/chapter4-Entering_Data")
>
> datahs0csv <- read.table("hs0.csv", header=T, sep=",")
> attach(datahs0csv)
>
> detach(datahs0csv)
> rm(list=ls())
>
> Then I tried to reload the data, but I got this error message. I am not
> sure what was wrong.
>
> *> load("datahs0csv.rda")*
>
> Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
> In addition: Warning message:
> In readChar(con, 5L, useBytes = TRUE) :
>   cannot open compressed file 'datahs0csv.rda', probable reason 'No such
> file or directory'
>
>
> Any help will be appreciated.
>
>
> with thanks
> abou
>
> __
> AbouEl-Makarim Aboueissa, PhD
> Professor of Statistics
> Department of Mathematics and Statistics
> University of Southern Maine
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Nested loop R code

2017-09-08 Thread Ulrik Stervbo

Hi Hemant,

Please write to the r-help list in the future.

Look at the cut() function to solve your problem.

Also, you have a problem in your example - 5 is placed in two different
categories.
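
A minimal sketch with cut(), using the values from your example. The break points and labels are assumptions based on the categories you listed; with these breaks the value 5 always lands in "0-5":

```r
value <- c(1, 3, 5, 9, 7, 5, 4, 11, 12, 18, 23)

# Intervals are closed on the right: [0,5], (5,10], (10,15], (15,20], (20,25]
new_var <- cut(
  value,
  breaks = c(0, 5, 10, 15, 20, 25),
  labels = c("0-5", "6-10", "11-15", "16-20", "21-25"),
  include.lowest = TRUE
)

data.frame(Value = value, New_Var = new_var)
```

include.lowest = TRUE is needed so that a value of exactly 0 is not turned into NA.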

HTH
Ulrik

On Fri, 8 Sep 2017 at 12:16 Hemant Sain  wrote:

> i have a vector containing values ranging from 0 to 24
> i want to create another variable which can categorize those values  like
> this
> please help me with an R code
>
> Thanks
>
> Value   New_Var
> 1       0-5
> 3       0-5
> 5       0-5
> 9       6-10
> 7       6-10
> 5       6-10
> 4       0-5
> 11      11-15
> 12      11-15
> 18      16-20
> 23      21-25
>
>
> --
> hemantsain.com
>


Re: [R] Dataframe Manipulation

2017-09-05 Thread Ulrik Stervbo

Hi Hemant,

data_help <- data_help %>%
  # Add a dummy index for each purchase to keep a record of the purchase,
  # since it would otherwise disappear later on. You could also use the
  # row number.
  mutate(Purchase_ID = 1:n()) %>%
  # For each purchase id
  group_by(Purchase_ID) %>%
  # Call the split_items function, which returns a data.frame
  do(split_items(.))

cat_help %>%
  # Make the data.frame long: the column names are gathered in a dummy
  # column (Foo) and the items (the content of each column) in another
  # column called Item
  gather("Foo", "Item") %>%
  filter(!is.na(Item)) %>%
  left_join(data_help, by = "Item") %>%
  group_by(Foo, Purchase_ID) %>%
  # Combine the items for each purchase and item type and make a wide
  # data.frame
  summarise(Item = paste(Item, collapse = ", ")) %>%
  spread(key = "Foo", value = "Item")

I suggest that you read the book [R for Data Science](http://r4ds.had.co.nz/)
by Garrett Grolemund and Hadley Wickham

Best wishes,
Ulrik

On Mon, 4 Sep 2017, 09:31 Hemant Sain <hemantsai...@gmail.com> wrote:

> Hello Ulrik,
> Can you please explain this code means how and what this code is doing
> because I'm not able to understand it, if you can explain it i can use it
> in future by doing some Lil bit manipulation.
>
> Thanks
>
>
> data_help <-
>   data_help %>%
>   mutate(Purchase_ID = 1:n()) %>%
>   group_by(Purchase_ID) %>%
> do(split_items(.))
>
> cat_help %>% gather("Foo", "Item") %>%
>   filter(!is.na(Item)) %>%
> left_join(data_help, by = "Item") %>%
>   group_by(Foo, Purchase_ID) %>%
>   summarise(Item = paste(Item, collapse = ", ")) %>%
>   spread(key = "Foo", value = "Item")
>
> On 31 August 2017 at 13:17, Ulrik Stervbo <ulrik.ster...@gmail.com> wrote:
>
>> Hi Hemant,
>>
>> the solution is really quite similar, and the logic is identical:
>>
>> library(readr)
>> library(dplyr)
>> library(stringr)
>> library(tidyr)
>>
>> data_help <- read_csv("data_help.csv")
>> cat_help <- read_csv("cat_help.csv")
>>
>> # Helper function to split the Items and create a data_frame
>> split_items <- function(items){
>>   x <- items$Items_purchased_on_Receipts %>%
>> str_split(pattern = ",") %>%
>> unlist(use.names = FALSE)
>>
>>   data_frame(Item = x, Purchase_ID = items$Purchase_ID)
>> }
>>
>> data_help <-
>>   data_help %>%
>>   mutate(Purchase_ID = 1:n()) %>%
>>   group_by(Purchase_ID) %>%
>> do(split_items(.))
>>
>> cat_help %>% gather("Foo", "Item") %>%
>>   filter(!is.na(Item)) %>%
>> left_join(data_help, by = "Item") %>%
>>   group_by(Foo, Purchase_ID) %>%
>>   summarise(Item = paste(Item, collapse = ", ")) %>%
>>   spread(key = "Foo", value = "Item")
>>
>> HTH
>> Ulrik
>>
>> On Wed, 30 Aug 2017 at 13:22 Hemant Sain <hemantsai...@gmail.com> wrote:
>>
>>> by using these two tables we have to create third table in this format
>>> where categories will be on the top and transaction will be in the rows,
>>>
>>> On 30 August 2017 at 16:42, Hemant Sain <hemantsai...@gmail.com> wrote:
>>>
>>>> Hello Ulrik,
>>>> Can you please once check this code again on the following data set
>>>> because it doesn't giving same output to me due to absence of quantity,a
>>>> compare to previous demo data set becaue spiting is getting done on the
>>>> basis of quantity and in real data set quantity is missing. so please use
>>>> following data set and help me out please consider this mail is my final
>>>> email i won't bother you again but its about my job please help me
>>>> .
>>>>
>>>> Note* the file I'm attaching is very confidential
>>>>
>>>> On 30 August 2017 at 15:02, Ulrik Stervbo <ulrik.ster...@gmail.com>
>>>>  wrote:
>>>>
>>>>> Hi Hemant,
>>>>>
>>>>> Does this help you along?
>>>>>
>>>>> table_1 <- textConnection("Item_1;Item_2;Item_3
>>>>> 1KG banana;300ML milk;1kg sugar
>>>>> 2Large Corona_Beer;2pack Fries;
>>>>> 2 Lux_Soap;1kg sugar;")
>>>>>
>>>>> table_1 <- read.csv(table_1, sep = ";", na.strings = "",
>>>>> stringsAsFactors = FALSE, check.names = FALSE)
>>>

Re: [R] Dataframe Manipulation

2017-08-31 Thread Ulrik Stervbo

Hi Hemant,

the solution is really quite similar, and the logic is identical:

library(readr)
library(dplyr)
library(stringr)
library(tidyr)

data_help <- read_csv("data_help.csv")
cat_help <- read_csv("cat_help.csv")

# Helper function to split the Items and create a data_frame
split_items <- function(items){
  x <- items$Items_purchased_on_Receipts %>%
    str_split(pattern = ",") %>%
    unlist(use.names = FALSE)

  data_frame(Item = x, Purchase_ID = items$Purchase_ID)
}

data_help <-
  data_help %>%
  mutate(Purchase_ID = 1:n()) %>%
  group_by(Purchase_ID) %>%
  do(split_items(.))

cat_help %>% gather("Foo", "Item") %>%
  filter(!is.na(Item)) %>%
  left_join(data_help, by = "Item") %>%
  group_by(Foo, Purchase_ID) %>%
  summarise(Item = paste(Item, collapse = ", ")) %>%
  spread(key = "Foo", value = "Item")

HTH
Ulrik

On Wed, 30 Aug 2017 at 13:22 Hemant Sain <hemantsai...@gmail.com> wrote:

> by using these two tables we have to create third table in this format
> where categories will be on the top and transaction will be in the rows,
>
> On 30 August 2017 at 16:42, Hemant Sain <hemantsai...@gmail.com> wrote:
>
>> Hello Ulrik,
>> Can you please once check this code again on the following data set
>> because it doesn't giving same output to me due to absence of quantity,a
>> compare to previous demo data set becaue spiting is getting done on the
>> basis of quantity and in real data set quantity is missing. so please use
>> following data set and help me out please consider this mail is my final
>> email i won't bother you again but its about my job please help me
>> .
>>
>> Note* the file I'm attaching is very confidential
>>
>> On 30 August 2017 at 15:02, Ulrik Stervbo <ulrik.ster...@gmail.com>
>> wrote:
>>
>>> Hi Hemant,
>>>
>>> Does this help you along?
>>>
>>> table_1 <- textConnection("Item_1;Item_2;Item_3
>>> 1KG banana;300ML milk;1kg sugar
>>> 2Large Corona_Beer;2pack Fries;
>>> 2 Lux_Soap;1kg sugar;")
>>>
>>> table_1 <- read.csv(table_1, sep = ";", na.strings = "",
>>> stringsAsFactors = FALSE, check.names = FALSE)
>>>
>>> table_2 <-
>>> textConnection("Toiletries;Fruits;Beverages;Snacks;Vegetables;Clothings;Dairy
>>> Products
>>> Soap;banana;Corona_Beer;King Burger;Pumpkin;Adidas Sport Tshirt XL;milk
>>> Shampoo;Mango;Red Label Whisky;Fries;Potato;Nike Shorts Black L;Butter
>>> Showergel;Oranges;grey Cocktail;cheese pizza;Tomato;Puma Jersy red
>>> M;sugar
>>> Lux_Soap;;2 Large corona Beer;;Cheese;Toothpaste")
>>>
>>> table_2 <- read.csv(table_2, sep = ";", na.strings = "",
>>> stringsAsFactors = FALSE, check.names = FALSE)
>>>
>>> library(tidyr)
>>> library(dplyr)
>>>
>>> table_2 <- gather(table_2, "Category", "Item")
>>>
>>> table_1 <- gather(table_1, "Foo", "Item") %>%
>>>   filter(!is.na(Item))
>>>
>>> table_1 <- separate(table_1, col = "Item", into = c("Quantity", "Item"),
>>> sep = " ")
>>>
>>> table_3 <- left_join(table_1, table_2, by = "Item") %>%
>>>   mutate(Item = paste(Quantity, Item)) %>%
>>>   select(-Quantity)
>>>
>>> table_3 %>%
>>>   group_by(Foo, Category) %>%
>>>   summarise(Item = paste(Item, collapse = ", ")) %>%
>>>   spread(key = "Category", value = "Item")
>>>
>>> You need to figure out how to handle words written with different cases
>>> and how to get the quantity in an universal way. For the code above, I
>>> corrected these things by hand in the example data.
>>>
>>> HTH
>>> Ulrik
>>>
>>> On Wed, 30 Aug 2017 at 10:16 Hemant Sain <hemantsai...@gmail.com> wrote:
>>>
>>>> Hey PIKAL,
>>>> It's not a homework neithe that is the real dataset i have signer NDA
>>>> for
>>>> my company so that i can share the original data file, Actually I'm
>>>> working
>>>> on a market basket analysis task but not able to convert my existing
>>>> data
>>>> table to appropriate format so that i can apply Apriori algorithm using
>>>> R,
>>>> and this is very important me to get it done because I'm an intern and

Re: [R] Dataframe Manipulation

2017-08-30 Thread Ulrik Stervbo

Hi Hemant,

Does this help you along?

table_1 <- textConnection("Item_1;Item_2;Item_3
1KG banana;300ML milk;1kg sugar
2Large Corona_Beer;2pack Fries;
2 Lux_Soap;1kg sugar;")

table_1 <- read.csv(table_1, sep = ";", na.strings = "", stringsAsFactors =
FALSE, check.names = FALSE)

table_2 <- textConnection(
"Toiletries;Fruits;Beverages;Snacks;Vegetables;Clothings;Dairy Products
Soap;banana;Corona_Beer;King Burger;Pumpkin;Adidas Sport Tshirt XL;milk
Shampoo;Mango;Red Label Whisky;Fries;Potato;Nike Shorts Black L;Butter
Showergel;Oranges;grey Cocktail;cheese pizza;Tomato;Puma Jersy red M;sugar
Lux_Soap;;2 Large corona Beer;;Cheese;Toothpaste")

table_2 <- read.csv(table_2, sep = ";", na.strings = "", stringsAsFactors =
FALSE, check.names = FALSE)

library(tidyr)
library(dplyr)

table_2 <- gather(table_2, "Category", "Item")

table_1 <- gather(table_1, "Foo", "Item") %>%
  filter(!is.na(Item))

table_1 <- separate(table_1, col = "Item", into = c("Quantity", "Item"),
sep = " ")

table_3 <- left_join(table_1, table_2, by = "Item") %>%
  mutate(Item = paste(Quantity, Item)) %>%
  select(-Quantity)

table_3 %>%
  group_by(Foo, Category) %>%
  summarise(Item = paste(Item, collapse = ", ")) %>%
  spread(key = "Category", value = "Item")

You need to figure out how to handle words written with different cases and
how to get the quantity in a universal way. For the code above, I
corrected these things by hand in the example data.

HTH
Ulrik

On Wed, 30 Aug 2017 at 10:16 Hemant Sain  wrote:

> Hey PIKAL,
> It's not a homework neithe that is the real dataset i have signer NDA for
> my company so that i can share the original data file, Actually I'm working
> on a market basket analysis task but not able to convert my existing data
> table to appropriate format so that i can apply Apriori algorithm using R,
> and this is very important me to get it done because I'm an intern and if i
> won't get it done they will not  going to hire me as a full-time employee.
> i tried everything by myself but not able to get it done.
> your precious 10-15 can save my upcoming years. so please if you can please
> help me through this.
> i want another dataset based on first two dataset i have mentioned .
>
> Thanks
>
> On 30 August 2017 at 12:49, PIKAL Petr  wrote:
>
> > Hi
> >
> > It seems to me like homework, there is no homework policy on this help
> > list.
> >
> > What do you want to do with your table 3? It seems to me futile.
> >
> > Anyway, some combination of melt, merge, cast and regular expressions
> > could be employed in such task, but it could be rather tricky.
> >
> > But be aware that
> >
> > Suger does not match sugar (I wonder that sugar is dairy product)
> >
> > and you mix uppercase and lowercase letters which could be also
> > problematic, when matching words.
> >
> > Cheers
> > Petr
> >
> > > -Original Message-
> > > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Hemant
> > Sain
> > > Sent: Wednesday, August 30, 2017 8:28 AM
> > > To: r-help@r-project.org
> > > Subject: [R] Dataframe Manipulation
> > >
> > > i want to do a market basket analysis and I’m trying to create a
> dataset
> > for that
> > > i have two tables, one table contains daily transaction of products in
> > which
> > > each row of table shows item purchased by the customer, The second
> table
> > > contains parent group under those products are fallen, for example
> under
> > fruit
> > > category there are several fruits like mango, banana, apple etc.
> > > i want to create a third table in which parent group are mentioned as
> > header
> > > which can be extracted from Table 2, and all the rows represent
> > transaction of
> > > products
> > >
> > > with their names, and if there is no transaction for any parent
> category
> > then
> > > the cell supposed to fill as NA. please help me with R or C/c++ code( R
> > would be
> > >
> > > preferred) here I’m attaching you all three tables for better reference
> > i have
> > > first two tables and i want to get a table like table 3
> > >
> > > Tables are explained in the attached doc.
> > >
> > > --
> > > hemantsain.com
> >
> > 

Re: [R] Find maxima of a function

2017-08-26 Thread Ulrik Stervbo

Please keep the list in cc.

Sorry, it didn't work as expected. Maybe someone else has an appropriate
solution.

Best,
Ulrik

On Sat, 26 Aug 2017, 12:57 niharika singhal <niharikasinghal1...@gmail.com>
wrote:

> Hi
>
> Thanks for you mail,
> I really appreciate your time on my problem
>
> I have posted this problem on
>
>
> https://stats.stackexchange.com/questions/299590/to-find-maxima-for-gaussian-mixture-model
>
>
> The plot I am getting using UnivarMixingDistribution from distr package in
> R
>
> code is
>
> mc0 <- c(0.1241933, 0.6329082, 0.2428986)
> rv <- UnivarMixingDistribution(Norm(506.8644, 61.02859),
>   Norm(672.8448, 9.149168), Norm(829.902, 74.84682),
>   mixCoeff = mc0 / sum(mc0))
> plot(rv, to.draw.arg = "d")
>
> I want output around 672 in first case and in 2nd case around 2.1
> according to the plot.
> your code will not work in both the scenario
>
> Regards
> Niharika Singhal
>
>
> On Sat, Aug 26, 2017 at 12:47 PM, Ulrik Stervbo <ulrik.ster...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I once found this somewhere on stackoverflow:
>>
>> values <- rnorm(20, mean = c(2.15,2.0,2.9), sd = c(0.1,0.1,0.1))
>>
>> v_dens <- density(values)
>> v_dens_y <- v_dens$y
>>
>> r <- rle(v_dens_y)
>> # These functions ignore the extremes if they are the first or last point
>> maxima_index <- which(rep(x = diff(sign(diff(c(-Inf, r$values, -Inf))))
>> == -2,  times = r$lengths))
>> minima_index <- which(rep(x = diff(sign(diff(c(-Inf, r$values, -Inf))))
>> == 2,  times = r$lengths))
>>
>> plot(v_dens_y)
>>
>> HTH
>> Ulrik
>>
>>
>> On Sat, 26 Aug 2017 at 11:49 niharika singhal <
>> niharikasinghal1...@gmail.com> wrote:
>>
>>> I have a Gaussian mixture model with some parameters
>>>
>>> mean=(506.8644,672.8448,829.902)
>>>
>>> sigma=(61.02859,9.149168,74.84682)
>>>
>>> c=(0.1241933, 0.6329082, 0.2428986)
>>>
>>> And the plot look something like below.[image: enter image description
>>> here]
>>> <https://i.stack.imgur.com/4uUQ9.png>
>>>
>>> Also, if I change my parameters to
>>>
>>> mean=(2.15,2.0,2.9)
>>>
>>> sigma=(0.1,0.1,0.1)
>>>
>>> c=(1/3,1/3,1/3)
>>>
>>> Then plot would change to[image: enter image description here]
>>> <https://i.stack.imgur.com/kESYX.png>
>>>
>>> Is there any way to find the maxima. I have tried Newton's method but it
>>> gave me the wrong output.
>>>
>>> Like in general some common solution, which would work on all the cases,
>>> is
>>> needed.Can someone suggest me how can I achieve this
>>>
>>> Thanks in advance
>>>
>>> Niharika Singhal
>>>
>>>
>>
>

Re: [R] Find maxima of a function

2017-08-26 Thread Ulrik Stervbo

Hi,

I once found this somewhere on stackoverflow:

values <- rnorm(20, mean = c(2.15,2.0,2.9), sd = c(0.1,0.1,0.1))

v_dens <- density(values)
v_dens_y <- v_dens$y

r <- rle(v_dens_y)
# These functions ignore the extremes if they are the first or last point
maxima_index <- which(rep(x = diff(sign(diff(c(-Inf, r$values, -Inf)))) == -2,
                          times = r$lengths))
minima_index <- which(rep(x = diff(sign(diff(c(-Inf, r$values, -Inf)))) == 2,
                          times = r$lengths))

plot(v_dens_y)
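
Since the mixture parameters are known, the same index trick could also be
applied to the exact mixture density on a grid rather than to a kernel density
estimate of a random sample. This is only a sketch: `mix_density` is a helper
I made up for illustration, and the grid range and size are arbitrary choices.

```r
# Mixture parameters from the original question
mu    <- c(506.8644, 672.8448, 829.902)
sigma <- c(61.02859, 9.149168, 74.84682)
w     <- c(0.1241933, 0.6329082, 0.2428986)

# Hypothetical helper: density of a Gaussian mixture at the points x
mix_density <- function(x, mean, sd, weight) {
  d <- numeric(length(x))
  for (i in seq_along(mean)) {
    d <- d + weight[i] * dnorm(x, mean = mean[i], sd = sd[i])
  }
  d
}

# Evaluate on a fine grid covering all components
x <- seq(min(mu - 4 * sigma), max(mu + 4 * sigma), length.out = 10000)
y <- mix_density(x, mu, sigma, w / sum(w))

# Same run-length trick as above, applied to the exact density
r <- rle(y)
is_max <- rep(diff(sign(diff(c(-Inf, r$values, -Inf)))) == -2,
              times = r$lengths)

local_maxima   <- x[is_max]         # all local maxima on the grid
global_maximum <- x[which.max(y)]   # location of the highest peak
```

The result is only as precise as the grid spacing; a finer grid or a call to
optimize() around each candidate could refine it.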

HTH
Ulrik


On Sat, 26 Aug 2017 at 11:49 niharika singhal 
wrote:

> I have a Gaussian mixture model with some parameters
>
> mean=(506.8644,672.8448,829.902)
>
> sigma=(61.02859,9.149168,74.84682)
>
> c=(0.1241933, 0.6329082, 0.2428986)
>
> And the plot look something like below.[image: enter image description
> here]
> 
>
> Also, if I change my parameters to
>
> mean=(2.15,2.0,2.9)
>
> sigma=(0.1,0.1,0.1)
>
> c=(1/3,1/3,1/3)
>
> Then plot would change to[image: enter image description here]
> 
>
> Is there any way to find the maxima. I have tried Newton's method but it
> gave me the wrong output.
>
> Like in general some common solution, which would work on all the cases, is
> needed.Can someone suggest me how can I achieve this
>
> Thanks in advance
>
> Niharika Singhal
>
>

Re: [R] about multi-optimal points

2017-08-26 Thread Ulrik Stervbo

Hi Lily,

for the colouring of individual points you can set the colour aesthetic.
The ID is numeric so ggplot applies a colour scale. If we cast ID to a
factor we get the appropriate colouring.

library(ggplot2)

test_df <- data.frame(ID = 1:20, v1 = rnorm(20), v2 = rnorm(20),
                      v3 = rnorm(20))

ggplot(data=test_df, aes(x=v1,y=v2, colour = as.factor(ID))) +
geom_point()+ theme_bw()+
  xlab('Variable 1')+ ylab('Variable 2')

To choose a number of samples from the dataset, you can use the subset
function to select by some variable:
sub_test_df1 <- subset(test_df, ID < 5)

ggplot(data=sub_test_df1, aes(x=v1,y=v2, colour = as.factor(ID))) +
geom_point()+ theme_bw()+
  xlab('Variable 1')+ ylab('Variable 2')

Or sample a number of random rows using sample(), if this is your intention.
sub_test_df2 <- test_df[sample(x = 1:nrow(test_df), size = 10), ]

ggplot(data=sub_test_df2, aes(x=v1,y=v2, colour = as.factor(ID))) +
geom_point()+ theme_bw()+
  xlab('Variable 1')+ ylab('Variable 2')

HTH
Ulrik

On Fri, 25 Aug 2017 at 21:38 lily li  wrote:

> Hi R users,
>
> I have some sets of variables and put them into one dataframe, like in the
> following. How to choose a specific set of pareto front, such as 10 from
> the current datasets (which contains more than 100 sets)? And how to show
> the 10 points on one figure with different colors? I can put all the points
> on one figure though, and have the code below. I drew two ggplots to show
> their correlations, but I want v1 and v3 to be as close as 1, v2 to be as
> close as 0. Thanks very much.
>
> DF
>
> ID   v1     v2     v3
> 1    0.8    0.1    0.7
> 2    0.85   0.3    0.6
> 3    0.9    0.21   0.7
> 4    0.95   0.22   0.8
> 5    0.9    0.3    0.7
> 6    0.8    0.4    0.76
> 7    0.9    0.3    0.77
> ...
>
> fig1 = ggplot(data=DF, aes(x=v1,y=v2))+ geom_point()+ theme_bw()+
> xlab('Variable 1')+ ylab('Variable 2')
> print(fig1)
>
> fig2 = ggplot(data=DF, aes(x=v1,y=v3))+ geom_point()+ theme_bw()+
> xlab('Variable 1')+ ylab('Variable 3')
> print(fig2)
>
>

Re: [R] Help Required in looping visuals

2017-08-21 Thread Ulrik Stervbo

Hi Venkat,

I must admit I don't understand what you are looking for, but maybe just
store the visuals in a named list?

Also, I have started to use nested data.frames to keep plots together with
identifiers of the data sets. The nest and unnest functions are in the
tidyr package. It keeps me from having to create and parse long names, and
provides a nice structure.

HTH
Ulrik

On Mon, 21 Aug 2017 at 10:00 Venkateswara Reddy Marella (Infosys Ltd) via
R-help  wrote:

> Hi Team ,
>
> I have a requirement of building set of panels in which each panel has
> multiple visuals based on single set of dataset values and this thing is
> repeated for other set of values as well.
> For this requirement , I am trying to use a for loop to create visuals and
> panel for each set of values ( like first panel should be for first set of
> dataset values and so on) . Do we have any available solution for this
> problem.
>
> Thanks,
> Venkat.
>
>

Re: [R] data frame question

2017-08-06 Thread Ulrik Stervbo

Hi Andreas,

assuming that the increment is always indicated by the same value (in your
example 0), this could work:

df$a <- cumsum(seq_along(df$b) %in% which(df$b == 0))
df
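
On the example data from the original post this can be checked as below. The
sketch uses the shorter but equivalent `cumsum(b == switch_val)`; the helper
name `relabel` and its `switch_val` argument are made up for illustration:

```r
df <- data.frame(a = c(1:5, 1:8), b = c(0:4, 0:7))

# Hypothetical helper: the counter increases each time b equals switch_val
relabel <- function(b, switch_val = 0) {
  cumsum(b == switch_val)
}

df$a <- relabel(df$b)
df$a
# [1] 1 1 1 1 1 2 2 2 2 2 2 2 2
```

Note that rows before the first occurrence of `switch_val` get the value 0, so
with `switch_val = 3` the first three rows would be 0 rather than 1.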

HTH,
Ulrik

On Sun, 6 Aug 2017 at 18:06 Bert Gunter  wrote:

> Your specification is a bit unclear to me, so I'm not sure the below
> is really what you want. For example, your example seems to imply that
> a and b must be of the same length, but I do not see that your
> description requires this. So the following may not be what you want
> exactly, but one way to do this(there may be cleverer ones!) is to
> make use of ?rep. Everything else is just fussy detail. (Your example
> suggests that you should also learn about ?seq. Both of these should
> be covered in any good R tutorial, which you should probably spend
> time with if you haven't already).
>
> Anyway...
>
> ## WARNING: Not thoroughly tested! May (probably :-( ) contain bugs.
>
> f <- function(x,y,switch_val =0)
> {
>wh <- which(y == switch_val)
>len <- length(wh)
>len_x <- length(x)
>if(!len) x
>else if(wh[1] == 1){
>   if(len ==1) return(rep(x[1],len_x))
>   else {
>  wh <- wh[-1]
>  len <- len -1
>   }
>}
>count <- c(wh[1]-1,diff(wh))
>if(wh[len] == len_x) count<- c(count,1)
>else count <- c(count, len_x - wh[len] +1)
>rep(x[seq_along(count)],times = count)
> }
>
> > a <- c(1:5,1:8)
> > b <- c(0:4,0:7)
> > f(a,b)
>  [1] 1 1 1 1 1 2 2 2 2 2 2 2 2
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sun, Aug 6, 2017 at 4:10 AM, Andras Farkas via R-help
>  wrote:
> > Dear All,
> >
> > wonder if you have thoughts on the following:
> >
> > let us say we have:
> >
> >
> df<-data.frame(a=c(1,2,3,4,5,1,2,3,4,5,6,7,8),b=c(0,1,2,3,4,0,1,2,3,4,5,6,7))
> >
> >
> >  I would like to rewrite values in column name "a" based on values in
> column name "b", where based on a certain value of column "b" the next
> value of column 'a' is prompted, in other words would like to have this as
> a result:
> >
> >
> df<-data.frame(a=c(1,1,1,1,1,2,2,2,2,2,2,2,2),b=c(0,1,2,3,4,0,1,2,3,4,5,6,7))
> >
> >
> > where at the value of 0 in column 'b' the number in column a changes
> from 1 to 2. From the first zero value of column 'b' and until the next
> zero in column 'b' the numbers would not change in 'a', ie: they are all 1
> in my example... then from 2 it would change to 3 again as 'b' will have
> zero again in a row, and so on.. Would be grateful for a solution that
> would allow me to set the values (from 'b') that determine how the values
> get established in 'a' (ie: lets say instead of 0 I would want 3 being the
> value where 1 changes to 2 in 'a') and that would be flexible to take into
> account that the number of rows and the number of time 0 shows up in a row
> in column 'b' may vary...
> >
> > much appreciate your thoughts..
> >
> > Andras
> >
>
>

Re: [R] about saving format

2017-08-05 Thread Ulrik Stervbo

I have no clue how Rstudio saves plots, but when I was writing directly to
the pdf plot device I had similar problems. Setting useDingbats = TRUE made
everything work well.

I think it is more prudent - and less clicking here and there - to save
plots from within the script.

I imagine this works as expected:

pdf(useDingbats = TRUE)
plot(0, xlab = "Percent (‰)", ylab = "Another Percent (‰)")
dev.off()

Best,
Ulrik

On Sun, 6 Aug 2017, 03:44 Duncan Murdoch wrote:

> On 05/08/2017 9:10 PM, Bert Gunter wrote:
> > If the OP is using RStudio and not using R  (i.e. pdf()) directly, it
> > sounds like this query should be directed to RStudio support, not
> > here.
>
> Two things:
>
> First, to Lily: "‰" is the "per mil" symbol, not percent; it's not an
> ASCII symbol, which is why there are issues displaying it.  If you
> really mean percent, use "%" and you likely won't have problems.
>
> And to confirm what Bert said:  this does work in R even if you use
> dev.copy() to copy from the regular R graphics device to pdf, but not
> from the RStudioGD to pdf.  So this really is an RStudio bug.
>
> Duncan Murdoch
>
> >
> > Cheers,
> > Bert
> >
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Sat, Aug 5, 2017 at 5:54 PM, Ismail SEZEN 
> wrote:
> >>
> >>> On 6 Aug 2017, at 03:47, lily li  wrote:
> >>>
> >>> In the lower right panel of R-studio interface, there is the "Export"
> button. I saved as PDF from there directly, rather than using functions
> >>>
> >>> On Sat, Aug 5, 2017 at 6:18 PM, Ismail SEZEN 
> wrote:
> >>>
>  On 6 Aug 2017, at 03:01, lily li  wrote:
> 
>  I am using the plot() function, but have a problem. When saving as pdf
>  format, the ‰ sign in the x-axis label becomes (...) sign. I prefer
> to save
>  in pdf, as this format has a higher resolution than jpeg or other
> picture
>  formats. Could anyone tell me how to do then? Thanks.
> >>>
> >>> Please, share minimal example. which function do you use to save the
> plot as pdf? ‘pdf' or ‘cairo_pdf’ or something else?
> >>
> >> I used the code below, and used export button to save as pdf. It saves
> as expected. Please, minimize your code as below and share. So I can try on
> my machine.
> >>
> >> plot(0, xlab = "Percent (‰)", ylab = "Another Percent (‰)")
> >
> >
>

Re: [R] define a list with names as variables

2017-08-04 Thread Ulrik Stervbo

Hi Giovanni,

I would create an unnamed list and set the names after.
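
Something along these lines, as a minimal sketch that reuses the function and
argument names from your example:

```r
f <- function(foo, bar) {
  x <- list(bar)    # unnamed list holding the value
  names(x) <- foo   # use the value of foo as the name
  x
}

x <- f("hello", "world")
names(x)
# [1] "hello"
```

A one-line alternative is `setNames(list(bar), foo)`.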

Best,
Ulrik

On Fri, 4 Aug 2017 at 12:08 Giovanni Gherdovich 
wrote:

> Hello,
>
> I'm having troubles defining a list where names are variables (of type
> character). Like this, which gives "foo" instead of "world" (the way I
> meant it is that "world" is the value of the variable foo). Any hint?
>
> > f <- function(foo, bar) { list(foo = bar) }
> > x <- f("hello", "world")
> > names(x)
> [1] "foo"
>
>
> Thanks,
> Giovanni
>
>

Re: [R] Restructuring Star Wars data from rwars package

2017-08-04 Thread Ulrik Stervbo

Hi Matt,

the usual way would be to use do.call():

.lst <- list(x = list(a = 1, b = 2), y = list(a = 5, b = 8))
do.call(rbind, lapply(.lst, data.frame, stringsAsFactors = FALSE))

However, your list has vectors of unequal lengths, which makes the above fail.
You somehow need to get everything to have the same length. The dplyr data
set has nested columns, but I believe a more transparent way is simply to
concatenate the elements of each vector longer than 1.

library("rwars")
library("tidyverse")

people <- get_all_people(parse_result = TRUE)
people <- get_all_people(getElement(people, "next"), parse_result = TRUE)

list_to_df_collapse <- function(.list){
  .list %>%
    lapply(paste, collapse = "|") %>%
    bind_rows()
}

people$results %>%
  lapply(list_to_df_collapse) %>%
  bind_rows()
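
The collapsing idea can be tried without the API on a toy list, here in base R
only. This is a sketch: the list `.lst` and the helper `collapse_entry` are
made up for illustration and only mimic the shape of `people$results`.

```r
# Toy list: the second field has a different length in each entry
.lst <- list(
  x = list(name = "A", films = c("f1", "f2")),
  y = list(name = "B", films = "f3")
)

# Hypothetical helper: paste each field into one string, then build a
# one-row data.frame
collapse_entry <- function(entry) {
  as.data.frame(lapply(entry, paste, collapse = "|"),
                stringsAsFactors = FALSE)
}

res <- do.call(rbind, lapply(.lst, collapse_entry))
res$films
# [1] "f1|f2" "f3"
```

Unlike bind_rows(), rbind() requires every entry to have the same fields, so
this base R version would need extra care on the real data.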

This does not re-create the dplyr data set, though. To do this you need to
nest the variables that are longer than 1. It turns out that some variables
are not found in all members of results, and some variables might have a
length of 1 in one entry but more than 1 in another. This means we are
probably better off knowing in advance which columns must be nested.

# Find the variables that must be nested
vars_to_nest <- people$results %>%
  # Get the length of each variable at each entry
  map_df(function(.list){
    .names <- names(.list)
    .lengths <- sapply(.list, length)
    data.frame(col = .names, len = .lengths, stringsAsFactors = FALSE)
  }) %>%
  # Get those that have a length of 2 or more in any entry
  filter(len > 1) %>%
  distinct(col) %>% flatten_chr()

list_to_df_nest <- function(.list, .vars_to_nest){
  # Create a list of data.frames
  tmp_lst <- .list %>%
    map2(names(.), function(.value, .id){
      data_frame(.value) %>%
        set_names(.id)})

  # Nest those that must be nested
  nested_vars <- tmp_lst[.vars_to_nest] %>%
    # We might have selected something that does not exist, so we better
    # clear it away
    compact() %>%
    # Do the nesting
    map2(names(.), function(.df, .id){
      nest(.df, one_of(.id)) %>%
        set_names(.id)
    })

  # Overwrite the list elements with the nested data.frames
  tmp_lst[names(nested_vars)] <- nested_vars
  tmp_lst %>% bind_cols()
}

people$results %>%
  lapply(list_to_df_nest, .vars_to_nest = vars_to_nest) %>%
  bind_rows()

The first solution is considerably faster than my second, though everything
might be done in a more clever way...

HTH
Ulrik

On Fri, 4 Aug 2017 at 05:57 Matt Van Scoyoc  wrote:

> I'm having trouble restructuring data from the rwars package into a
> dataframe. Can someone help me?
>
> Here's what I have...
>
> library("rwars")
> library("tidyverse")
>
> # These data are json, so they load into R as a list
> people <- get_all_people(parse_result = T)
> people <- get_all_people(getElement(people, "next"), parse_result = T)
>
> # Look at Anakin Skywalker's data
> people$results[[1]]
> people$results[[1]][1] # print his name
>
> # To use them in R, I need to restructure them to a dataframe like they are
> in dplyr
> data("starwars")
> glimpse(starwars)
>
> Thanks for the help.
>
> Cheers,
> MVS
> =
> Matthew Van Scoyoc
> =
> Think SNOW!
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] fill out a PDF form in R

2017-07-26 Thread Ulrik Stervbo

On second thought, you could also use pdftk to fill out the pdf form with
data generated in R.
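
A rough sketch of how that could look, assuming pdftk is installed and on the
PATH. The file names (form.pdf, filled.pdf), the field names and the helper
make_fdf() are made up for illustration; the real field names would come from
pdftk's dump_data_fields output.

```r
# Build a minimal FDF document from a named character vector of field values.
# make_fdf() is a hypothetical helper, not part of any package.
make_fdf <- function(fields) {
  # Escape backslashes and parentheses, which delimit FDF strings
  esc <- function(x) gsub("([\\\\()])", "\\\\\\1", x)
  entries <- sprintf("<< /T (%s) /V (%s) >>", esc(names(fields)), esc(fields))
  c("%FDF-1.2", "1 0 obj", "<< /FDF << /Fields [", entries, "] >> >>",
    "endobj", "trailer", "<< /Root 1 0 R >>", "%%EOF")
}

fields <- c(name = "Jane Doe", city = "Berlin (Mitte)")  # hypothetical fields
fdf <- make_fdf(fields)
fdf_file <- tempfile(fileext = ".fdf")
writeLines(fdf, fdf_file)

form <- "form.pdf"  # hypothetical input form
if (nzchar(Sys.which("pdftk")) && file.exists(form)) {
  # List the field names the form actually uses
  system2("pdftk", c(form, "dump_data_fields"))
  # Fill the form with the values from the FDF file
  system2("pdftk", c(form, "fill_form", fdf_file, "output", "filled.pdf"))
}
```

The pdftk calls are guarded so the snippet does nothing harmful when pdftk or
the form is missing.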

On Wed, 26 Jul 2017 at 14:01 Ulrik Stervbo <ulrik.ster...@gmail.com> wrote:

> Hi Elahe,
>
> I have no clue, but maybe you can dump the data fields using pdftk, and
> work with those in R.
>
> HTH
> Ulrik
>
> On Wed, 26 Jul 2017 at 13:50 Elahe chalabi via R-help <
> r-help@r-project.org> wrote:
>
>> Hi all,
>>
>> I would like to get ideas about how to fill out a PDF form in R and to
>> know if it's possible or not. I could not find something helpful in
>> Internet.
>>
>> Does anyone know a good link for that or have experience in this?
>> Thanks for any help!
>>
>> Elahe
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>


Re: [R] fill out a PDF form in R

2017-07-26 Thread Ulrik Stervbo

Hi Elahe,

I have no clue, but maybe you can dump the data fields using pdftk, and
work with those in R.

HTH
Ulrik

On Wed, 26 Jul 2017 at 13:50 Elahe chalabi via R-help 
wrote:

> Hi all,
>
> I would like to get ideas about how to fill out a PDF form in R and to
> know if it's possible or not. I could not find something helpful in
> Internet.
>
> Does anyone know a good link for that or have experience in this?
> Thanks for any help!
>
> Elahe
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] spaghetti plot - urgent

2017-07-19 Thread Ulrik Stervbo

Hi Rosa,

You pass a vector to ggplot, which expects a data.frame. I am sure you
meant to do this:

point7$y_point7 <- point7$beta0_7 + point7$beta1_7*point7$time +
  point7$epsilon_7

ggplot(point7, aes(time, y_point7)) + geom_line()
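
For an actual spaghetti plot you probably want one line per subject, which
needs the data in long format with a subject id. Below is a minimal sketch
under my own assumptions: the names sim, id and pcr are invented, and the
simulation only loosely follows your description (15 subjects, 5 time points).

```r
set.seed(9027)
subjects <- 15
times <- 1:5

# One row per subject and time point (long format)
sim <- expand.grid(time = times, id = seq_len(subjects))
beta0 <- rnorm(subjects, mean = 1, sd = 0.5)   # subject-specific baseline
beta1 <- rnorm(subjects, mean = -1, sd = 0.5)  # subject-specific slope
sim$pcr <- beta0[sim$id] + beta1[sim$id] * sim$time +
  rnorm(nrow(sim), mean = 0, sd = 0.1)         # one error term per observation

# Plot only if ggplot2 is available; group = id gives one line per subject
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(sim, aes(time, pcr, group = id)) + geom_line()
  print(p)
}
```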

HTH
Ulrik


On Wed, 19 Jul 2017 at 20:37 Rosa Oliveira  wrote:

> Hi everyone,
>
> I’m trying to do a spaghetti plot and I know I’m doing all wrong, It must
> be.
>
> What I need:
>
> 15 subjects, each with measurements over 5 different times (t1, ..., t5),
> and the variable that I need to represent in the spaguetti plot is given by:
>
> PCR = b0 + b1 * ti + epsilon
>
> B0, - baseline of each subject
> B1 - trajectory of each subject over time (so multiply by t)
> Epsilon - error associated with each subject
>
> Regression model with mixed effects.
>
> Thus, I generated b0, b1, epsilon and time created sequence.
>
> But I need to do spaguetti plot of the outcome and I can not understand
> how much I search the publications.
>
> Sorry for the stupidity, but I do not even know how to do it and I need it
> with the utmost urgency to finish a publication proposal :(
>
> Follows what I tried to do :( :( :(
>
>
> library(ggplot2)
> library(reshape)
> library(lattice)
> library(gtable)
> library(grid)
>
>
> set.seed(9027)
>
> n.longitudinal.observations  = 5  # number of PCR
> measures (per subject) in the hospital period
> subjects = 15  # Number of
> simulations (1 per subject in the study)
>
> beta0_7_gerar  = rnorm(subjects, mean = 1, sd = .5)
> beta0_7=
> as.data.frame(matrix(beta0_7_gerar,nrow=subjects,ncol=1))  # beta 0 -
> input variable used to calculate PCR (the outcome)
> beta1_7_gerar = rnorm(subjects, mean = -1, sd = .5)
> beta1_7   =
> as.data.frame(matrix(beta1_7_gerar,nrow=subjects,ncol=1) )  # beta 1 -
> input variable used to calculate PCR (the outcome)
>
> tj_gerar= seq.int(1,
> n.longitudinal.observations, 1)
> epsilon_7_gerar  = rnorm(5*subjects, mean = 0, sd = .1)
> epsilon_7 =
> as.data.frame(matrix(epsilon_7_gerar,nrow=subjects,ncol=1) )   # epsilon_7
> - input variable used to calculate PCR (the outcome) - associated with each
> subject
>
> tj  =
> as.data.frame(matrix(tj_gerar,nrow=subjects,ncol=1) )   #
> time
>
> point7 <- cbind(beta0_7, beta1_7, tj, epsilon_7)
> point7
> point7 <- as.data.frame(point7)
>
> colnames(point7) = c("beta0_7","beta1_7","time", "epsilon_7")
>
>
> y_point7 <- point7$beta0_7 + point7$beta1_7*point7$time + point7
> $epsilon_7 (the outcome of the study - PCR)
> y_point7
>
> require(ggplot2)
>
> png('test.png')
> p = ggplot(y_point7, aes(time, y_point7)) + geom_line()
> print(p)
> dev.off()
> savehistory()
>
>
>
>
>
>
> OR:
>
> In the last part I also tried:
>
>
> ID = rep(1:3, each = 5)
>
>
> point7 <- cbind(ID,beta0_7, beta1_7, tj, epsilon_7)
> point7
> point7 <- as.data.frame(point7)
>
> colnames(point7) = c("ID","beta0_7","beta1_7","time", "epsilon_7")
>
>
>
>
>
> y_point7 <- point7$beta0_7 + point7$beta1_7*point7$time + point7 $epsilon_7
> y_point7
>
> crp7 <- y_point7
>
> head(point7, n = 15)
>
>
> ggplot(aes(x = tj_gerar, y = crp7), data = point7) +
>   geom_line(aes(group = ID), color = "gray") +
>   geom_smooth(aes(group = 1), method = "lm", size = 3, color = "red", se =
> FALSE) +
>   theme_bw()
>
> But none of these worked :(
>
> I was looking to have something like:
>
>
> Being the outcome PCR and the year the times (1, 2, 3, 4, 5).
>
> Can someone help me please?
>
>
> Thanks,
>
> Best Rosa
>
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem in shiny writing a .txt file

2017-07-19 Thread Ulrik Stervbo

Hi Ana,

The path is most likely wrong.

How does f.texto() know the res.path? Do you manage to remove the old path
and create a new one but f.texto() doesn't know?

These are not reasons for your problem, but I am curious: Why do you change
the working directory? What is the intention behind appending dir(res.path) to
res.path? Why don't you create a vector with the lines and use writeLines()
in f.texto() instead of opening and closing the file several times?
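
To illustrate the last point, here is a small sketch. I don't know what
f.texto() really looks like, so write_report(), the directory and the file
name are stand-ins of my own; the point is only that the path is passed in
explicitly and the file is written in a single call.

```r
res_path <- file.path(tempdir(), "results")   # hypothetical output directory
dir.create(res_path, showWarnings = FALSE, recursive = TRUE)

# Hypothetical replacement for f.texto(): gets the path as an argument
write_report <- function(lines, path) {
  out_file <- file.path(path, "report.txt")   # hypothetical file name
  writeLines(lines, out_file)                 # one write, no repeated open/close
  out_file
}

report <- c("Title", "line one", "line two")
written <- write_report(report, res_path)
readLines(written)
```

Building the full file name with file.path() also means there is no need to
change the working directory.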

HTH
Ulrik

On Wed, 19 Jul 2017 at 11:32 Rolf Turner  wrote:

>
> On 19/07/17 19:19, Ana Belén Marín wrote:
>
> > Hi all!
> >
> > I'm developing a shiny app and I have problems when I wanna write a .txt
> > file.
>
> 
>
> " ... when I *want to* write ..."
>
> The language of this mailing list is *English*, not Valspeak.
>
> cheers,
>
> Rolf Turner
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276 <+64%209-373%207599>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] Arranging column data to create plots

2017-07-16 Thread Ulrik Stervbo

Hi Michael,

Try gather from the tidyr package
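
One way it might go, as a sketch: gather everything into one key/value pair
first, split the key into the axis (X or Y) and the point number, and then
spread the axis back out. That keeps each X with its Y. The column names
key, value, axis and point are my own choice, and the data is typed in from
your example.

```r
library(tidyr)

DF <- data.frame(
  IDKey = c("Name1", "Name2", "Name3"),
  X1 = c(21, 15, 17), Y1 = c(15, 18, 21),
  X2 = c(25, 35, 30), Y2 = c(10, 24, 22),
  X3 = c(NA, 27, 15), Y3 = c(NA, 45, 40),
  X4 = c(NA, NA, 32), Y4 = c(NA, NA, 55),
  stringsAsFactors = FALSE
)

# A single gather for all X and Y columns, dropping the empty cells
long <- gather(DF, key = "key", value = "value", -IDKey, na.rm = TRUE)
# "X1" -> axis "X", point "1"
long <- separate(long, key, into = c("axis", "point"), sep = 1)
# One row per IDKey and point, with X and Y side by side
NewDF <- spread(long, axis, value)
NewDF
```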

HTH
Ulrik

Michael Reed via R-help wrote on Sun, 16 July 2017, 10:19:

> Dear All,
>
> I need some help arranging data that was imported.
>
> The imported data frame looks something like this (the actual file is
> huge, so this is example data)
>
> DF:
> IDKey  X1  Y1  X2  Y2  X3  Y3  X4  Y4
> Name1  21  15  25  10
> Name2  15  18  35  24  27  45
> Name3  17  21  30  22  15  40  32  55
>
> I would like to create a new data frame with the following
>
> NewDF:
> IDKey   X   Y
> Name1  21  15
> Name1  25  10
> Name2  15  18
> Name2  35  24
> Name2  27  45
> Name3  17  21
> Name3  30  22
> Name3  15  40
> Name3  32  55
>
> With the data like this I think I can do the following
>
> ggplot(NewDF, aes(x=X, y=Y, color=IDKey) + geom_line
>
> and get 3 lines with the various number of points.
>
> The point is that each of the XY pairs is a data point tied to NameX.  I
> would like to rearrange the data so I can plot the points/lines by the
> IDKey.  There will be at least 2 points, but the number of points for each
> IDKey can be as many as 4.
>
> I have tried using the gather() function from the tidyverse package, but I
> can't make it work.  The issue is that I believe I need two separate gather
> statements (one for X, another for Y) to consolidate the data.  This causes
> the pairs to not stay together and the data becomes jumbled.
>
> Thoughts
> Thanks for your help
>
> Michael E. Reed
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Help with R script

2017-07-13 Thread Ulrik Stervbo

@Don your solution does not solve Vijayan's scenario 2. I used spread and
gather for that.

An alternative solution to insert missing Fval - picking up with Don's
newtst - is

newtst <- c("FName: fname1", "Fval: Fval1.name1", "FName: fname2",
  "Fval: Fval2.name2", "FName: fname3", "FName: fname4", "Fval: fval4.fname4")

newtst_new <- vector(mode = "character",
  length = sum(grepl("FName", newtst)) * 2)
newtst_len <- length(newtst)
i <- 1
j <- 1
while(i <= newtst_len){
  if(grepl("FName", newtst[i]) & grepl("Fval", newtst[i + 1])){
    newtst_new[c(j, j + 1)] <- newtst[c(i, i + 1)]
    i <- i + 2
  }else{
    newtst_new[c(j, j + 1)] <- c(newtst[c(i)], "Fval: ")
    i <- i + 1
  }
  j <- j + 2
}
newtst_new

which is also not very pretty.
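
A loop-free variant might look like the sketch below. It assumes, as above,
that every line starts with either "FName" or "Fval"; the helper names
(is_name, needs_blank, idx, out) are my own.

```r
newtst <- c("FName: fname1", "Fval: Fval1.name1", "FName: fname2",
  "Fval: Fval2.name2", "FName: fname3", "FName: fname4", "Fval: fval4.fname4")

is_name <- grepl("^FName", newtst)
# Is the following line an Fval line? The last line has no successor.
next_is_val <- c(grepl("^Fval", newtst[-1]), FALSE)
needs_blank <- is_name & !next_is_val

# Repeat the lines that lack an Fval, then overwrite the copy
idx <- rep(seq_along(newtst), times = ifelse(needs_blank, 2, 1))
out <- newtst[idx]
out[duplicated(idx)] <- "Fval: "
out
```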

HTH
Ulrik

On Thu, 13 Jul 2017 at 16:48 MacQueen, Don <macque...@llnl.gov> wrote:

> Using Ulrik’s example data (and assuming I understand what is wanted),
> here is what I would do:
>
> ex.dat <- c("FName: fname1", "Fval: Fval1.name1", "Fval: ", "FName:
> fname2", "Fval: Fval2.name2", "FName: fname3")
> tst <- data.frame(x = ex.dat, stringsAsFactors=FALSE)
>
> sp <- strsplit(tst$x, ':', fixed=TRUE)
> chk <- unlist(lapply(sp, function(txt) txt[2] != ' '))
> newtst <- tst[chk,,drop=FALSE]
>
> This both assumes and requires that ALL of the rows are structured as in
> the example data in the original question.
> For example:
>   if any row is missing the “:”, it will fail.
>   If the “:” is not followed by a space character it may fail (I have not
> checked)
>
> -Don
>
> --
> Don MacQueen
>
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
>
>
> On 7/13/17, 6:47 AM, "R-help on behalf of Ulrik Stervbo" <
> r-help-boun...@r-project.org on behalf of ulrik.ster...@gmail.com> wrote:
>
> Hi Vijayan,
>
> one way going about it *could* be this:
>
> library(dplyr)
> library(tidyr)
> library(purrr)
>
> ex_dat <- c("FName: fname1", "Fval: Fval1.name1", "Fval: ", "FName:
> fname2", "Fval: Fval2.name2", "FName: fname3")
>
> data.frame(x = ex_dat) %>%
>   separate(x, c("F1", "F2"), sep = ": ") %>%
>   filter(F2 != "") %>%
>   group_by(F1) %>%
>   mutate(indx = row_number()) %>%
>   spread(F1, F2, fill = "") %>%
>   gather(F1, F2, FName, Fval) %>%
>   arrange(indx) %>%
>   mutate(x = paste(F1, F2, sep = ": ")) %>%
>   select(x) %>%
>   flatten_chr()
>
> It is not particularly nice or clever, but it gets the job done using
> R.
>
> HTH
> Ulrik
>
> On Thu, 13 Jul 2017 at 13:13 Vijayan Padmanabhan <v.padmanab...@itc.in
> >
> wrote:
>
> >
> > Dear R-help Group
> >
> >
> > Scenario 1:
> > I have a text file running to 1000 of lines...that
> > is like as follows:
> >
> > [922] "FieldName: Wk3PackSubMonth"
> >
> >  [923] "FieldValue: Apr"
> >
> >  [924] "FieldName: Wk3PackSubYear"
> >
> >  [925] "FieldValue: 2017"
> >
> >  [926] "FieldName: Wk3Code1"
> >
> >  [927] "FieldValue: "
> >
> >  [928] "FieldValue: K4"
> >
> >  [929] "FieldName: Wk3Code2"
> >
> >  [930] "FieldValue: "
> >
> >  [931] "FieldValue: Q49"
> >
> >
> > I want this to be programmatically corrected to
> > read as follows: (All consecutive lines starting
> > with FieldValue is cleaned to retain only one
> > line)
> >
> > [922] "FieldName: Wk3PackSubMonth"
> >
> >  [923] "FieldValue: Apr"
> >
> >  [924] "FieldName: Wk3PackSubYear"
> >
> >  [925] "FieldValue: 2017"
> >
> >  [926] "FieldName: Wk3Code1"
> >
> >  [927] "FieldValue: K4"
> >
> >  [928] "FieldName: Wk3Code2"
> >
> >  [929] "FieldValue: Q49"
> >
> > Scenario 2:
> > In the same file, in some instances, the lines
> > could be as follows: i

Re: [R] Help with R script

2017-07-13 Thread Ulrik Stervbo

Hi Vijayan,

one way going about it *could* be this:

library(dplyr)
library(tidyr)
library(purrr)

ex_dat <- c("FName: fname1", "Fval: Fval1.name1", "Fval: ",
  "FName: fname2", "Fval: Fval2.name2", "FName: fname3")

data.frame(x = ex_dat) %>%
  separate(x, c("F1", "F2"), sep = ": ") %>%
  filter(F2 != "") %>%
  group_by(F1) %>%
  mutate(indx = row_number()) %>%
  spread(F1, F2, fill = "") %>%
  gather(F1, F2, FName, Fval) %>%
  arrange(indx) %>%
  mutate(x = paste(F1, F2, sep = ": ")) %>%
  select(x) %>%
  flatten_chr()

It is not particularly nice or clever, but it gets the job done using R.

HTH
Ulrik

On Thu, 13 Jul 2017 at 13:13 Vijayan Padmanabhan 
wrote:

>
> Dear R-help Group
>
>
> Scenario 1:
> I have a text file running to 1000 of lines...that
> is like as follows:
>
> [922] "FieldName: Wk3PackSubMonth"
>
>  [923] "FieldValue: Apr"
>
>  [924] "FieldName: Wk3PackSubYear"
>
>  [925] "FieldValue: 2017"
>
>  [926] "FieldName: Wk3Code1"
>
>  [927] "FieldValue: "
>
>  [928] "FieldValue: K4"
>
>  [929] "FieldName: Wk3Code2"
>
>  [930] "FieldValue: "
>
>  [931] "FieldValue: Q49"
>
>
> I want this to be programmatically corrected to
> read as follows: (All consecutive lines starting
> with FieldValue is cleaned to retain only one
> line)
>
> [922] "FieldName: Wk3PackSubMonth"
>
>  [923] "FieldValue: Apr"
>
>  [924] "FieldName: Wk3PackSubYear"
>
>  [925] "FieldValue: 2017"
>
>  [926] "FieldName: Wk3Code1"
>
>  [927] "FieldValue: K4"
>
>  [928] "FieldName: Wk3Code2"
>
>  [929] "FieldValue: Q49"
>
> Scenario 2:
> In the same file, in some instances, the lines
> could be as follows: in this case, wherever a line
> is beginning with FieldName and the subsequent
> line is not displaying a FieldValue, I would want
> to programmatically identify such lines and insert
> FieldValue (as blank).
>
> [941] "FieldName: Wk3Code6"
>
>  [942] "FieldValue: "
>
>  [943] "FieldName: Wk3Code7"
>
>  [944] "FieldValue: "
>
>  [945] "FieldName: Wk3PackWSColorStiffRemarkCode1"
>
>  [946] "FieldName: Wk3PackWSColorWrappRemarkCode1"
>
>  [947] "FieldName:
> Wk3PackWSDelamiStiffRemarkCode1"
>
>
> ie in the above, it should be replaced as
>
> [941] "FieldName: Wk3Code6"
>
>  [942] "FieldValue: "
>
>  [943] "FieldName: Wk3Code7"
>
>  [944] "FieldValue: "
>
>  [945] "FieldName: Wk3PackWSColorStiffRemarkCode1"
>  [946] "FieldValue: "
>
>  [947] "FieldName: Wk3PackWSColorWrappRemarkCode1"
>  [948] "FieldValue: "
>
>  [949] "FieldName:
> Wk3PackWSDelamiStiffRemarkCode1"
>  [950] "FieldValue: "
>
>
> Can anybod suggest how to acheive this in R?
>
> Thanks for your time.
> Regards
> VP
>
>
>
> Disclaimer:\ This Communication is for the exclusive use...{{dropped:8}}
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] PAM Clustering

2017-07-10 Thread Ulrik Stervbo

Hi Sema,

read.csv2 uses ',' as the decimal separator. Since '.' is used in your file,
everything becomes a character, which in turn makes pam complain that what
you pass to the function isn't numeric.

Use read.csv2("data.csv", dec = ".") and it should work.

You can also use class(d) to check the class of the matrix before you pass
it to pam().

See ?read.table for more options.

There is a base function called 'data', so naming a variable data is a poor
choice.
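
The effect is easy to see on a tiny file. This is just a sketch with made-up
content (two columns a and b, ';' as field separator, '.' as decimal mark);
your real file may of course differ.

```r
csv <- tempfile(fileext = ".csv")
writeLines(c("a;b", "1.5;2.25", "3.0;4.75"), csv)

wrong <- read.csv2(csv)             # dec = "," by default
right <- read.csv2(csv, dec = ".")  # tell R that '.' is the decimal mark

sapply(wrong, class)  # not numeric
sapply(right, class)  # numeric

d <- as.matrix(right)
is.numeric(d)         # this is what pam() needs
```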

HTH
Ulrik

On Mon, 10 Jul 2017 at 17:25 Sema Atasever  wrote:

> Dear Authorized Sir / Madam,
>
> I have an R script file in which it includes PAM Clustering codes:
>
> *when i ran R script i am getting this error:*
> *Error in pam(d, 10) : x is not a numeric dataframe or matrix.*
> *Execution halted*
>
> How can i fix this error?
>
> Thanks in advance.
> 
>  data.csv
> <
> https://drive.google.com/file/d/0B4rY6f4kvHeCcVpLRTQ5VDhDNUk/view?usp=drive_web
> >
> 
>
> *pam.R*
> data <- read.csv2("data.csv")
> attach(data)
> d=as.matrix(data)
> library(cluster)
> cluster.pam = pam(d,10)
> table(cluster.pam$clustering)
>
> filenameclu = paste("clusters", ".txt")
> write.table(cluster.pam$clustering, file=filenameclu,sep=",")
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data import R: some explanatory variables not showing up correctly in summary

2017-06-01 Thread Ulrik Stervbo

Hi Tara,

It seems that you categorise and count for each category. Could it be that
the method you use puts everything that doesn't match the predefined
categories in Other?

I'm only guessing because without a minimal reproducible example it's
difficult to do anything else.
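
Two things that commonly produce output like yours can be sketched with
invented data. This is a guess at the cause, not a diagnosis: levels that
differ only by stray whitespace show up as "duplicates", and summary() on a
data.frame lumps the less frequent levels of a factor into "(Other)".

```r
# Two 'combination' levels that differ only by a trailing space
behav <- factor(c("combination", "combination ", "fast", "slow", "stat"))
levels(behav)

# Trimming the whitespace merges them again
behav_clean <- factor(trimws(as.character(behav)))
levels(behav_clean)

# A factor with many levels: summary() of the data.frame shows "(Other)"
distance <- factor(paste0(seq(10, 100, by = 10), "m"))
dist_summary <- summary(data.frame(distance))
dist_summary

# Ask for more levels to see them all
summary(distance, maxsum = 20)
```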

Best wishes
Ulrik

Rui Barradas wrote on Thu, 1 June 2017, 17:30:

> Hello,
>
> In order for us to help we need to know how you've imported your data.
> What was the file type? What instructions have you used to import it?
> Did you use base R or a package?
> Give us a minimal but complete code example that can reproduce your
> situation.
>
> Hope this helps,
>
> Rui Barradas
>
> On 01-06-2017 11:02, Tara Adcock wrote:
> > Hi,
> >
> > I have a question regarding data importing into R.
> >
> > When I import my data into R and review the summary, some of my
> explanatory variables are being reported as if instead of being one
> variable, they are two with the same name. See below for an example;
> >
> > Behav person Behav dog   Position
> >**combination  : 38   combination  :  4** Bank:372
> >**combination  :  7   combination  :  4**   **Island  :119**
> >  fast :123   fast : 15 **Island  : 11**
> >  slow :445   slow : 95   Land:  3
> >  stat :111   stat : 14   Water   :230
> >
> > Also, all of the distances I have imported are showing up in the summary
> along with a line entitled "other". However, I haven't used any other
> distances?
> >
> > DistanceDistance.dog
> > 2-10m  :184 <50m   : 35
> > <50m   :156 2-10m  : 27
> > 10-20m :156 20-30m : 23
> > 20-30m : 91 30-40m : 16
> > 40-50m : 57 10-20m : 13
> > **(Other): 82   (Other): 18**
> >
> > I have checked my data sheet over and over again and I think
> standardised the data, but the issue keeps arising. I'm assuming I need to
> clean the data set but as a nearly complete novice in R I am not certain
> how to do this. Any help at all with this would be much appreciated. Thanks
> so much.
> >
> > Kind Regards,
> >
> > Tara Adcock.
> >
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] organizing data in a matrix avoiding loop

2017-05-26 Thread Ulrik Stervbo

Hi Mario,

does acast from the reshape2 package help?

dfa <- data.frame(iso_o = letters[c(1, 1:4)], iso_d = letters[6:10],
  year = c(1985, 1985, 1986, 1987, 1988), flow = c(1, 2, 3, 4, NA))
reshape2::acast(dfa, iso_o ~ iso_d, fun.aggregate = sum, value.var = "flow")
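
If you would rather stay in base R, tapply() can build the same kind of
matrix. The sketch below assumes that missing flows should be ignored, as
in your loop with na.rm = TRUE, and that country pairs without any record
should count as 0; both are my assumptions.

```r
dfa <- data.frame(iso_o = letters[c(1, 1:4)], iso_d = letters[6:10],
  year = c(1985, 1985, 1986, 1987, 1988), flow = c(1, 2, 3, 4, NA),
  stringsAsFactors = FALSE)

# Sum the flows for every origin/destination pair, ignoring NA
M <- tapply(dfa$flow, list(dfa$iso_o, dfa$iso_d), sum, na.rm = TRUE)
# Pairs that never occur come back as NA; treat them as no trade
M[is.na(M)] <- 0
M
```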

HTH
Ulrik

On Fri, 26 May 2017 at 13:47 A M Lavezzi  wrote:

> Dear R-Users
>
> I have data on bilateral trade flows among countries in the following form:
>
> > head(dataTrade)
>
>   iso_o iso_d year FLOW
> 1   ABW   AFG 1985   NA
> 2   ABW   AFG 1986   NA
> 3   ABW   AFG 1987   NA
> 4   ABW   AFG 1988   NA
> 5   ABW   AFG 1989   NA
> 6   ABW   AFG 1990   NA
>
> where:
> iso_o: code of country of origin
> iso_d: code of country of destination
> year: 1985:2015
> FLOW: amount of trade (values are "NA", 0s, or positive numbers)
>
> I have 215 countries. I would like to create a 215x215 matrix , say M, in
> which element M(i,j) is the total trade between countries i and j between
> 1985 and 2015 (i.e. the sum of annual amounts of trade).
>
> After collecting the country codes in a variable named "my_iso", I can
> obtain M in a straightforward way using a loop such as:
>
> for (i in my_iso){
>   for(j in my_iso)
> if(i!=j){
>   M[seq(1:length(my_iso))[my_iso==i],seq(1:length(my_iso))[my_iso==j]]
> <-
> sum(dataTrade[dataTrade$iso_o==i &
> dataTrade$iso_d==j,"FLOW"],na.rm=TRUE)
> }
> }
>
> However, it takes ages.
>
> Is there a way to avoid these loops?
>
> Thanks for your help
> Mario
>
>
> --
> Andrea Mario Lavezzi
> DiGi,Sezione Diritto e Società
> Università di Palermo
> Piazza Bologni 8
> 90134 Palermo, Italy
> tel. ++39 091 23892208 <+39%20091%202389%202208>
> fax ++39 091 6111268 <+39%20091%20611%201268>
> skype: lavezzimario
> email: mario.lavezzi (at) unipa.it
> web: http://www.unipa.it/~mario.lavezzi
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] using "dcast" function ?

2017-05-26 Thread Ulrik Stervbo

It is correct and will produce a data.frame. But I guess the result is not
what you intend, since the resulting data.frame contains nothing but NA, with
the Samples on the diagonal:

df1 <- data.frame(x = letters[1:5], y = letters[6:10])
reshape2::dcast(df1, x ~ y)

You are missing values somewhere. If you want all possible combinations of
Cancer_Gene and Sample, look at expand.grid:

expand.grid(df1)
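
If what you are after is a gene-by-sample matrix that marks which gene was
seen in which sample, table() may be enough. This is only a guess at your
intention; the data below is typed in from the four rows you showed.

```r
Y <- data.frame(
  Cancer_Gene = c("ABL2", "ABL2", "ADGRA2", "AFF4"),
  Sample = c("WT_10T", "WT_6T", "HB_8R", "EWS_13R"),
  stringsAsFactors = FALSE
)

# Counts of each gene/sample combination; 0 where a pair does not occur
counts <- table(Y$Cancer_Gene, Y$Sample)
counts
```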

HTH
Ulrik

On Fri, 26 May 2017 at 11:26 Bogdan Tanasa  wrote:

> Dear all, I would like to double-check with you please the use of "acast"
> or "dcast" function from "reshape2"package.
>
> I am starting with a data frame Y of GENES and SAMPLES,eg :
>
>   Cancer_Gene Sample
> 1ABL2  WT_10T
> 2ABL2   WT_6T
> 3  ADGRA2   HB_8R
> 4AFF4 EWS_13R
>
> and I would like to have a dataframe/matrix of CANCER_GENES * SAMPLES.
>
> Shall I do " dcast(Y, Cancer_Gene ~ Sample)", would it be correct ? thank
> you !
>
> -- bogdan
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] about combining two dataframes

2017-05-24 Thread Ulrik Stervbo

Hi Lily,

maybe you should read up on what bind_rows/bind_cols (or the base functions
rbind and cbind) do.

bind_cols and cbind will fail in this case because of the different number
of rows.

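rbind will fail because the column names are different - how can R know that
month and mon really are the same? bind_rows will not fail, but it matches
columns by name and fills the unmatched ones with NA, which is probably not
what you want.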

Depending on what you want, you should unify the column names (I have a
hunch that this is what you want), or make sure the data.frames have the
same number of rows.
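
Unifying the names could look like the sketch below. The values are my
reading of the tables in your mail (the columns ran together in the plain
text version), so treat them as placeholders.

```r
DF1 <- data.frame(year = 1981, month = 1, day = 1:4,
                  product1 = c(18, 19, 16, 19),
                  product2 = c(56, 45, 48, 50),
                  product3 = c(20, 22, 28, 21))
DF2 <- data.frame(yr = 1981, mon = 2, d = 1:3,
                  prod = c(17, 20, 21),
                  prod2 = c(49, 47, 52),
                  prod3 = c(25, 23, 27))

# Give DF2 the column names of DF1 (this relies on the same column order)
names(DF2) <- names(DF1)
combined <- rbind(DF1, DF2)

# More than two: rename each one, then bind them all
all_dfs <- list(DF1, DF2, DF2)
combined_many <- do.call(rbind, lapply(all_dfs, setNames, names(DF1)))
```

Note that this only makes sense if the columns are in the same order in
every data.frame.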

HTH
Ulrik

On Wed, 24 May 2017 at 19:30 lily li  wrote:

> Hi all,
>
> I have a question about combining two data frames. For example, there are
> the two dataframes below, with the same structure but different column
> names and column lengths. How to add the values in DF2 to the end of DF1,
> though the column names do not match? How to add more than two? Thanks.
>
> DF1
> year   month   day   product1   product2   product3
> 1981 1  1 18  5620
> 1981 1  2 19  4522
> 1981 1  3 16  4828
> 1981 1  4 19  5021
>
> DF2
> yr mon  d prodprod2   prod3
> 1981 2  1 17  4925
> 1981 2  2 20  4723
> 1981 2  3 21  5227
>
> I use the code below but it does not work.
> require(dplyr)
> bind_rows(DF1, DF2) or bind_cols(DF1, DF2)
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] problem with system.file

2017-05-22 Thread Ulrik Stervbo

What is your version of readxl?

In my version 1.0, there is no directory called estdata, but there is one
called extdata. However, in that directory there is no file called
"results.xlsx".

Either it was there once and has now gone missing, or "results.xlsx" is your
own file. It looks like the latter, in which case there is no point
in using system.file. Rather, you should use read_xlsx([path/file]).
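
The difference can be sketched like this. system.file() only looks inside
the directory of an installed package; I use the utils package here simply
because it is always installed, and the location of your own file is an
assumption on my part.

```r
# A file that ships with the 'utils' package: a full path is returned
in_pkg <- system.file("DESCRIPTION", package = "utils")
# A file that is not part of the package: an empty string is returned
not_in_pkg <- system.file("results.xlsx", package = "utils")

# Your own file is read by its path instead
my_file <- file.path(getwd(), "results.xlsx")  # hypothetical location
if (file.exists(my_file) && requireNamespace("readxl", quietly = TRUE)) {
  results <- readxl::read_xlsx(my_file)
}
```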

HTH
Ulrik

On Mon, 22 May 2017 at 14:44 Pau Marc Muñoz Torres 
wrote:

> Hello everybody
>
>  I am trying to use system.file but it returns not file found
>
> what I have done is
>
> > sample <- system.file("results.xlsx","estdata", package =
> "readxl",mustWork = TRUE)
> Error in system.file("results.xlsx", "estdata", package = "readxl",
> mustWork = TRUE) :
>   no file found
>
> i have checked the path was correct and the file exists with
>
> FILES <- file.path("results.xlsx")
> present <- file.exists(FILES)
>
> and it returned a true values. Anyone can tell me which can be the problem
> ?
>
> thanks
>
> Pau Marc Muñoz Torres
> skype: pau_marc
> http://www.linkedin.com/in/paumarc
> http://www.researchgate.net/profile/Pau_Marc_Torres3/info/
>

Re: [R] Randomly select one row by group from a matrix

2017-05-18 Thread Ulrik Stervbo

Hi Marine,

your manipulation of the matrix is quite convoluted, and it helps to expand
a bit:

test_lst <- split(test, test[,c("id")])
test_lst$`1`

After splitting, each piece is a plain vector rather than a matrix, so
nrow(x) is NULL and the sampling fails.

The reason is that a matrix is, behind the scenes, a vector with a
dimension attribute, and split() drops that attribute.

Do you really need to work with a matrix? I prefer data.frames because they
can mix different types. With a data.frame you can also use the dplyr
package, which makes things more readable:

library(dplyr)

test_df <- data.frame(xcor = rnorm(8), ycor = rnorm(8), id = c(1, 2))

grouped_test_df <- group_by(test_df, id)
sample_n(grouped_test_df, 1)
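
If the matrix has to stay a matrix, one possible base R sketch is to split the
row indices instead of the matrix itself, so the dimensions are never lost.
The names rows_by_id and picked are just illustrative; sample.int() is used
because sample() on a single number n samples from 1:n instead.

```r
test <- matrix(c(4, 4, 6, 2, 1, 2), nrow = 2, ncol = 3,
               dimnames = list(NULL, c("xcor", "ycor", "id")))

# Split the row numbers (not the matrix) by group
rows_by_id <- split(seq_len(nrow(test)), test[, "id"])

# Pick one row number per group; works for groups of size one as well
picked <- vapply(rows_by_id,
                 function(i) i[sample.int(length(i), 1)],
                 integer(1))

# drop = FALSE keeps the result a matrix even if only one row is selected
result <- test[picked, , drop = FALSE]
result
```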

HTH
Ulrik

On Thu, 18 May 2017 at 17:18 Marine Regis  wrote:

> Hello,
> I would like to randomly select one row by group from a matrix. Here is an
> example where there is one row by group. The code gives an error message:
> test <- matrix(c(4,4, 6,2, 1,2), nrow = 2, ncol = 3, dimnames = list(NULL,
> c("xcor", "ycor", "id")))
> do.call(rbind, lapply(split(test, test[,c("id")]), function(x)
> x[sample(nrow(x), 1), ]))
>  Show Traceback
>
>  Rerun with Debug
>
> Error in sample.int(length(x), size, replace, prob) :
>   invalid first argument
>
>
> How can I modify the code so that it works when there are several rows or
> one row for a given group?
> Thanks very much for your time
> Have a nice day
> Marine
>
>

Re: [R] Plotting bar charts by Month

2017-05-13 Thread Ulrik Stervbo

Does

scale_x_date(date_breaks = "1 month")

do what you want?
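
A hedged sketch of how this could look together with the facets from the
question: the data below are made up, and scales = "free_x" is my assumption
about what is wanted, namely that each month panel gets its own date range.
Weekly breaks are used here because each panel only spans one month.

```r
library(ggplot2)

# Made-up daily data for three months
set.seed(42)
df <- data.frame(date = seq(as.Date("2017-01-01"), as.Date("2017-03-31"),
                            by = "day"))
df$height <- runif(nrow(df), 1, 10)
df$month  <- factor(format(df$date, "%m"), labels = c("Jan", "Feb", "Mar"))

p <- ggplot(df, aes(x = date, y = height)) +
  geom_col() +
  # free_x lets every panel use only the dates of its own month
  facet_wrap(~month, nrow = 3, scales = "free_x") +
  scale_x_date(date_breaks = "1 week", date_labels = "%d %b")
p
```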

Ulrik

John Kane via R-help  schrieb am Sa., 13. Mai 2017,
17:12:

> Could we see some sample data?
>
>
> On Tuesday, May 9, 2017 9:55 PM, Jeff Reichman <
> reichm...@sbcglobal.net> wrote:
>
>
>  r-help
>
>
>
> Trying to figure out how to plot by month bar charts. The follow code plots
> the monthly portion on a yearly x-scale.  So I either I create 12
> individual
> month plots or maybe there is some sort of "break" to tell R separate by
> month and use the months dates as the x-scale; so that Jan's scale is 1 -
> 31
> Jan , Feb scale is 1 - 28 Feb etc.  As it is now I get the Jan values
> ploted
> with a 1-Jan to 31 Dec x-scale; Feb's value are ploted on a 1-Jan to 31 Dec
> x-scale etc.
>
>
>
> ggplot(data = df, aes(x = date, y = height)) +
>
> geom_bar(stat = "identity") +
>
> geom_bar(aes(x = action, y = height), color = "red", stat =
> "identity") +
>
> facet_wrap(~month, nrow = 3)
>
>
>
> Jeff
>
>

Re: [R] creating a color gradient in geom_ribbon

2017-05-10 Thread Ulrik Stervbo

I haven't tested it  but the first thing I'd look at is scale_fill_gradient.

HTH


Ulrik

Jim Lemon  schrieb am Do., 11. Mai 2017, 07:22:

> Hi Kristi,
> It can be done, but it is messy:
>
> pl = data.frame(Time = 0:10, menle = rnorm(11))
> pl$menlelb = pl$menle -1
> pl$menleub = pl$menle +1
> rg<-0.95
> blue<-1
> plot(pl$Time,pl$menlelb,ylim=range(c(pl$menlelb,pl$menleub)),type="l",
>  lwd=7,col=rgb(rg,rg,blue))
> lines(pl$Time,pl$menleub,lwd=7,col=rgb(rg,rg,blue))
> rg<-seq(0.9,0.3,length.out=9)
> offset<-seq(0.88,0.08,by=-0.1)
> for(i in 1:9) {
>  lines(pl$Time,pl$menle+offset[i],lwd=7,col=rgb(rg[i],rg[i],blue))
>  lines(pl$Time,pl$menle-offset[i],lwd=7,col=rgb(rg[i],rg[i],blue))
> }
> lines(pl$Time,pl$menle,lwd=6,col=rgb(0,0,blue))
>
> For the ggplot solution, this might work:
>
> ggplot(pl, aes(Time)) +
>   geom_line(aes(y=menle+1), colour=rgb(0.95,0.95,1), linewidth=7) +
>   geom_line(aes(y=menle-1), colour=rgb(0.95,0.95,1), linewidth=7) +
>   geom_line(aes(y=menle+0.88), colour=rgb(0.9,0.9,1), linewidth=7) +
>   geom_line(aes(y=menle-0.88), colour=rgb(0.9,0.9,1), linewidth=7) +
>   geom_line(aes(y=menle+0.78), colour=rgb(0.825,0.825,1), linewidth=7) +
>   geom_line(aes(y=menle-0.78), colour=rgb(0.825,0.825,1), linewidth=7) +
>   geom_line(aes(y=menle+0.68), colour=rgb(0.75,0.75,1), linewidth=7) +
>   geom_line(aes(y=menle-0.68), colour=rgb(0.75,0.75,1), linewidth=7) +
>   geom_line(aes(y=menle+0.58), colour=rgb(0.675,0.675,1), linewidth=7) +
>   geom_line(aes(y=menle-0.58), colour=rgb(0.675,0.675,1), linewidth=7) +
>   geom_line(aes(y=menle+0.48), colour=rgb(0.6,0.6,1), linewidth=7) +
>   geom_line(aes(y=menle-0.48), colour=rgb(0.6,0.6,1), linewidth=7) +
>   geom_line(aes(y=menle+0.38), colour=rgb(0.525,0.525,1), linewidth=7) +
>   geom_line(aes(y=menle-0.38), colour=rgb(0.525,0.525,1), linewidth=7) +
>   geom_line(aes(y=menle+0.28), colour=rgb(0.45,0.45,1), linewidth=7) +
>   geom_line(aes(y=menle-0.28), colour=rgb(0.45,0.45,1), linewidth=7) +
>   geom_line(aes(y=menle+0.18), colour=rgb(0.375,0.375,1), linewidth=7) +
>   geom_line(aes(y=menle-0.18), colour=rgb(0.375,0.375,1), linewidth=7) +
>   geom_line(aes(y=menle+0.08), colour=rgb(0.3,0.3,1), linewidth=7) +
>   geom_line(aes(y=menle-0.08), colour=rgb(0.3,0.3,1), linewidth=7) +
>   geom_line(aes(y=menle), colour="blue")
>
> but I can't test it.
>
> Jim
>
> On Thu, May 11, 2017 at 6:05 AM, Kristi Glover
>  wrote:
> > Hi R Users,
> >
> > I was trying to create a figure with geom_ribbon. There is a function
> "fill", but I want to make the shaded area with a gradient (increasing dark
> color towards a central line, inserted of having a color). Is there any
> possibility?
> >
> >
> > In the given example, I want the colour with "blue" but in a gradient
> (dark=central, light= as goes higher or lower)
> >
> >
> > pl = data.frame(Time = 0:10, menle = rnorm(11))
> >
> > pl$menlelb = pl$menle -1
> >
> > pl$menleub = pl$menle +1
> >
> > ggplot(pl, aes(Time)) +
> >
> >   geom_line(aes(y=menle), colour="blue") +
> >
> >   geom_ribbon(aes(ymin=menlelb, ymax=menleub), fill="blue")
> >
> >
> > Thanks
> >

Re: [R] passing arguments to simple plotting program.

2017-05-09 Thread Ulrik Stervbo

Hi Gerard,

Quotation marks are used for strings. In your function body you use the
literal strings "indata" and "fig_descrip" rather than the arguments of the
same names (the latter will run, but the title will read fig_descrip, which
is not what you want).

In your current function call R reads the unquoted Figure as a variable
name, and it is followed by a lot of other stuff that cannot be parsed.

Remove the quotation marks around indata and fig_descrip in the function
body, and call your function with the title as a quoted string:

plot_f1(indata=v5, n1=114, n2=119, n3=116,
        fig_descrip="Figure 2a\nChange in Composite Score at Visit 5 (Day 31)\nPer Protocol Population")

and you should be fine.
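
Put together, the corrected function could look roughly like the sketch below.
The helper arm_labels(), the made-up v5 data and the group sizes of 10 are my
own additions for illustration, and pdf(NULL) is only there so the sketch runs
without opening a plot window.

```r
# Build the x-axis labels from the group sizes; paste0() inserts the values.
arm_labels <- function(n1, n2, n3) {
  c(paste0("Placebo (N=", n1, ")"),
    paste0("Low Dose (N=", n2, ")"),
    paste0("High Dose (N=", n3, ")"))
}

plot_f1 <- function(indata, n1, n2, n3, fig_descrip) {
  old_par <- par(oma = c(2, 2, 2, 2))
  on.exit(par(old_par))              # restore graphics settings afterwards
  boxplot(d_comp ~ rx_grp,
          data  = indata,            # the data.frame itself, not "indata"
          main  = fig_descrip,       # the string passed in, not "fig_descrip"
          ylim  = c(-10, 5),
          names = arm_labels(n1, n2, n3),
          ylab  = "Change from Baseline")
  abline(h = 0, col = "lightgray")
  invisible(NULL)
}

# Made-up data standing in for the real v5 subset
set.seed(1)
v5 <- data.frame(
  rx_grp = factor(rep(c("placebo", "low", "high"), each = 10),
                  levels = c("placebo", "low", "high")),
  d_comp = rnorm(30, mean = -2, sd = 2)
)

pdf(NULL)  # null graphics device, nothing is written to disk
plot_f1(indata = v5, n1 = 10, n2 = 10, n3 = 10,
        fig_descrip = "Figure 2a\nChange in Composite Score at Visit 5 (Day 31)\nPer Protocol Population")
dev.off()
```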

HTH

Ulrik
Gerard Smits <smits.gerar...@gmail.com> schrieb am Di., 9. Mai 2017, 18:27:

> Hi Ulrik,
>
> If I can trouble you with one more question.
>
> Now trying to send a string to the main= .  I was able to pass the data
> name in data=in_data, but same logic is not working in passion the main
> string.
>
>
> plot_f1 <-function(indata,n1,n2,n3,fig_descrip) {
>   par(oma=c(2,2,2,2))
>   boxplot(formula = d_comp ~ rx_grp,
>   data="indata”,# <- worked fine here.
>   main="fig_descrip",
>   ylim=c(-10,5),
>   names=c(paste0("Placebo(N=", n1,  ")"),
>  paste0("Low Dose(N=", n2, ")"),
>  paste0("High Dose(N=", n3,")")),
>   ylab='Change from Baseline')
>   abline(h=c(0), col="lightgray")
> }
>
> plot_f1(indata=v5, n1=114, n2=119, n3=116, fig_descrip=Figure 2a\nChange
> in Composite Score at Visit 5 (Day 31)\nPer Protocol Population)
>
> Error Message: Error: unexpected numeric constant in "plot_f1(indata=v5,
> n1=114, n2=119, n3=116, fig_descrip=Figure 2”
>
> Even this call gives the same error:  plot_f1(indata=v5, n1=114, n2=119,
> n3=116, fig_descrip=Figure)
>
>
> Thanks,
>
> Gerard
>
>
>
>
>
>
> On May 8, 2017, at 11:40 PM, Ulrik Stervbo <ulrik.ster...@gmail.com>
> wrote:
>
> HI Gerard,
>
> You get the literals because the variables are not implicitly expanded -
> 'Placebo(N=n1)  ' is just a string indicating the N = n1.
>
> What you want is to use paste() or paste0():
> c(paste0("Placebo(N=", n1, ")"), paste0("Low Dose (N=", n2, ")"),
> paste0("High Dose (N=", n3, ")"))
> should do it.
>
> I was taught a long ago that attach() should be avoided to avoid name
> conflicts. Also, it makes it difficult to figure out which data is actually
> being used.
>
> HTH
> Ulrik
>
> On Tue, 9 May 2017 at 06:44 Gerard Smits <smits.gerar...@gmail.com> wrote:
>
>> Hi All,
>>
>> I thought I’d try to get a function working instead of block copying code
>> and editing. My backorund is more SAS, so using a SAS Macro would be easy,
>> but not so lucky with R functions.
>>
>>
>> R being used on Mac Sierra 10.12.4:
>>
>> R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
>> Copyright (C) 2016 The R Foundation for Statistical Computing
>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
>>
>>
>> resp<-read.csv("//users//gerard//gs//r_work//xyz.csv", header = TRUE)
>>
>> v5  <-subset(resp, subset=visit==5 & pp==1)
>>
>> plot_f1 <-function(n1,n2,n3) {
>>   attach(v8)
>>   par(oma=c(2,2,2,2))
>>   boxplot(formula = d_comp ~ rx_grp,
>>   main="Figure 2\nChange in Composite Score at Visit 5 (Day
>> 31)\nPer Protocol Population",
>>   ylim=c(-10,5),
>>   names=c('Placebo(N=n1)  ',
>>   'Low Dose(N=n2) ',
>>   'High Dose(N=n3)'),
>>   ylab='Change from Baseline')
>>   abline(h=c(0), col="lightgray")
>> }
>>
>> plot_f1(n1=114, n2=119, n3=116)
>>
>> The above is a simplified example where I am trying to pass 3 arguments,
>> n1-n3, to be shown in the x-axis tables,  Instead of the numbers, I get the
>> literal n1, n2, n3.
>>
>> Any help appreciated.
>>
>> Thanks,
>>
>> Gerard
>>
>>
>>
>>

Re: [R] Joining tables with different order and matched values

2017-05-09 Thread Ulrik Stervbo

Hi Abo,

Please keep the list in cc.

I think the documentation for merge() is pretty straightforward: two
data.frames are required, and if you wish to keep rows that are not
present in both data.frames, you set all = TRUE (or all.x / all.y to keep
unmatched rows from only one side). You also have the option to specify
which columns to join by with the by argument.
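
For what it is worth, here is a small sketch of such a join on data shaped like
the drug tables in the question. The data.frame and column names (indications,
wanted, drug, indication) are made up, and the drug names are spelled
consistently on purpose because merge() matches them exactly.

```r
indications <- data.frame(
  drug       = c("Ibuprofen", "Simvastatin", "Losartan"),
  indication = c("pain", "hyperlipidemia", "hypertension"),
  stringsAsFactors = FALSE
)
wanted <- data.frame(
  drug = c("Simvastatin", "Losartan", "Ibuprofen", "Metformin"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of the first table (a left join);
# drugs without a match get NA as indication.
joined <- merge(wanted, indications, by = "drug", all.x = TRUE)

# merge() does not promise to keep the row order, so restore it explicitly
joined <- joined[match(wanted$drug, joined$drug), ]
rownames(joined) <- NULL
joined
```

With only one column to look up, match() alone would also do:
wanted$indication <- indications$indication[match(wanted$drug, indications$drug)].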

If you need more assistance with joining two data.frames, you should
provide a reproducible example, and if you have trouble with a function you
should provide an example of what you have tried so far.

Best wishes,
Ulrik



On Tue, 9 May 2017 at 10:00 abo dalash <abo_d...@hotmail.com> wrote:

> Could you please teach me about the correct formation of the syntax?. I
> have read the help page and other online resources about inner,left,
> join but wasn't able to formulate the correct syntax.
>
>
> Sent from my Samsung device
>
>
>  Original message 
> From: Ulrik Stervbo <ulrik.ster...@gmail.com>
> Date: 09/05/2017 7:42 a.m. (GMT+00:00)
> To: abo dalash <abo_d...@hotmail.com>, "r-help@R-project.org" <
> r-help@r-project.org>
> Subject: Re: [R] Joining tables with different order and matched values
>
> Hi Abo,
>
> ?merge
>
> or the join functions from dplyr.
>
> HTH
> Ulrik
>
> On Tue, 9 May 2017 at 06:44 abo dalash <abo_d...@hotmail.com> wrote:
>
>> Hi All ..,
>>
>>
>> I have 2 tables and I'm trying to have some information from the 1st
>> table to appear in the second table with different order.
>>
>>
>> For Example, let's say this is my 1st table :-
>>
>>
>>
>> Drug name     indications
>> Ibuprofen     Pain
>> Simvastatin   hyperlipidemia
>> losartan      hypertension
>>
>>
>>
>> my 2nd table is in different order for the 1st column :-
>>
>>
>> Drug name   indications
>>
>>
>> Simvastatin
>>
>> losartan
>>
>> Ibuprofen
>>
>> Metformin
>>
>>
>> I wish to see the indication of each drug in my 2nd table subsisted from
>> the information in my 1st table so the final table
>>
>> would be like this
>>
>>
>> Drug name     indications
>> Simvastatin   hyperlipidemia
>> losartan      hypertension
>> Ibuprofen     pain
>> Metformin     N/A
>>
>>
>> I have been trying to use Sqldf package and right join function but not
>> able to formulate the correct syntax.
>>
>>
>> I'm also trying to identify rows contain at least one shared value  in a
>> dataset called 'Values":
>>
>>
>> >Values
>>
>> A B
>>
>> 1,2,5   3,8,7
>>
>> 2,4,6   7,6,3
>>
>>
>>
>> Columns A & B in the first row do not share any value while in the 2nd
>> row they have a single shared value which is 6.
>>
>> The result I wish to see :-
>>
>>
>> A B shared values
>>
>> 1,2,5   3,8,7 N/A
>>
>> 2,4,6   7,6,3   6
>>
>>
>> I tried this syntax : SharedValues <- Values$A == Values$B but this
>> returns logical results and what I wish to have
>>
>> is a new data frame including the new vector "shared values" showing the
>> information exactly as above.
>>
>>
>>
>>
>> Kind Regards
>>
>>
>>
>>
>>
>>

Re: [R] Joining tables with different order and matched values

2017-05-09 Thread Ulrik Stervbo

Hi Abo,

?merge

or the join functions from dplyr.

HTH
Ulrik

On Tue, 9 May 2017 at 06:44 abo dalash  wrote:

> Hi All ..,
>
>
> I have 2 tables and I'm trying to have some information from the 1st table
> to appear in the second table with different order.
>
>
> For Example, let's say this is my 1st table :-
>
>
>
> Drug name     indications
> Ibuprofen     Pain
> Simvastatin   hyperlipidemia
> losartan      hypertension
>
>
>
> my 2nd table is in different order for the 1st column :-
>
>
> Drug name   indications
>
>
> Simvastatin
>
> losartan
>
> Ibuprofen
>
> Metformin
>
>
> I wish to see the indication of each drug in my 2nd table subsisted from
> the information in my 1st table so the final table
>
> would be like this
>
>
> Drug name     indications
> Simvastatin   hyperlipidemia
> losartan      hypertension
> Ibuprofen     pain
> Metformin     N/A
>
>
> I have been trying to use Sqldf package and right join function but not
> able to formulate the correct syntax.
>
>
> I'm also trying to identify rows contain at least one shared value  in a
> dataset called 'Values":
>
>
> >Values
>
> A B
>
> 1,2,5   3,8,7
>
> 2,4,6   7,6,3
>
>
>
> Columns A & B in the first row do not share any value while in the 2nd row
> they have a single shared value which is 6.
>
> The result I wish to see :-
>
>
> A B shared values
>
> 1,2,5   3,8,7 N/A
>
> 2,4,6   7,6,3   6
>
>
> I tried this syntax : SharedValues <- Values$A == Values$B but this
> returns logical results and what I wish to have
>
> is a new data frame including the new vector "shared values" showing the
> information exactly as above.
>
>
>
>
> Kind Regards
>
>
>
>
>
>

Re: [R] passing arguments to simple plotting program.

2017-05-09 Thread Ulrik Stervbo

Hi Gerard,

You get the literals because variables are not expanded inside strings -
'Placebo(N=n1)  ' is just a string containing the characters n1.

What you want is to use paste() or paste0():
c(paste0("Placebo(N=", n1, ")"), paste0("Low Dose (N=", n2, ")"),
paste0("High Dose (N=", n3, ")"))
should do it.

I was taught long ago that attach() should be avoided to prevent name
conflicts. Also, it makes it difficult to figure out which data is actually
being used.

HTH
Ulrik

On Tue, 9 May 2017 at 06:44 Gerard Smits  wrote:

> Hi All,
>
> I thought I’d try to get a function working instead of block copying code
> and editing. My backorund is more SAS, so using a SAS Macro would be easy,
> but not so lucky with R functions.
>
>
> R being used on Mac Sierra 10.12.4:
>
> R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
> Copyright (C) 2016 The R Foundation for Statistical Computing
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
>
>
> resp<-read.csv("//users//gerard//gs//r_work//xyz.csv", header = TRUE)
>
> v5  <-subset(resp, subset=visit==5 & pp==1)
>
> plot_f1 <-function(n1,n2,n3) {
>   attach(v8)
>   par(oma=c(2,2,2,2))
>   boxplot(formula = d_comp ~ rx_grp,
>   main="Figure 2\nChange in Composite Score at Visit 5 (Day
> 31)\nPer Protocol Population",
>   ylim=c(-10,5),
>   names=c('Placebo(N=n1)  ',
>   'Low Dose(N=n2) ',
>   'High Dose(N=n3)'),
>   ylab='Change from Baseline')
>   abline(h=c(0), col="lightgray")
> }
>
> plot_f1(n1=114, n2=119, n3=116)
>
> The above is a simplified example where I am trying to pass 3 arguments,
> n1-n3, to be shown in the x-axis tables,  Instead of the numbers, I get the
> literal n1, n2, n3.
>
> Any help appreciated.
>
> Thanks,
>
> Gerard
>
>
>
>

Re: [R] Formatting column displays

2017-05-05 Thread Ulrik Stervbo

Hi Bruce,

while working with data I would not touch the formatting of the columns. If
knowing the units is important, you can add them to the column names rather
than to the values in the columns.

For presentation purposes - where everything is turned into strings - it is
a different story. Once you are done with your calculations, you can format
everything the way you like it. I find kable() from the knitr package to
help a lot with basic formatting, though you might have to do a little
pre-processing by hand (and here sprintf comes in handy).

And to echo Jeff: you should use R Markdown if you are writing up reports;
it is well integrated with RStudio. knitr works behind the scenes, and
I believe it can even create a Microsoft Word document through pandoc,
without the need for ReporteRs (but I have never tried, so I might be wrong).

Best,
Ulrik

On Fri, 5 May 2017 at 15:26 Bruce Ratner PhD  wrote:

> Jeff: Thanks for reply. I will follow your lead.
> Thanks.
> Bruce
>
> __
>
>
>
> > On May 5, 2017, at 9:15 AM, Jeff Newmiller 
> wrote:
> >
> > Data frames are primarily data storage objects, not data display
> objects. You can create a separate version of your data frame with
> formatted text strings, but what you usually really want is to handle
> column alignment as well and that really has to be addressed as part of
> your data output process, which you have said nothing about.
> >
> > Do you know about HTML or markdown or LaTeX? These are useful formats
> for creating reproducible research, and they are well supported through the
> knitr package and in RStudio via Rnw and Rmd files. Tables in particular
> are well supported via LaTeX with the tables package.  The ReporteR package
> can output to Microsoft Word files directly with various formatting
> options, but it doesn't play well with the other tools.
> > --
> > Sent from my phone. Please excuse my brevity.
> >
> >> On May 5, 2017 5:08:19 AM PDT, Bruce Ratner PhD  wrote:
> >> R-helpers:
> >> I need some references for formatting the display of my data frame
> >> columns.
> >> Any guidance will be appreciated. Bruce
> >> ~~
> >> I have a date frame with one column as an integer for which I want a
> >> comma display,
> >> one column consisting of dollar amounts, one column for which I want a
> >> display to two digits after the decimal point, and one column as
> >> integers ranging between
> >> 100 - 999.
> >>

Re: [R] Formatting column displays

2017-05-05 Thread Ulrik Stervbo

Hi Bruce,

display as in the console or as a table for presentation?

For the latter, look at sprintf, and at format for the comma display
(sprintf has no flag for a thousands separator):

format(1234567, big.mark = ",")
sprintf("%.2f", 2.5678)
sprintf("$%.3f", 2.5678)
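
Applied to a whole data.frame this might look like the sketch below. The
column names and values are made up to match the four kinds of columns in the
question; the original numbers stay in df and only the copy for display is
turned into strings.

```r
df <- data.frame(
  count  = c(1234567L, 89012L),   # integers to show with commas
  amount = c(1234.5, 99),         # dollar amounts
  ratio  = c(0.12345, 2.5),       # two digits after the decimal point
  code   = c(101L, 999L)          # integers between 100 and 999
)

# Separate, character-only copy for presentation
shown <- data.frame(
  count  = format(df$count, big.mark = ",", trim = TRUE),
  amount = paste0("$", formatC(df$amount, format = "f", digits = 2,
                               big.mark = ",")),
  ratio  = sprintf("%.2f", df$ratio),
  code   = sprintf("%d", df$code),
  stringsAsFactors = FALSE
)
shown
```

The shown data.frame can then be passed on to something like knitr::kable()
for the final table.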

HTH
Ulrik

On Fri, 5 May 2017 at 14:08 Bruce Ratner PhD  wrote:

> R-helpers:
> I need some references for formatting the display of my data frame columns.
> Any guidance will be appreciated. Bruce
> ~~
> I have a date frame with one column as an integer for which I want a comma
> display,
> one column consisting of dollar amounts, one column for which I want a
> display to two digits after the decimal point, and one column as integers
> ranging between
> 100 - 999.
>

Re: [R] Multiple Histograms in R

2017-04-20 Thread Ulrik Stervbo

Hi Prateek,

maybe facet_* with ggplot is what you are looking for

HTH
Ulrik

On Thu, 20 Apr 2017 at 13:24 prateek pande  wrote:

> HI Hasan,
>
> Thanks for sharing the solution. Really appreciate it.
>
> But i was reading somewhere that we cannot use par with ggplot 2 . we can
> only use grid extra to have multiple plots in a single view.
>
> Is it right?
>
> Regards
> Prateek
>
> On Thu, Apr 20, 2017 at 10:47 AM, Hasan Diwan 
> wrote:
>
> > Prateek,
> > I'm shocked this isn't answered previously, but you can try the par
> > command (mfrow and mfcol parameters): par(mfrow = c(n, m)) lays out
> > subsequent plots in n rows and m columns, filled row by row, while
> > mfcol fills column by column. For subsequent questions,
> > please do a search through the archives before asking. -- H
> >
> > On 19 April 2017 at 06:05, prateek pande  wrote:
> >
> >> Hi,
> >>
> >> I have a data as mentioned below(at the bottom)
> >>
> >> Now out of that data i have to create multiple histograms in a single
> view
> >> in  R. On that histogram i need on x -axis binned data with Breaks 10
> and
> >> on y axis event rate . Here churn is dependent variable.
> >>
> >>
> >> *for example, for mou_mean , on x -axis on histogram i need
> Bins(mou_mean)
> >> and on y - axis in need Churn%age. *
> >> *Bins(mou_mean)*   *Churn %age*
> >> 23-43             0.23%
> >> 33-53             0.5%
> >> 43-63             0.3%
> >> 53-73             0.4%
> >> 63-83             0.7%
> >> 83-103            0.8%
> >>
> >> Please help
> >>
> >>
> >> mou_mean  totalmrc_mean  rev_range  mou_range  Churn
> >> 23        24             25         27         1
> >> 45        46             47         49         1
> >> 43        44             45         47         1
> >> 45        46             47         49         0
> >> 56        57             58         60         0
> >> 67        68             69         71         1
> >> 67        68             69         71         0
> >> 44        45             46         48         1
> >> 33        34             35         37         0
> >> 90        91             92         94         1
> >> 87        88             89         91         1
> >> 76        77             78         80         1
> >> 33        34             35         37         1
> >> 44        45             46         48         1
> >>
> >
> >
> >
> > --
> > OpenPGP: https://sks-keyservers.net/pks/lookup?op=
> > get=0xFEBAD7FFD041BBA1
> > If you wish to request my time, please do so using http://bit.ly/
> > hd1ScheduleRequest.
> > Si vous voudrais faire connnaisance, allez a http://bit.ly/
> > hd1ScheduleRequest.
> >
> >  >Sent
> > from my mobile device
> > Envoye de mon portable
> >
>

Re: [R] problems in vectors of dates_times

2017-04-07 Thread Ulrik Stervbo

Hi Troels,

I get no error. I think we need more information to be of any help.

Best wishes,
Ulrik

On Fri, 7 Apr 2017 at 08:17 Troels Ring  wrote:

> Dear friends - I have further problems  handling dates_times, as
> demonstrated below where concatenating two formatted vectors of
> date_times results in errors.
> I wonder why this happens and what was wrong in trying to take these two
> vectors together
> All best wishes
> Troels Ring
> Aalborg, Denmark
> Windows
> R version 3.3.2 (2016-10-31)
>
>
> A <- structure(c(1364450400, 1364450400, 1364536800, 1364623200,
> 1364709600,
> 1364796000, 1364882400, 1364968800, 1365055200, 1365141600, 1365228000,
> 1365314400, 1365400800), class = c("POSIXct", "POSIXt"), tzone = "UTC")
> A
> B <- structure(c(1365141600, 1365228000, 1365314400, 1365400800,
> 1365487200,
> 1365573600, 1365660000, 1365746400, 1365832800, 1365919200, 1366005600,
> 1366092000), class = c("POSIXct", "POSIXt"), tzone = "UTC")
> B
> C <- c(A,B)
> C
>

Re: [R] readr to generate tibble from a character matrix

2017-04-06 Thread Ulrik Stervbo

Hi Ben,

type.convert should do the trick:

m %>%
  as_tibble() %>%
  lapply(type.convert, as.is = TRUE) %>%
  as_tibble()

I am not too happy about the double 'as_tibble', but it gets the job done.
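
The same idea in base R only, as a sketch that avoids the pipe and the double
conversion (df is just an illustrative name). as.is = TRUE keeps the letter
columns as character instead of turning them into factors.

```r
m <- matrix(c(letters[1:12], 1:4, (11:14 + 0.2)), ncol = 5)
colnames(m) <- LETTERS[1:5]

# One character column per matrix column
df <- as.data.frame(m, stringsAsFactors = FALSE)

# Guess a type for every column; df[] keeps the data.frame structure
df[] <- lapply(df, type.convert, as.is = TRUE)

sapply(df, class)
```

The result can be handed to tibble::as_tibble() once at the end if a tibble is
needed.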

HTH
Ulrik

On Thu, 6 Apr 2017 at 16:41 Ben Tupper  wrote:

> Hello,
>
> I have a workflow yields a character matrix that I convert to a tibble.
> Here is a simple example.
>
> library(tibble)
> library(readr)
>
> m <- matrix(c(letters[1:12], 1:4, (11:14 + 0.2)), ncol = 5)
> colnames(m) <- LETTERS[1:5]
>
> x <- as_tibble(m)
>
> # # A tibble: 4 × 5
> #   A B C D E
> #   
> # 1 a e i 1  11.2
> # 2 b f j 2  12.2
> # 3 c g k 3  13.2
> # 4 d h l 4  14.2
>
> The workflow output columns can be a mix of a known set column outputs.
> Some of the columns really should be converted to non-character types
> before I proceed.  Right now I explictly set the column classes with
> something like this...
>
> mode(x[['D']]) <- 'integer'
> mode(x[['E']]) <- 'numeric'
>
> # # A tibble: 4 × 5
> #   A B C D E
> #   
> # 1 a e i 1  11.2
> # 2 b f j 2  12.2
> # 3 c g k 3  13.2
> # 4 d h l 4  14.2
>
>
> I wonder if there is a way to use the read_* functions in the readr
> package to read the character matrix into a tibble directly which would
> leverage readr's excellent column class guessing. I can see in the vignette
> ( https://cran.r-project.org/web/packages/readr/vignettes/readr.html )
> that I'm not too far off in thinking this could be done (step 1
> tantalizingly says 'The flat file is parsed into a rectangular matrix of
> strings.')
>
> I know that I could either write the matrix to a file or paste it all into
> a character vector and then use read_* functions, but I confess I am
> looking for a straighter path by simply passing the matrix to a function
> like readr::read_matrix() or the like.
>
> Thanks!
> Ben
>
> Ben Tupper
> Bigelow Laboratory for Ocean Sciences
> 60 Bigelow Drive, P.O. Box 380
> East Boothbay, Maine 04544
> http://www.bigelow.org

Re: [R] Using R and Python together

2017-03-31 Thread Ulrik Stervbo

'Snakemake' (https://snakemake.readthedocs.io/en/stable/) was created to
ease pipelines through different tools so it might be useful.

In all honesty I only know of Snakemake, so it might be the completely
wrong horse.

HTH
Ulrik

On Fri, 31 Mar 2017 at 06:01 Wensui Liu  wrote:

> How about pyper?
>
> On Thu, Mar 30, 2017 at 10:42 PM Kankana Shukla 
> wrote:
>
> > Hello,
> >
> > I am running a deep neural network in Python.  The input to the NN is the
> > output from my R code. I am currently running the python script and calling
> > the R code using a subprocess call, but this does not allow me to
> > recursively change (increment) parameters used in the R code that would be
> > the inputs to the python code.  So in short, I would like to follow this
> > automated process:
> >
> >1. Parameters used in R code generate output
> >2. This output is input to Python code
> >3. If output of Python code > x,  stop
> >4. Else, increment parameters used as input in R code (step 1) and
> >repeat all steps
> >
> > I have searched for examples using R and Python together, and rpy2 seems
> > like the way to go, but is there another (easier) way to do it?  I would
> > highly appreciate the help.
> >
> > Thanks in advance,
> >
> > Kankana

Re: [R] Antwort: Re: Way to Plot Multiple Variables and Change Color

2017-03-28 Thread Ulrik Stervbo

Hi Georg,

you were on the right path - it is all about scale_fill*

The 'problem', as you've discovered, is that value is continuous, whereas
scale_fill_manual and the other discrete scales (unlike scale_fill_gradient)
expect discrete values.

The solution is simply to make the fill variable discrete by using factor():

ggplot(
  d_result,
  aes(variable, y = n, fill = factor(value))) +
  geom_bar(stat = "identity") +
  scale_fill_manual(values = RColorBrewer::brewer.pal(4, "Blues"))

or:

ggplot(
  d_result,
  aes(variable, y = n, fill = factor(value))) +
  geom_bar(stat = "identity") +
  scale_fill_manual(values = c("red", "blue", "green", "purple"))

When using ColorBrewer palettes (which I highly recommend), I use
scale_*_brewer rather than setting the colours manually:

ggplot(
  d_result,
  aes(variable, y = n, fill = factor(value))) +
  geom_bar(stat = "identity") +
  scale_fill_brewer(palette = "Blues")
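
To make the continuous-versus-discrete point concrete, here is a small base R sketch. The ratings and the colour choices are made up for illustration; the named vector `pal` is a hypothetical helper, not something from Georg's code.

```r
# Example ratings on a 1..4 scale (made-up values for illustration)
value <- c(3, 1, 1, 4, 3, 4, 2)

# As a numeric vector this is continuous; factor() makes it discrete
# with the levels "1", "2", "3", "4"
value_f <- factor(value, levels = 1:4)
levels(value_f)

# One colour per level, named by level so the mapping is explicit
pal <- setNames(c("red", "blue", "green", "purple"), levels(value_f))
pal

# With ggplot2 loaded, the named vector could then be supplied as
#   scale_fill_manual(values = pal)
```

Naming the colours by level keeps the mapping stable even if one of the levels does not occur in the data.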

Best,
Ulrik


On Tue, 28 Mar 2017 at 18:21  wrote:

> Hi Richard,
>
> many thanks for your reply.
>
> Your solution is not exactly what I was looking for. I would like to know
> how I can change the colors of the stacked bars in my plot and not use the
> default values. How can this be done?
>
> Kind regards
>
> Georg
>
>
>
>
> From:    "Richard M. Heiberger"
> To:      g.maub...@weinwolf.de,
> Cc:      r-help
> Date:    28.03.2017 17:40
> Subject: Re: [R] Way to Plot Multiple Variables and Change Color
>
>
>
> I think you are looking for the likert function in the HH package.
> From ?likert
>
>
> Diverging stacked barcharts for Likert, semantic differential, rating
> scale data, and population pyramids.
>
>
> This will get you started.  Much more fine control is available.  See
> the examples and demo.
>
> ## install.packages("HH") ## if not yet on your system.
>
> library(HH)
>
> AA <- dfr[,-9]
>
> labels <- sort(unique(as.vector(data.matrix(AA))))
> result.template <- integer(length(labels))
> names(result.template) <- labels
>
> BB <- apply(AA, 2, function(x, result=result.template) {
>   tx <- table(x)
>   result[names(tx)] <- tx
>   result
> }
> )
>
> BB
>
> likert(t(BB), ReferenceZero=0, horizontal=FALSE)
>
>
> On Tue, Mar 28, 2017 at 6:05 AM,   wrote:
> > Hi All,
> >
> > in my current project I have to plot a whole bunch of related variables
> > (item batteries, e.g. How do you rate ... a) Acceleration, b) Horse Power,
> > c) Color Palette, etc.) which are all rated on a scale from 1 .. 4.
> >
> > I need to present the results as stacked bar charts where the variables
> > are columns and the percentages of the scales values (1 .. 4) are the
> > chunks of the stacked bar for each variable. To do this I have transformed
> > my data from wide to long and calculated the percentage for each variable
> > and value. The code for this is as follows:
> >
> > -- cut --
> >
> > dfr <- structure(
> >   list(
> > v07_01 = c(3, 1, 1, 4, 3, 4, 4, 1, 3, 2, 2, 3,
> >4, 4, 4, 1, 1, 3, 3, 4),
> > v07_02 = c(1, 2, 1, 1, 2, 1, 4, 1, 1,
> >4, 4, 1, 4, 4, 1, 3, 2, 3, 3, 1),
> > v07_03 = c(3, 2, 2, 1, 4, 1,
> >2, 3, 3, 1, 4, 2, 3, 1, 4, 1, 4, 2, 2, 3),
> > v07_04 = c(3, 1, 1,
> >4, 2, 4, 4, 2, 2, 2, 4, 1, 2, 1, 3, 1, 2, 4, 1, 4),
> > v07_05 = c(1,
> >2, 2, 2, 4, 4, 1, 1, 4, 4, 2, 1, 2, 1, 4, 1, 2, 4, 1, 4),
> > v07_06 = c(1,
> >2, 1, 2, 1, 1, 3, 4, 3, 2, 2, 3, 3, 2, 4, 2, 3, 1, 4, 3),
> > v07_07 = c(3,
> >2, 3, 3, 1, 1, 3, 3, 4, 4, 1, 3, 1, 3, 2, 4, 1, 2, 3, 4),
> > v07_08 = c(3,
> >2, 1, 2, 2, 2, 3, 3, 4, 4, 1, 1, 1, 2, 3, 1, 4, 2, 2, 4),
> > cased_id = structure(
> >   1:20,
> >   .Label = c(
> > "1",
> > "2",
> > "3",
> > "4",
> > "5",
> > "6",
> > "7",
> > "8",
> > "9",
> > "10",
> > "11",
> > "12",
> > "13",
> > "14",
> > "15",
> > "16",
> > "17",
> > "18",
> > "19",
> > "20"
> >   ),
> >   class = "factor"
> > )
> >   ),
> >   .Names = c(
> > "v07_01",
> > "v07_02",
> > "v07_03",
> > "v07_04",
> > "v07_05",
> > "v07_06",
> > "v07_07",
> > "v07_08",
> > "cased_id"
> >   ),
> >   row.names = c(NA, -20L),
> >   class = c("tbl_df", "tbl",
> > "data.frame")
> > )
> >
> > mdf <- melt(df)
> > d_result <- mdf  %>%
> >   dplyr::group_by(variable) %>%
> >   count(value)
> >
> > ggplot(
> >   d_result,
> >   aes(variable, y = n, fill = value)) +
> >   geom_bar(stat = "identity") +
> >   coord_cartesian(ylim = c(0,100))
> >
> > -- cut --
> >
> > Is there an easier way of doing this, i. e. a way without need to
> > transform the data?
> >
> > How can I change the colors for the data points 1 .. 4?
> >
> > I tried
> >
> > -- cut --
> >
> >   d_result,
> >   aes(variable, y = n,

Re: [R] Way to Plot Multiple Variables and Change Color

2017-03-28 Thread Ulrik Stervbo

Hi Georg,

I am a little unsure of what you want to do, but maybe this:

mdf <- melt(dfr)
d_result <- mdf  %>%
  dplyr::group_by(variable, value) %>%
  summarise(n = n())

ggplot(
  d_result,
  aes(variable, y = n, fill = value)) +
  geom_bar(stat = "identity")
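
If the aim is mainly to get the counts per variable and rating without melt() and dplyr, a base R sketch could look like the following. It uses a cut-down, made-up version of dfr with two variables only; the column names `value`, `variable` and `n` are chosen here to mirror d_result.

```r
# Cut-down stand-in for dfr (two variables, six respondents)
dfr_small <- data.frame(
  v07_01 = c(3, 1, 1, 4, 3, 4),
  v07_02 = c(1, 2, 1, 1, 2, 1)
)

# Count each rating 1..4 per variable; fixing the levels keeps
# ratings that never occur as explicit zero counts
counts <- sapply(dfr_small, function(x) table(factor(x, levels = 1:4)))
counts

# Long format (one row per rating and variable), similar to d_result
long <- as.data.frame(as.table(counts), responseName = "n")
names(long)[1:2] <- c("value", "variable")
long
```

Here `value` comes out as a factor, so it is already discrete when used as a fill variable.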

HTH
Ulrik

On Tue, 28 Mar 2017 at 15:11  wrote:

> Hi All,
>
> in my current project I have to plot a whole bunch of related variables
> (item batteries, e.g. How do you rate ... a) Accelaration, b) Horse Power,
> c) Color Palette, etc.) which are all rated on a scale from 1 .. 4.
>
> I need to present the results as stacked bar charts where the variables
> are columns and the percentages of the scales values (1 .. 4) are the
> chunks of the stacked bar for each variable. To do this I have transformed
> my data from wide to long and calculated the percentage for each variable
> and value. The code for this is as follows:
>
> -- cut --
>
> dfr <- structure(
>   list(
> v07_01 = c(3, 1, 1, 4, 3, 4, 4, 1, 3, 2, 2, 3,
>4, 4, 4, 1, 1, 3, 3, 4),
> v07_02 = c(1, 2, 1, 1, 2, 1, 4, 1, 1,
>4, 4, 1, 4, 4, 1, 3, 2, 3, 3, 1),
> v07_03 = c(3, 2, 2, 1, 4, 1,
>2, 3, 3, 1, 4, 2, 3, 1, 4, 1, 4, 2, 2, 3),
> v07_04 = c(3, 1, 1,
>4, 2, 4, 4, 2, 2, 2, 4, 1, 2, 1, 3, 1, 2, 4, 1, 4),
> v07_05 = c(1,
>2, 2, 2, 4, 4, 1, 1, 4, 4, 2, 1, 2, 1, 4, 1, 2, 4, 1, 4),
> v07_06 = c(1,
>2, 1, 2, 1, 1, 3, 4, 3, 2, 2, 3, 3, 2, 4, 2, 3, 1, 4, 3),
> v07_07 = c(3,
>2, 3, 3, 1, 1, 3, 3, 4, 4, 1, 3, 1, 3, 2, 4, 1, 2, 3, 4),
> v07_08 = c(3,
>2, 1, 2, 2, 2, 3, 3, 4, 4, 1, 1, 1, 2, 3, 1, 4, 2, 2, 4),
> cased_id = structure(
>   1:20,
>   .Label = c(
> "1",
> "2",
> "3",
> "4",
> "5",
> "6",
> "7",
> "8",
> "9",
> "10",
> "11",
> "12",
> "13",
> "14",
> "15",
> "16",
> "17",
> "18",
> "19",
> "20"
>   ),
>   class = "factor"
> )
>   ),
>   .Names = c(
> "v07_01",
> "v07_02",
> "v07_03",
> "v07_04",
> "v07_05",
> "v07_06",
> "v07_07",
> "v07_08",
> "cased_id"
>   ),
>   row.names = c(NA, -20L),
>   class = c("tbl_df", "tbl",
> "data.frame")
> )
>
> mdf <- melt(df)
> d_result <- mdf  %>%
>   dplyr::group_by(variable) %>%
>   count(value)
>
> ggplot(
>   d_result,
>   aes(variable, y = n, fill = value)) +
>   geom_bar(stat = "identity") +
>   coord_cartesian(ylim = c(0,100))
>
> -- cut --
>
> Is there an easier way of doing this, i. e. a way without need to
> transform the data?
>
> How can I change the colors for the data points 1 .. 4?
>
> I tried
>
> -- cut --
>
>   d_result,
>   aes(variable, y = n, fill = value)) +
>   geom_bar(stat = "identity") +
>   coord_cartesian(ylim = c(0,100)) +
>   scale_fill_manual(values = RColorBrewer::brewer.pal(4, "Blues"))
>
> -- cut -
>
> but this does not work cause I am mixing continuous and descrete values.
>
> How can I change the colors for the bars?
>
> Kind regards
>
> Georg

Re: [R] Looping Through DataFrames with Differing Lenghts

2017-03-28 Thread Ulrik Stervbo

Hi Paul,

does this do what you want?

exdf1 <- data.frame(Date = c("1985-10-01", "1985-11-01", "1985-12-01",
"1986-01-01"), Transits = c(NA, NA, NA, NA))
exdf2 <- data.frame(Date = c("1985-10-01", "1986-01-01"), Transits = c(15,
20))

tmpdf <- subset(exdf1, !Date %in% exdf2$Date)

rbind(exdf2, tmpdf)
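
Note that rbind() appends the missing dates at the end. If the result should be in date order, a possible follow-up is sketched below; it repeats the example data so that it runs on its own, and `stringsAsFactors = FALSE` is added here so that the dates are plain character strings.

```r
exdf1 <- data.frame(
  Date = c("1985-10-01", "1985-11-01", "1985-12-01", "1986-01-01"),
  Transits = c(NA, NA, NA, NA),
  stringsAsFactors = FALSE
)
exdf2 <- data.frame(
  Date = c("1985-10-01", "1986-01-01"),
  Transits = c(15, 20),
  stringsAsFactors = FALSE
)

# Rows of exdf1 whose date is not present in exdf2
tmpdf <- subset(exdf1, !Date %in% exdf2$Date)

# Combine, then sort chronologically
res <- rbind(exdf2, tmpdf)
res <- res[order(as.Date(res$Date)), ]
rownames(res) <- NULL
res
```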

HTH,
Ulrik

On Tue, 28 Mar 2017 at 10:50 Paul Bernal  wrote:

Dear friend Mark,

Great suggestion! Thank you for replying.

I have two dataframes, dataframe1 and dataframe2.

dataframe1 has two columns, one with the dates in YYYY-MM-DD format and the
other column with number of transits (all of which were set to NA values).
dataframe1 starts in 1985-10-01 (October 1st 1985) and ends in 2017-03-01
(March 1st 2017).

dataframe2 has the same two columns, one with the dates in YYYY-MM-DD
format, and the other column with number of transits. dataframe2 has
the same start and end dates; however, dataframe2 has missing dates
between the start and end dates, so it has fewer observations.

dataframe1 has a total of 378 observations and dataframe2 has a  total of
362 observations.

I would like to come up with a code that could do the following:

Get the dates of dataframe1 that are missing in dataframe2 and add them as
records to dataframe 2 but with NA values.

:

> Make some small dataframes of just a few rows that illustrate the problem
> structure. Make a third that has the result you want. You will get an
> answer very quickly. Without a self-contained reproducible problem, results
> vary.
>
> Mark
> R. Mark Sharp, Ph.D.
> msh...@txbiomed.org
>
>
>
>
>
> > On Mar 27, 2017, at 3:09 PM, Paul Bernal  wrote:
> >
> > Dear friends,
> >
> > I have one dataframe which contains 378 observations, and another one,
> > containing 362 observations.
> >
> > Both dataframes have two columns, one date column and another one with the
> > number of transits.
> >
> > I wanted to come up with a code so that I could fill in the dates that are
> > missing in one of the dataframes and replace the column of transits with
> > the value NA.
> >
> > I have tried several things but R obviously complains that the length of
> > the dataframes are different.
> >
> > How can I solve this?
> >
> > Any guidance will be greatly appreciated,
> >
> > Best regards,
> >
> > Paul

Re: [R] Looping Through DataFrames with Differing Lenghts

2017-03-27 Thread Ulrik Stervbo

You could use merge() or %in%.
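
As a rough sketch of the merge() route, assuming both data frames have a Date column and a Transits column as Paul describes (the two small data frames below are made up for illustration):

```r
# Complete set of dates (transits not needed from this one)
all_dates <- data.frame(
  Date = c("1985-10-01", "1985-11-01", "1985-12-01", "1986-01-01"),
  stringsAsFactors = FALSE
)
# Observed transits, with some dates missing
observed <- data.frame(
  Date = c("1985-10-01", "1986-01-01"),
  Transits = c(15, 20),
  stringsAsFactors = FALSE
)

# Left join: keep every date, fill Transits with NA where nothing matches
filled <- merge(all_dates, observed, by = "Date", all.x = TRUE)
filled
```

With `all.x = TRUE` every row of the first data frame is kept, so the differing lengths are not a problem.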

Best,
Ulrik

Mark Sharp  wrote on Mon, 27 Mar 2017, 22:20:

> Make some small dataframes of just a few rows that illustrate the problem
> structure. Make a third that has the result you want. You will get an
> answer very quickly. Without a self-contained reproducible problem, results
> vary.
>
> Mark
> R. Mark Sharp, Ph.D.
> msh...@txbiomed.org
>
>
>
>
>
> > On Mar 27, 2017, at 3:09 PM, Paul Bernal  wrote:
> >
> > Dear friends,
> >
> > I have one dataframe which contains 378 observations, and another one,
> > containing 362 observations.
> >
> > Both dataframes have two columns, one date column and another one with the
> > number of transits.
> >
> > I wanted to come up with a code so that I could fill in the dates that are
> > missing in one of the dataframes and replace the column of transits with
> > the value NA.
> >
> > I have tried several things but R obviously complains that the length of
> > the dataframes are different.
> >
> > How can I solve this?
> >
> > Any guidance will be greatly appreciated,
> >
> > Best regards,
> >
> > Paul

Re: [R] R package

2017-03-23 Thread Ulrik Stervbo

Hi Elahe,

maybe the us.cities data set in the maps package is what you are looking for.
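
Whichever source the city-to-state table comes from, the lookup itself can be done with match(). The sketch below uses a small hand-made lookup table (hypothetical, for illustration only) and the state.abb and state.name vectors that ship with base R; a real lookup table would have to deal with city names that occur in several states.

```r
# Hypothetical lookup table: city and two-letter state code
lookup <- data.frame(
  city = c("Austin", "Boston", "Denver"),
  state_abb = c("TX", "MA", "CO"),
  stringsAsFactors = FALSE
)

# Made-up data with serial numbers and cities
mydf <- data.frame(
  serial = c("A1", "B2", "C3", "D4"),
  city = c("Boston", "Austin", "Boston", "Springfield"),
  stringsAsFactors = FALSE
)

# Look up the state code for each city (NA when the city is unknown)
mydf$state_abb <- lookup$state_abb[match(mydf$city, lookup$city)]

# Translate the code to the full state name using base R's built-in vectors
mydf$state <- state.name[match(mydf$state_abb, state.abb)]
mydf
```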

HTH
Ulrik

On Thu, 23 Mar 2017 at 11:34 Elahe chalabi via R-help 
wrote:

> Hi all,
>
> I have a data frame containing serial numbers for US. I also have a column
> showing the city in US, now my question is is there a package in R able to
> get the city in US as input and then return the name of State for that
> city?!
>
> Thanks for any help!
> Elahe

Re: [R] string pattern matching

2017-03-22 Thread Ulrik Stervbo

Hi Joe,

you could also rethink your pattern:

grep("x1 \\+ x2", test, value = TRUE)

grep("x1 \\+ x", test, value = TRUE)

grep("x1 \\+ x[0-9]", test, value = TRUE)
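
For Joe's actual case, finding "x1 + x2 + x3" when only "x1 + x3" is given, one possible approach is to split the pattern into its terms. The sketch below shows two variants; the variable names are made up here, and the first variant assumes the terms contain no regular expression metacharacters.

```r
test <- c("x1", "x2", "x3", "x1 + x2 + x3")
pattern <- "x1 + x3"

# Split the pattern into its terms: "x1" and "x3"
terms <- strsplit(pattern, " + ", fixed = TRUE)[[1]]

# Variant 1: terms must appear in the given order, with anything in between
rx <- paste(terms, collapse = ".*")
hits_ordered <- grep(rx, test, value = TRUE)
hits_ordered

# Variant 2: all terms must be present, in any order
has_all <- Reduce(`&`, lapply(terms, grepl, x = test, fixed = TRUE))
hits_any_order <- test[has_all]
hits_any_order
```

Both variants match substrings, so "x1" would also match "x10"; word boundaries would be needed to rule that out.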

HTH
Ulrik

On Wed, 22 Mar 2017 at 02:10 Jim Lemon  wrote:

> Hi Joe,
> This may help you:
>
> test <- c("x1", "x2", "x3", "x1 + x2 + x3")
> multigrep<-function(x1,x2) {
>  xbits<-unlist(strsplit(x1," "))
>  nbits<-length(xbits)
>  xans<-rep(FALSE,nbits)
>  for(i in 1:nbits) if(length(grep(xbits[i],x2))) xans[i]<-TRUE
>  return(all(xans))
> }
> multigrep("x1 + x3","x1 + x2 + x3")
> [1] TRUE
> multigrep("x1 + x4","x1 + x2 + x3")
> [1] FALSE
>
> Jim
>
> On Wed, Mar 22, 2017 at 10:50 AM, Joe Ceradini 
> wrote:
> > Hi Folks,
> >
> > Is there a way to find "x1 + x2 + x3" given "x1 + x3" as the pattern?
> > Or is that a ridiculous question, since I'm trying to find something
> > based on a pattern that doesn't exist?
> >
> > test <- c("x1", "x2", "x3", "x1 + x2 + x3")
> > test
> > [1] "x1"   "x2"   "x3"   "x1 + x2 + x3"
> >
> > grep("x1 + x2", test, fixed=TRUE, value = TRUE)
> > [1] "x1 + x2 + x3"
> >
> >
> > But what if only have "x1 + x3" as the pattern and still want to
> > return "x1 + x2 + x3"?
> >
> > grep("x1 + x3", test, fixed=TRUE, value = TRUE)
> > character(0)
> >
> > I'm sure this looks like an odd question. I'm trying to build a
> > function and stuck on this. Rather than dropping the whole function on
> > the list, I thought I'd try one piece I needed help with...although I
> > suspect that this question itself probably does bode well for my
> > function :)
> >
> > Thanks!
> > Joe

Re: [R] find and

2017-03-18 Thread Ulrik Stervbo

Using dplyr:

library(dplyr)

# Counting unique
DF4 %>%
  group_by(city) %>%
  filter(length(unique(var)) == 1)

# Counting not duplicated
DF4 %>%
  group_by(city) %>%
  filter(sum(!duplicated(var)) == 1)
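
The same filter can also be sketched in base R with tapply(), using Ashta's sample data. The names `n_var` and `keep` are introduced here for illustration.

```r
DF4 <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
city  wk var
city1  1  x
city1  2  -
city1  3  x
city2  1  x
city2  2  x
city2  3  x
city2  4  x
city3  1  x
city3  2  x
city3  3  x
city3  4  x
city4  1  x
city4  2  x
city4  3  y
city4  4  y
city5  3  -
city5  4  -
")

# Number of distinct var values per city
n_var <- tapply(DF4$var, DF4$city, function(v) length(unique(v)))

# Keep only the cities with exactly one distinct value
keep <- names(n_var)[n_var == 1]
want <- DF4[DF4$city %in% keep, ]
rownames(want) <- NULL
want
```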

HTH
Ulrik


On Sat, 18 Mar 2017 at 15:17 Rui Barradas  wrote:

> Hello,
>
> I believe this does it.
>
>
> sp <- split(DF4, DF4$city)
> want <- do.call(rbind, lapply(sp, function(x)
> if(length(unique(x$var)) == 1) x else NULL))
> rownames(want) <- NULL
> want
>
>
> Hope this helps,
>
> Rui Barradas
>
> On 18-03-2017 13:51, Ashta wrote:
> > Hi all,
> >
> > I am trying to find a city that do not have the same "var" value.
> > Within city the var should be the same otherwise exclude the city from
> > the final data set.
> > Here is my sample data and my attempt. City1 and city4 should be excluded.
> >
> > DF4 <- read.table(header=TRUE, text=' city  wk var
> > city1  1  x
> > city1  2  -
> > city1  3  x
> > city2  1  x
> > city2  2  x
> > city2  3  x
> > city2  4  x
> > city3  1  x
> > city3  2  x
> > city3  3  x
> > city3  4  x
> > city4  1  x
> > city4  2  x
> > city4  3  y
> > city4  4  y
> > city5  3  -
> > city5  4  -')
> >
> > my attempt
> >   test2  <-   data.table(DF4, key="city,var")
> >   ID1<-   test2[ !duplicated(test2),]
> >  dps <-   ID1$city[duplicated(ID1$city)]
> > Ddup  <-   which(test2$city %in% dps)
> >
> >  if(length(Ddup) !=0)  {
> >test2   <-  test2[- Ddup,]  }
> >
> > want <-  data.frame(test2)
> >
> >
> > I want get the following result but I am not getting it.
> >
> > city wk var
> >city2  1   x
> >city2  2   x
> >city2  3   x
> >city2  4   x
> >city3  1   x
> >city3  2   x
> >   city3  3   x
> >   city3  4   x
> >   city5  3   -
> >   city5  4   -
> >
> > Can some help me out the problem is?
> >
> > Thank you.

Re: [R] lagging over consecutive pairs of rows in dataframe

2017-03-17 Thread Ulrik Stervbo

Hi Evan,

you can easily do this by applying diff() to each exp group.

Either using dplyr:

library(dplyr)
mydata %>%
  group_by(exp) %>%
  summarise(difference = diff(rslt))

Or with base R (the formula restricts diff() to the rslt column):

aggregate(rslt ~ exp, data = mydata, FUN = diff)
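
Because the control and treatment rows strictly alternate in Evan's data, the difference can also be taken directly from the odd and even rows, without grouping or merging. This is only a sketch that relies on that row order; the names `ctl` and `trt` are introduced here.

```r
n <- 2
mydata <- data.frame(exp = rep(1:5, each = n),
                     rslt = c(12, 15, 7, 8, 24, 28, 33, 15, 22, 11))

# Odd rows are the controls, even rows the treatments
ctl <- mydata[seq(1, nrow(mydata), by = 2), ]
trt <- mydata[seq(2, nrow(mydata), by = 2), ]

# Guard against a broken pairing before subtracting
stopifnot(all(ctl$exp == trt$exp))

res <- data.frame(exp = ctl$exp, diff = trt$rslt - ctl$rslt)
res
```

If the rows might not be in control/treatment order, the grouped diff() approach is the safer choice.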

HTH
Ulrik


On Fri, 17 Mar 2017 at 17:34 Evan Cooch  wrote:

> Suppose I have a dataframe that looks like the following:
>
> n=2
> mydata <- data.frame(exp = rep(1:5,each=n), rslt =
> c(12,15,7,8,24,28,33,15,22,11))
> mydata
>    exp rslt
> 1    1   12
> 2    1   15
> 3    2    7
> 4    2    8
> 5    3   24
> 6    3   28
> 7    4   33
> 8    4   15
> 9    5   22
> 10   5   11
>
> The variable 'exp' (for experiment') occurs in pairs over consecutive
> rows -- 1,1, then 2,2, then 3,3, and so on. The first row in a pair is
> the 'control', and the second is a 'treatment'. The rslt column is the
> result.
>
> What I'm trying to do is create a subset of this dataframe that consists
> of the exp number, and the lagged difference between the 'control' and
> 'treatment' result.  So, for exp=1, the difference is (15-12)=3. For
> exp=2,  the difference is (8-7)=1, and so on. What I'm hoping to do is
> take mydata (above), and turn it into
>
>   exp  diff
> 1   1  3
> 2   2  1
> 3   3  4
> 4   4  -18
> 5   5  -11
>
> The basic 'trick' I can't figure out is how to create a lagged variable
> between the second row (record) for a given level of exp, and the first
> row for that exp.  This is easy to do in SAS (which I'm more familiar
> with), but I'm struggling with the equivalent in R. The brute force
> approach  I thought of is to simply split the dataframe into to (one
> even rows, one odd rows), merge by exp, and then calculate a difference.
> But this seems to require renaming the rslt column in the two new
> dataframes so they are different in the merge (say, rslt_cont n the odd
> dataframe, and rslt_trt in the even dataframe), allowing me to calculate
> a difference between the two.
>
> While I suppose this would work, I'm wondering if I'm missing a more
> elegant 'in place' approach that doesn't require me to split the data
> frame and do every via a merge.
>
> Suggestions/pointers to the obvious welcome. I've tried playing with
> lag, and some approaches using lag in the zoo package,  but haven't
> found the magic trick. The problem (meaning, what I can't figure out)
> seems to be conditioning the lag on the level of exp.
>
> Many thanks...
>
>
> mydata <-*data.frame*(x = c(20,35,45,55,70), n = rep(50,5), y =
> c(6,17,26,37,44))

Re: [R] ggplot2: Adjusting title and labels

2017-03-16 Thread Ulrik Stervbo

Hi Georg,

If you remove the coord_polar, you'll see that the optimal y-value for the
labels is between the upper and lower bound of the stacked bar-element.

I am not sure it is the most elegant solution, but you can calculate them
like this:

df <- data.frame(group = c("Male", "Female", "Child"),
                 value = c(25, 25, 50))

# Order the data.frame to match that of the final plot
df <- df[order(df$group, decreasing = TRUE), ]
# Get the upper bound of the stacked bar element
df$upper <- cumsum(df$value)
# And the lower bound: the previous element's upper bound, starting at 0
df$lower <- c(0, head(df$upper, -1))

# Now calculate the position: the midpoint between lower and upper
df$label_pos <- (df$upper - df$lower)/2 + df$lower

# And plot
blank_theme <- theme_minimal() + theme(
  axis.title.x = element_blank(),
  axis.title.y = element_blank(),
  axis.text.x = element_blank(),
  panel.border = element_blank(),
  panel.grid = element_blank(),
  axis.ticks = element_blank(),
  plot.title = element_text(size = 4, face = "bold"))

ggplot(df, aes(x = "", y = value, fill = group)) +
  geom_bar(
    width = 1,
    stat = "identity") +
  # coord_polar("y", start = 0) +
  scale_fill_brewer(
    name = "Gruppe",
    palette = "Blues") +
  blank_theme +
  geom_text(
    aes(
      y = label_pos,
      label = scales::percent(value/100)),
    size = 5) +
  labs(title = "Pie Title")
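
The label positions above are simply the midpoints of the stacked segments, so they can also be written in one step as the cumulative sum minus half of each value. A minimal sketch with the same example data:

```r
df <- data.frame(group = c("Male", "Female", "Child"),
                 value = c(25, 25, 50))

# Same ordering as used for the plot
df <- df[order(df$group, decreasing = TRUE), ]

# Midpoint of each segment: upper bound minus half the segment height
df$label_pos <- cumsum(df$value) - df$value / 2
df
```

This gives the same numbers as computing the upper and lower bounds separately; the two-step version may be easier to follow when checking the plot by eye.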

HTH
Ulrik


On Thu, 16 Mar 2017 at 17:24  wrote:

> Hi All,
>
> I have a question to ggplot 2. My code is the following:
>
> -- cut --
>
> library(ggplot2)
> library(scales)
>
> df <-
>   data.frame(group = c("Male", "Female", "Child"),
>  value = c(25, 25, 50))
>
> blank_theme <- theme_minimal() + theme(
>   axis.title.x = element_blank(),
>   axis.title.y = element_blank(),
>   axis.text.x = element_blank(),
>   panel.border = element_blank(),
>   panel.grid = element_blank(),
>   axis.ticks = element_blank(),
>   plot.title = element_text(size = 4, face = "bold"))
>
> ggplot(df, aes(x = "", y = value, fill = group)) +
>   geom_bar(
> width = 1,
> stat = "identity") +
>   coord_polar("y", start = 0) +
>   scale_fill_brewer(
> name = "Gruppe",
> palette = "Blues") +
>   blank_theme +
>   geom_text(
> aes(
>   y = c(10, 40, 75),
>   label = scales::percent(value/100)),
> size = 5) +
>   labs(title = "Pie Title")
>
> -- cut --
>
> Is there a way to give the position of the labels to the chunks of the pie
> in a generalized form instead of finding the value interatively by
> trial-n-error?
>
> How can I adjust the title of the graph converning font height and postion
> (e. g. center)?
>
> Kind regards
>
> Georg

Re: [R] field values from text file to dataframe

2017-03-13 Thread Ulrik Stervbo

I imagine that the FieldStateOption is irrelevant, so you might be able to
create a data.frame like this:

library(tidyr)

fl <- readLines("pdf_dump.txt")

fl <- grep("FieldStateOption", fl, value = TRUE, invert = TRUE)

# Number the records: every "---" separator line starts a new field
field_number <- vector(mode = "integer", length = length(fl))
tmpid <- 0L
for(i in seq_along(fl)){
  if(fl[i] == "---"){
    tmpid <- tmpid + 1L
  }
  field_number[i] <- tmpid
}

data.frame(field_number, file_line = fl) %>%
  subset(file_line != "---") %>%
  separate(file_line,into = c("field_name", "field_value")) %>%
  spread(key = "field_name", value = "field_value")

The field_number is there to make each row in the final data.frame unique
(without it, `spread` complains)
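
For comparison, here is a base R sketch of the same steps that avoids the loop and the tidyr dependency. The input lines are a shortened, made-up excerpt of the dump (two records only), and splitting at the first colon is an assumption about the file format.

```r
# Shortened stand-in for readLines("pdf_dump.txt")
fl <- c("---", "FieldType: Choice", "FieldName: P1", "FieldValue: P1",
        "FieldStateOption:", "FieldStateOption: P1",
        "---", "FieldType: Choice", "FieldName: P2", "FieldValue:",
        "FieldStateOption: P2")

# Drop the repeated FieldStateOption lines
fl <- grep("FieldStateOption", fl, value = TRUE, invert = TRUE)

# Each "---" starts a new record, so a running count numbers the records
field_number <- cumsum(fl == "---")
keep <- fl != "---"

# Split each remaining line at the first colon into key and value
key <- sub(":.*$", "", fl[keep])
val <- trimws(sub("^[^:]*:", "", fl[keep]))
val[val == ""] <- NA   # empty values become NA

long <- data.frame(field_number = field_number[keep], key, val,
                   stringsAsFactors = FALSE)

# One row per record, one column per field name
wide <- reshape(long, idvar = "field_number", timevar = "key",
                direction = "wide")
names(wide) <- sub("^val\\.", "", names(wide))
wide
```

No field names are hard-coded, so forms with other name/value pairs should go through the same steps, as long as each name occurs at most once per record.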

HTH,
Ulrik





On Mon, 13 Mar 2017 at 09:28 Jim Lemon  wrote:

> Hi Vijayan,
> You have a bit of a problem with repeated field names. While you can
> mangle the field names to do something like this, I don't see how you
> are going to make sense of multiple "FieldStateOption" fields. The
> strategy I would take is to collect all of the field names and then
> set up rows with the unique field names, but the multiple field names
> will make a mess of that.
>
> Jim
>
>
> On Sun, Mar 12, 2017 at 2:13 AM, Vijayan Padmanabhan
>  wrote:
> > Dear r-help group
> > I have a text file which is a data dump of a pdf form as given below..
> > I want it to be converted into a data frame with field name as column names
> > and the field value as the row value for each field.
> > I might have different pdf forms with different field name value pairs to
> > process, so the script should not require reference to specific field names
> > in the extraction of data frame.
> > Where the field value for a given field is empty or where Field Value
> > doesn't appear.. the dataframe can record them as NA against that field
> > name column
> >
> > Will someone know how to get this accomplished using R?
> >
> >
> > Regards
> > VP
> >
> > ---
> > FieldType: Choice
> > FieldName: P1
> > FieldFlags: 4849664
> > FieldValue: P1
> > FieldValueDefault: P1
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: P1
> > ---
> > FieldType: Choice
> > FieldName: P2
> > FieldFlags: 4849664
> > FieldValue:
> > FieldValueDefault: P2
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: P2
> > ---
> > FieldType: Choice
> > FieldName: P3
> > FieldFlags: 4849664
> > FieldValue:
> > FieldValueDefault: P3
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: P3
> > ---
> > FieldType: Choice
> > FieldName: P4
> > FieldFlags: 4849664
> > FieldValue: P2
> > FieldValueDefault: P2
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: P2
> > ---
> > FieldType: Choice
> > FieldName: P5
> > FieldFlags: 4849664
> > FieldValue:
> > FieldValueDefault:
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: P5
> > ---
> > FieldType: Choice
> > FieldName: P6
> > FieldFlags: 4849664
> > FieldValue:
> > FieldValueDefault:
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: P6
> > ---
> > FieldType: Choice
> > FieldName: P7
> > FieldFlags: 4849664
> > FieldValue:
> > FieldValueDefault:
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: P7
> > ---
> > FieldType: Choice
> > FieldName: P8
> > FieldFlags: 4849664
> > FieldValue:
> > FieldValueDefault:
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: P8
> > ---
> > FieldType: Choice
> > FieldName: P1IDS
> > FieldFlags: 4849664
> > FieldValue: 2
> > FieldValueDefault:
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: 1
> > FieldStateOption: 2
> > FieldStateOption: 3
> > FieldStateOption: 4
> > FieldStateOption: 5
> > ---
> > FieldType: Choice
> > FieldName: P1PDS
> > FieldFlags: 4849664
> > FieldValue:
> > FieldValueDefault:
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: 1
> > FieldStateOption: 2
> > FieldStateOption: 3
> > FieldStateOption: 4
> > FieldStateOption: 5
> > ---
> > FieldType: Choice
> > FieldName: P1IIU
> > FieldFlags: 4849664
> > FieldValue:
> > FieldValueDefault:
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: 1
> > FieldStateOption: 2
> > FieldStateOption: 3
> > FieldStateOption: 4
> > FieldStateOption: 5
> > ---
> > FieldType: Choice
> > FieldName: P1PIU
> > FieldFlags: 4849664
> > FieldValue:
> > FieldValueDefault:
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: 1
> > FieldStateOption: 2
> > FieldStateOption: 3
> > FieldStateOption: 4
> > FieldStateOption: 5
> > ---
> > FieldType: Choice
> > FieldName: P1IPU
> > FieldFlags: 4849664
> > FieldValue: 3
> > FieldValueDefault:
> > FieldJustification: Left
> > FieldStateOption:
> > FieldStateOption: 1
> > FieldStateOption: 2
> > FieldStateOption: 3
> > FieldStateOption: 4
> > FieldStateOption: 5
> > ---
> > FieldType: Choice
> > FieldName: P1PPU
> >

Re: [R] Extract student ID that match certain criteria

2017-03-13 Thread Ulrik Stervbo

Hi Roslinazairimah,

As Bert suggested, you should get acquainted with regular expressions. They
can be confusing at times, but they pay off in the long run.

In your case, the pattern of "^[A-Z]{2}14.*" might work.
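
A small sketch of how that pattern could be used, with the example IDs from the question typed in as quoted strings (the vector name `ids` is only for illustration):

```r
ids <- c("AA14004", "AB15035", "CB14024", "PA14009", "PA14009")

# Two capital letters at the start, followed by "14"
matched <- grep("^[A-Z]{2}14", ids, value = TRUE)
matched

# The same selection without a regular expression:
# characters 3 and 4 must be "14"
ids[substr(ids, 3, 4) == "14"]
```

With `value = TRUE`, grep() returns the matching IDs themselves; without it you get their positions, which is what you need for indexing the rows of a data frame.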

Best,
Ulrik

On Mon, 13 Mar 2017 at 06:20 roslinazairimah zakaria 
wrote:

> Another question,
>
> How do I extract ID based on the third and fourth letter:
>
> I have for example, AA14004, AB15035, CB14024, PA14009, PA14009 etc
>
> I would like to extract ID no. of AB14..., CB14..., PA14...
>
> On Mon, Mar 13, 2017 at 12:37 PM, roslinazairimah zakaria <
> roslina...@gmail.com> wrote:
>
> > Hi Bert,
> >
> > Thank you so much for your help.  However, I am not really sure what the
> > y values are used for.  Can we do without them?
> >
> > x <- as.character(FKASA$STUDENT_ID)
> > y <- c(1,786)
> > My.Data <- data.frame (x,y)
> >
> > My.Data[grep("^AA14", My.Data$x), ]
> >
> > I got the following data:
> >
> >   x   y
> > 1   AA14068   1
> > 7   AA14090   1
> > 11  AA14099   1
> > 14  AA14012 786
> > 15  AA14039   1
> > 22  AA14251 786
> >
> > On Mon, Mar 13, 2017 at 11:51 AM, Bert Gunter 
> > wrote:
> >
> >> 1. Your code is incorrect. All entries are character strings and must be
> >> quoted.
> >>
> >> 2. See ?grep  and note in particular (in the "Value" section):
> >>
> >> "grep(value = TRUE) returns a character vector containing the selected
> >> elements of x (after coercion, preserving names but no other
> >> attributes)."
> >>
> >>
> >> 3. While the fixed = TRUE option will work here, you may wish to learn
> >> about "regular expressions", which can come in very handy for
> >> character string manipulation tasks. ?regex in R has a terse, but I
> >> have found comprehensible, discussion. There are many good gentler
> >> tutorials on the web, also.
> >>
> >>
> >> Cheers,
> >> Bert
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> >> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >>
> >> On Sun, Mar 12, 2017 at 8:32 PM, roslinazairimah zakaria
> >>  wrote:
> >> > Dear r-users,
> >> >
> >> > I have this list of student ID,
> >> >
> >> > dt <- c(AA14068, AA13194, AE11054, AA12251, AA13228, AA13286, AA14090,
> >> > AA13256, AA13260, AA13291, AA14099, AA15071, AA13143, AA14012, AA14039,
> >> > AA15018, AA13234, AA13149, AA13282, AA13218)
> >> >
> >> > and I would like to extract all student of ID AA14... only.
> >> >
> >> > I searched and tried substr, subset and select, but they all failed.
> >> >
> >> >  substr(FKASA$STUDENT_ID, 2, nchar(string1))
> >> > Error in nchar(string1) : 'nchar()' requires a character vector
> >> >> subset(FKASA, STUDENT_ID=="AA14" )
> >> >  [1] FAC_CODE   FACULTY    STUDENT_ID NAME       PROGRAM    KURSUS
> >> >      CGPA       ACT_SS     ACT_VAL    ACT_CS     ACT_LED    ACT_PS
> >> >      ACT_IM
> >> > [14] ACT_ENT ACT_CRE ACT_UNI ACT_VOL...
> >> >
> >> > Thank you so much for your help.
> >> >
> >> > How do I do it?
> >> > --
> >> > *Roslinazairimah Zakaria*
> >> > *Tel: +609-5492370; Fax. No. +609-5492766*
> >> >
> >> > *Email: roslinazairi...@ump.edu.my ;
> >> > roslina...@gmail.com *
> >> > Faculty of Industrial Sciences & Technology
> >> > University Malaysia Pahang
> >> > Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
> >> >
> >>
> >
>
>
>

Re: [R] plotting longitudinal data with ggplot

2017-03-11 Thread Ulrik Stervbo

You need to set the aesthetic 'group' to something meaningful, probably ID
in this case.
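
A minimal sketch of what I mean, using the five rows shown in the question. The long data frame is built by hand here instead of with gather(), so the object names are only illustrative:

```r
library(ggplot2)

# The part of the dataset shown in the question
ray.ht <- data.frame(
  ID  = 1:5,
  X0  = c(50, 54, 52, 50, 54),
  X4  = c(59, 58, 64, 56, 59),
  X8  = c(65, 69, 68, 67, 68),
  X12 = c(67, 71, 70, 68, 72),
  X36 = c(87, 90, 91, 88, 93),
  X48 = c(95, 96, 100, 95, 100),
  X84 = c(115, 115, 120, 115, 120)
)

# Long format, with age as a factor as gather(..., factor_key = TRUE) gives
age_cols <- names(ray.ht)[-1]
ht.long <- data.frame(
  ID     = factor(rep(ray.ht$ID, times = length(age_cols))),
  age    = factor(rep(age_cols, each = nrow(ray.ht)), levels = age_cols),
  height = unlist(ray.ht[, -1], use.names = FALSE)
)

# With a factor on the x axis every age/ID combination is its own group,
# hence the message; group = ID joins the points of each child instead
p <- ggplot(ht.long, aes(age, height, group = ID)) +
  geom_line()
p
```

The same `group = ID` should also work in the facet_wrap(~ID) version of the plot.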

HTH
Ulrik

On Fri, 10 Mar 2017, 19:30 Rayt Chiruka,  wrote:

> I am trying to convert a dataset from wide to long format using the package
> tidyr (this seems to have worked).
>
> When I try to plot the long dataset using ggplot, I keep getting errors.
>
> Here is the code:
>
> library(tidyr)
> ht.long <- gather(ray.ht, age, height, X0:X84, factor_key = TRUE)
> ht.long$ID <- factor(ht.long$ID)
> ggplot(ht.long, aes(age, height, shape = ID)) + geom_line()
> ggplot(ht.long, aes(age, height)) +
>   facet_wrap(~ID) + geom_line()
>
> The error I keep getting is the following:
>
> geom_path: Each group consists of only one observation. Do you need to
> adjust the group aesthetic?
>
> a part of the dataset is shown below
>
> ID X0 X4 X8 X12 X36 X48 X84
> 1 50 59 65 67 87 95 115
> 2 54 58 69 71 90 96 115
> 3 52 64 68 70 91 100 120
> 4 50 56 67 68 88 95 115
> 5 54 59 68 72 93 100 120
>
> --
> R T CHIRUKA
> University of Fort Hare
> Statistics Department
> Box X 1314
> Alice
> 5700
> South Africa
>

Re: [R] reading form data from pdf forms

2017-03-11 Thread Ulrik Stervbo

I don't know if there's a pure R option, but I believe pdftk can extract
the form data, which you can then manipulate in R.
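
A rough sketch of how this might be driven from R. It assumes pdftk is installed and on the PATH, that its `dump_data_fields` operation is the one you want, and the file names are hypothetical:

```r
pdf_file  <- "form.pdf"       # hypothetical input form
dump_file <- "pdf_dump.txt"   # hypothetical output file

# Equivalent to: pdftk form.pdf dump_data_fields output pdf_dump.txt
pdftk_args <- c(shQuote(pdf_file), "dump_data_fields",
                "output", shQuote(dump_file))

# Only run if pdftk and the input file are actually there
if (nzchar(Sys.which("pdftk")) && file.exists(pdf_file)) {
  system2("pdftk", pdftk_args)
  fl <- readLines(dump_file)
  head(fl)
}
```

The resulting text file is plain "name: value" lines, so readLines() and the usual string functions are enough to work with it.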

Best
Ulrik

On Sat, 11 Mar 2017, 05:14 Vijayan Padmanabhan, <
padmanabhan.vija...@gmail.com> wrote:

> Dear R-Help group
> Is there any way that I can programmatically extract form field values from
> a pdf form (either saved as pdf or fdf) in R?
> I would wish to not be dependent on any Paid tool for this purpose.
> Any guidance would be much appreciated.
>
> Regards
> VP
>

Re: [R] heat maps with qplot

2017-03-10 Thread Ulrik Stervbo

Hi Greg,

?theme

You can use axis.text and axis.title if y and x are to be identical, or
axis.text.x, axis.text.y, axis.title.x, axis.title.y if you need different
font sizes.
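
A small sketch of how this could look. The data are a stand-in (correlations of a few mtcars columns, since I don't have yours), and the sizes are arbitrary:

```r
library(ggplot2)

# Stand-in for melt(cor(a, use = "p")): columns Var1, Var2 and Freq
cors <- as.data.frame(as.table(cor(mtcars[, 1:5])))

p <- ggplot(cors, aes(x = Var1, y = Var2, fill = Freq)) +
  geom_tile() +
  scale_fill_gradient2(limits = c(-1, 1)) +
  ylab("Super pathways") +
  xlab("Significant Metabolites in Super pathways for DI") +
  theme(
    axis.text.x = element_text(angle = 90, hjust = 1, size = 6),  # x tick labels
    axis.text.y = element_text(size = 6),                          # y tick labels
    axis.title  = element_text(size = 10)                          # both axis titles
  )
p
```

The same theme() call can be added to a plot made with qplot().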

HTH
Ulrik

On Fri, 10 Mar 2017 at 15:47 greg holly  wrote:

>  Hi all;
>
> Below is my R code for a heat map in ggplot2. I need to specify the font
> size for the y-axis (x-axis works) as well as the font size for the y and x
> labels. Your help is highly appreciated.
>
> Thanks,
>
> Greg
>
> qplot(x=Var1, y=Var2, data=melt(cor(a, use="p")), fill=value, geom="tile") +
>  scale_fill_gradient2(limits=c(-1, 1))+
>  ylab('Super pathways') +
>  xlab('Significant Metabolites in Super pathways for DI') +
>  theme(axis.text.x = element_text(angle = 90, hjust = 1, size=6))
>

Re: [R] restructuring data frame

2017-03-10 Thread Ulrik Stervbo

Hi Petr,

maybe

library("splitstackshape")
cSplit(evid, "V4", "#", direction = "long")

or

library("tidyr")
separate_rows(evid, V4, sep = "#")

is helpful.
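
A sketch of the tidyr variant on the two example rows, splitting on ";#" rather than "#" and then dropping the pieces that are only numbers. The last step is my assumption about what you want to keep:

```r
library(tidyr)

evid <- data.frame(
  V2 = c("test vodivosti kalcinátu",
         "impregnace anatasové pasty rozprašovací sušárna"),
  V3 = c("03.03.2017", "17.03.2017"),
  V4 = c("EICHLER Věra;#125",
         "HOŠŤÁLKOVÁ Jarmila;#119;#BERNÁT Miroslav;#122;#OSTRČIL Marek;#60"),
  stringsAsFactors = FALSE
)

# One row per piece of V4; the other columns are repeated as needed
long <- separate_rows(evid, V4, sep = ";#")

# Keep the names, drop the purely numeric pieces
mena <- long[!grepl("^[0-9]+$", long$V4), ]
mena
```

This gives one row for the first record and three for the second, without the loop over rows.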

Best,
Ulrik

On Fri, 10 Mar 2017 at 08:32 PIKAL Petr  wrote:

Dear all

I have some data with following structure in data frame.

dput(evid[1:2,c(2:4)])

evid <- structure(list(V2 = c("test vodivosti kalcinátu", "impregnace
anatasové pasty rozprašovací sušárna"
), V3 = c("03.03.2017", "17.03.2017"), V4 = c("EICHLER Věra;#125",
"HOŠŤÁLKOVÁ Jarmila;#119;#BERNÁT Miroslav;#122;#OSTRČIL Marek;#60"
)), .Names = c("V2", "V3", "V4"), row.names = 9:10, class = "data.frame")

Each row in V4 column contain names followed by ;#xxx. I would like to
separate them like that

mena <- liche(unlist(strsplit(evid[2,4], ";#")))

here is function liche

liche <- function (x)
{
indices <- seq(along = x)
x[indices%%2 == 1]
}

and repeat each respective row of data frame for separated names like
following (which is only for one row)

temp<-evid[2,][rep(seq_len(nrow(evid[2,])), length(mena)),-4]
cbind(temp, mena)
   V1  V2 V3
10   NEPRAVDA impregnace anatasové pasty rozprašovací sušárna 17.03.2017
10.1 NEPRAVDA impregnace anatasové pasty rozprašovací sušárna 17.03.2017
10.2 NEPRAVDA impregnace anatasové pasty rozprašovací sušárna 17.03.2017
V5V6 V7   mena
10   OSTRČIL Marek OSTRČIL Marek 1kalcinace HOŠŤÁLKOVÁ Jarmila
10.1 OSTRČIL Marek OSTRČIL Marek 1kalcinaceBERNÁT Miroslav
10.2 OSTRČIL Marek OSTRČIL Marek 1kalcinace  OSTRČIL Marek

I probably could do it for each row in cycle (the data frame is not big)
but I wonder if somebody knows any more elegant/easy/effective solution for
such task.

Best regards

Petr Pikal

"Kdo vždy myslí, že se učí,
bude vlasti chlouba.
Kdo si myslí, že dost umí,
začíná být trouba."
Karel Havlíček Borovský




Re: [R] Unable to Load package Rcmdr after installation

2017-03-09 Thread Ulrik Stervbo

Hi Paul,

what fails and how? What did you do from the time the package worked until
it didn't? Did you update packages? Which packages are you trying to load?

Best
Ulrik

On Thu, 9 Mar 2017 at 18:40 paulberna...@gmail.com <paulberna...@gmail.com>
wrote:

> Thanks Ulrik, but the thing is that I tried installing and loading the
> Hmisc package but wasn't able to do that either.
>
>
>  Original message 
> Subject: Re: [R] Unable to Load package Rcmdr after installation
> From: Ulrik Stervbo
> To: Paul Bernal ,r-help@r-project.org
> CC:
>
>
> Hi Paul,
>
> The error tells you, that the 'Hmisc' does not exist on your system. If
> you install it, everything should work.
>
> Use install.packages with dependencies = TRUE to avoid the problem of
> missing packages.
>
> HTH
>
> Ulrik
>
> On Thu, 9 Mar 2017 at 16:51 Paul Bernal <paulberna...@gmail.com> wrote:
>
> Hello friends,
>
> Has anyone experienced trouble when trying to load package Rcmdr? It was
> working perfectly a couple of days ago, I don´t know why it isn´t working.
>
> > library("Rcmdr")
> Loading required package: splines
> Loading required package: RcmdrMisc
> Loading required package: car
> Loading required package: sandwich
> Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck
> = vI[[j]]) :
>   there is no package called ‘Hmisc’
> Error: package ‘RcmdrMisc’ could not be loaded
>

Re: [R] concatenating range of columns in dataframe

2017-03-09 Thread Ulrik Stervbo

Hi Evan,

the unite function of the tidyr package achieves the same as Jim suggested,
but in perhaps a slightly more readable manner.
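
A short sketch with the example data typed in by hand (the values are my reading of the table in the original question):

```r
library(tidyr)

df <- data.frame(
  Trt = c("A1A", "A1B", "A1C", "A1D"),
  y1  = c(1, 1, 0, 1),
  y2  = c(0, 1, 1, 1),
  y3  = c(0, 0, 0, 1),
  y4  = c(1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Paste y1 to y4 together into Conc; the original columns are dropped
# because remove = TRUE is the default
df2 <- unite(df, "Conc", y1:y4, sep = "")
df2
```

The columns are selected as a range (y1:y4), so there is no need to type out a character vector of names.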

Ulrik

On Fri, 10 Mar 2017 at 07:50 Jim Lemon  wrote:

> Hi Evan,
> How about this:
>
> df2<-data.frame(Trt=df[,1],Conc=apply(df[,2:5],1,paste,sep="",collapse=""))
>
> Jim
>
> On Fri, Mar 10, 2017 at 3:16 PM, Evan Cooch  wrote:
> > Suppose I have the following data frame (call it df):
> >
> > Trt   y1  y2  y3  y4
> > A1A   1   0   0   1
> > A1B   1   1   0   0
> > A1C   0   1   0   1
> > A1D   1   1   1   1
> >
> > What I want to do is concatenate columns y1  -> y4 into a contiguous string
> > (which I'll call df$conc), so that the final df looks like
> >
> > Trt   Conc
> > A1A   1001
> > A1B   1100
> > A1C   0101
> > A1D   1111
> >
> >
> > Now, if my initial dataframe was simply
> >
> >  1   0  0  1
> >  1   1  0  0
> >   0  1  0  1
> >   1  1  1  1
> >
> > then apply(df,1,paste,collapse="") does the trick, more or less.
> >
> > But once I have a Trt column, this approach yields
> >
> > A1A1001
> > A1B1100
> > A1C0101
> > A1D1111
> >
> > I need to maintain the space between Trt, and the other columns. So, I'm
> > trying to concatenate a subset of columns in the data frame, but I don't
> > want to have to do something like create a character vector of the column
> > names to do it (e.g., c("y1","y2","y3","y4")). Doing a few by hand that way
> > is easy, but not if you have dozens to hundreds of columns to work with.
> >
> > Ideally, I'd like to be able to say
> >
> > "concatenate df[,2:4], get rid of the spaces, pipe the concatenated columns
> > to a new named column, and drop the original columns from the final df."
> >
> > Heuristically,
> >
> > df$conc <- concatenate df[,2:4] # making a new, 5th column in df
> > df[,2:4] <- NULL   # to drop original columns 2 -> 4
> >
> > Suggestions/pointers to the obvious appreciated.
> >
> > Thanks in advance!
> >

Re: [R] Unable to Load package Rcmdr after installation

2017-03-09 Thread Ulrik Stervbo

Hi Paul,

The error tells you that the 'Hmisc' package is not installed on your
system. If you install it, everything should work.

Use install.packages with dependencies = TRUE to avoid the problem of
missing packages.
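
Something along these lines could be used to check what is missing before installing. The helper `missing_pkgs()` is made up for this example, and the install step is only run in an interactive session:

```r
# Return the packages from 'pkgs' that cannot be found on this system
missing_pkgs <- function(pkgs) {
  found <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!found]
}

to_install <- missing_pkgs(c("Hmisc", "RcmdrMisc", "Rcmdr"))
to_install

# dependencies = TRUE also pulls in the packages these depend on
if (interactive() && length(to_install) > 0) {
  install.packages(to_install, dependencies = TRUE)
}
```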

HTH

Ulrik

On Thu, 9 Mar 2017 at 16:51 Paul Bernal  wrote:

Hello friends,

Has anyone experienced trouble when trying to load package Rcmdr? It was
working perfectly a couple of days ago, I don´t know why it isn´t working.

> library("Rcmdr")
Loading required package: splines
Loading required package: RcmdrMisc
Loading required package: car
Loading required package: sandwich
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck
= vI[[j]]) :
  there is no package called ‘Hmisc’
Error: package ‘RcmdrMisc’ could not be loaded


Re: [R] Problems outputting ggplot2 graphics to pdf

2017-03-02 Thread Ulrik Stervbo

Hi Hugh,

I believe the recommended way of saving ggplots is through ggsave. It
defaults to take the latest plot displayed, but you can specify which plot
to save by passing the variable to the plot argument.

If you need to save multiple plots in one file, you have to create a
multipage plot using gridExtra::marrangeGrob.
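
A sketch of both cases, based on the example from the original post. The output goes to tempdir() only to keep the example harmless, and the width and height are arbitrary:

```r
library(ggplot2)

df <- data.frame(
  gp = factor(rep(letters[1:3], each = 10)),
  y  = rnorm(30)
)
p <- ggplot(df, aes(gp, y)) + geom_point()

# One plot, one file; passing 'plot' means no print() is needed,
# so this also works inside source() or a function
single_pdf <- file.path(tempdir(), "test.pdf")
ggsave(single_pdf, plot = p, width = 7, height = 7)

# Several plots in one file: one plot per page
plots <- list(p, p + ggtitle("Second page"))
pages <- gridExtra::marrangeGrob(grobs = plots, nrow = 1, ncol = 1)
multi_pdf <- file.path(tempdir(), "multi.pdf")
ggsave(multi_pdf, plot = pages, width = 7, height = 7)
```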

Best wishes,
Ulrik

On Thu, 2 Mar 2017 at 14:53 Hugh Morgan  wrote:

> Thanks for the help.  My test script was changed with:
>
> p <- ggplot(df, aes(gp, y)) +
>geom_point() +
>geom_point(data = ds, aes(y = mean), colour = 'red', size = 3)
> print(p)
>
> And this now works.
>
> Cheers,
>
> Hugh
>
>
> On 02/03/17 13:42, Richard M. Heiberger wrote:
> > You need the print() statement.  See FAQ  7.22 in file
> > system.file("../../doc/FAQ")
> >
> >
> > 7.22 Why do lattice/trellis graphics not work?
> > ==
> >
> > The most likely reason is that you forgot to tell R to display the
> > graph.  Lattice functions such as 'xyplot()' create a graph object, but
> > do not display it (the same is true of *ggplot2*
> > (https://CRAN.R-project.org/package=ggplot2) graphics, and Trellis
> > graphics in S-PLUS).  The 'print()' method for the graph object produces
> > the actual display.  When you use these functions interactively at the
> > command line, the result is automatically printed, but in 'source()' or
> > inside your own functions you will need an explicit 'print()' statement.
> >
> > On Thu, Mar 2, 2017 at 8:37 AM, Hugh Morgan wrote:
> >> Hi All,
> >>
> >> I am having trouble outputting ggplot2 graphics to pdf as part of a
> >> script.  It works if when I pipe the script into R or if I type the
> >> commands directly into the terminal, but not if I load it using the
> >> source(..) command.  In this case the outputted pdf is always size 3611,
> >> and it fails to open with the error "This document contains no pages".
> >>
> >> As an example I wrap the create pdf commands around the 1st example in
> >> ?ggplot:
> >>
> >> $ cat test.R
> >>
> >> library(ggplot2)
> >>
> >> pdf("test.pdf")
> >>
> >> df <- data.frame(
> >>gp = factor(rep(letters[1:3], each = 10)),
> >>y = rnorm(30)
> >> )
> >> ds <- plyr::ddply(df, "gp", plyr::summarise, mean = mean(y), sd = sd(y))
> >> ggplot(df, aes(gp, y)) +
> >>geom_point() +
> >>geom_point(data = ds, aes(y = mean), colour = 'red', size = 3)
> >>
> >> dev.off()
> >>
> >> Piping it into R works:
> >>
> >> $ R --no-save < test.R
> >>
> >> ...
> >>
> >> $ ll test.pdf
> >>
> >> -rw-rw-r-- 1 user group 4842 Mar  2 13:18 test.pdf
> >>
> >> This file opens fine and has a graphic.  If I repeat the process using
> >> source():
> >>
> >> $ R --no-save
> >>
> >> R version 3.3.2 (2016-10-31) -- "Sincere Pumpkin Patch"
> >> ...
> >>
> >>> source("test.R")
> >>>
> >> $ ll test.pdf
> >> -rw-rw-r-- 1 user group 3611 Mar  2 13:25 test.pdf
> >>
> >> This file fails to open, and always has the size 3611.
> >>
> >> Any help appreciated,
> >>
> >> Hugh
> >>
> >>
> >>

Re: [R] Averaging without NAs

2017-03-02 Thread Ulrik Stervbo

Hi Elahe,

?mean

in particular the na.rm argument.
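
For example, with the values shown in the question (the vector name `x` is just for illustration; in your case it would be the X2016.Q1 column of your data frame):

```r
x <- c(47, 53, 75, 97, NA, NA, 23, NA, 43, NA)

mean(x)                # NA, because of the missing values
m <- mean(x, na.rm = TRUE)  # average over the six non-missing values
m

# The same thing done by hand
sum(x, na.rm = TRUE) / sum(!is.na(x))
```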

HTH
Ulrik

On Thu, 2 Mar 2017 at 11:55 ch.elahe via R-help 
wrote:

> Hi all,
>
> The question seems easy but I could not find an answer for it. I have the
> following column in my data frame and I want to take average of the column
> excluding the number of NAs.
>
> $X2016.Q1 : int 47 53 75 97 NA NA 23 NA 43 NA 
>
> Does anyone know how to do that?
> Thanks for any help
> Elahe
>

Re: [R] where is .emacs file?

2017-03-01 Thread Ulrik Stervbo

Files starting with dots are hidden files on Mac. The emacs configuration
file .emacs should be in your home directory. You can list all files - also
the hidden ones - by `ls -a` in your console. I don't use Mac so I can't
tell you how to show hidden files in Finder.
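
Since this is an R list: you can also look for the file from within R. A small sketch, where the three candidate locations are the usual places Emacs looks for its init file, as far as I know:

```r
# Usual locations of the Emacs init file
candidates <- path.expand(c("~/.emacs", "~/.emacs.el", "~/.emacs.d/init.el"))
found <- candidates[file.exists(candidates)]
found

# All hidden entries in the home directory whose name starts with ".emacs"
list.files("~", all.files = TRUE, pattern = "^\\.emacs")
```

If nothing is found, the file probably does not exist yet, and you can simply create it yourself.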

If you still can't find the file, you might have better chances by asking
in some Emacs forum.

Alternatively you can use RStudio instead of Emacs/ESS.

HTH
Ulrik

On Wed, 1 Mar 2017 at 13:12 Naresh Gurbuxani 
wrote:


I am trying to install ESS so that it can be used when EMACS is launched
from Mac Terminal.  After running "make" from the directory where ESS files
are saved, the instructions ask the following to be added to .emacs file:

(require 'ess-site)

But I cannot find .emacs file.

I have already installed EMACS modified for ESS from Vincent Goulet's
website.  In my EMACS application, ESS is available (verified by typing M-X
ess-version).

Thanks,
Naresh

Re: [R] Selecting rows and columns of a data frame using relational operators

2017-02-27 Thread Ulrik Stervbo

Hi Tunga,

The function subset() is probably what you are looking for. You might also
want to look at a tutorial to understand the R syntax.

In addition, calling your data data is not a good idea because of the name
clash with the function data().
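
A sketch of what I mean, with made-up data called `dat` (four columns, 1000 rows; the column names are arbitrary):

```r
set.seed(1)
dat <- data.frame(
  a = 1:1000,
  b = rnorm(1000),
  c = runif(1000),
  d = rep(c(0, 1), 500)
)

# Columns 1, 2 and 4 for the rows where column 4 equals 1:
# square brackets, and commas inside c()
res_all <- dat[dat[, 4] == 1, c(1, 2, 4)]

# The same with subset()
res_sub <- subset(dat, d == 1, select = c(1, 2, 4))

# Only the first 20 rows: restrict the rows first, then filter.
# dat[dat[1:20, 4] == 1, ] would recycle the 20 TRUE/FALSE values
# over all 1000 rows instead.
first20 <- dat[1:20, ]
res20 <- first20[first20[, 4] == 1, c(1, 2, 4)]
```

The recycling of the short logical vector is the reason the second attempt in the question does not give what MATLAB would give.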

Hope this helps,
Ulrik

On Mon, 27 Feb 2017 at 13:10 Tunga Kantarcı  wrote:

> Consider a data frame named data. data contains 4 columns and 1000
> rows. Say the aim is to bring together columns 1, 2, and 4, if the
> values in column 4 is equal to 1. We could use the syntax
>
> data(data[,4] == 1, c(1 2 4))
>
> for this purpose. Suppose now that the aim is to bring together
> columns 1, 2, and 4, if the values in column 4 is equal to 1, for the
> first 20 rows of column 4. We could use the syntax
>
> data(data[1:20,4] == 1, c(1 2 4))
>
> for this purpose. However, this does not produce the desired result.
> This is surprising at least for someone coming from MATLAB because
> MATLAB produces what is desired.
>
> Question 1: The code makes sense but why does it not produce what we
> expect it to produce?
>
> Question 2: What code is instead suitable?

Re: [R] Create gif from series of png files

2017-02-14 Thread Ulrik Stervbo

Hi Shane,

Wrong forum. This might be what you are looking for:

ffmpeg -i %03d.png output.gif

Or use the gganimate package.
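
If you would rather stay inside R, here is a rough sketch using the magick
package. That package is my assumption and is not mentioned in this thread, and
the demo frames and file names are made up so that the example runs on its own.

```r
library(magick)

# Build three small demo frames in a temporary directory; with real data,
# png_files would instead point at the existing images.
frame_dir <- file.path(tempdir(), "frames")
dir.create(frame_dir, showWarnings = FALSE)
for (i in 1:3) {
  png(file.path(frame_dir, sprintf("%03d.png", i)), width = 200, height = 200)
  plot(seq_len(i), main = paste("Frame", i))
  dev.off()
}

# Sorting keeps 001.png, 002.png, ... in the intended order
png_files <- sort(list.files(frame_dir, pattern = "\\.png$", full.names = TRUE))

frames <- image_read(png_files)         # one magick object holding all frames
anim <- image_animate(frames, fps = 2)  # fps has to divide 100 evenly
gif_file <- file.path(tempdir(), "output.gif")
image_write(anim, path = gif_file)
```

Zero-padded file names (001.png, 002.png, ...) matter here as well, since the
frame order simply follows the sorted file names.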

Best
Ulrik

Shane Carey wrote on Tue, 14 Feb 2017, 12:08:

> Hi,
>
> I have many png files that I would like to stitch together, in order to
> make a gif file.
>
> Any ideas how I would do this?
>
> Thanks
>
> --
> With every good wish,
> Shane

Re: [R] RStudio: Place for Storing Options

2017-02-09 Thread Ulrik Stervbo

Hi Georg,

Maybe someone here knows, but I think you are more likely to get answers to
RStudio-related questions from RStudio support:
https://support.rstudio.com/hc/en-us

Best,
Ulrik

On Thu, 9 Feb 2017 at 12:35  wrote:

> Hi All,
>
> I would like to make a backup of my RStudio IDE options I configure using
> "Tools/Global Options" from the menu bar. Searching the web did not reveal
> anything.
>
> Can you tell me where RStudio IDE does store its configuration?
>
> Kind regards
>
> Georg

Re: [R] can this be simplified with purrr:map or mapply?

2017-01-18 Thread Ulrik Stervbo

If you want to use purrr, you could do

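library(purrr)
fil <- paste0("data", 2004:2014, ".txt")
# Name the vector so that .id stores the file name rather than the position
map_df(set_names(fil), read.table, .id = "fil")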

to get everything in one data frame (I assume all files have the same
structure)
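
For the original question (merging each exp/fam pair by year), a base R sketch
could look like the one below. The small data frames are made-up stand-ins for
exp2004s, fam2004s and so on, and only three years are used to keep it short.
Note also that files written with dput() are read back with dget(), not with
read.table().

```r
# Hypothetical stand-ins for the poster's exp2004s, fam2004s, ... objects
years <- 2004:2006
for (y in years) {
  assign(paste0("exp", y, "s"), data.frame(HHNUM = 1:3, spend = y + 1:3))
  assign(paste0("fam", y, "s"), data.frame(HHNUM = 1:3, size = c(2, 4, 1)))
}

# mget() turns a character vector of object names into a named list of objects
framesEXP <- mget(paste0("exp", years, "s"))
framesFAM <- mget(paste0("fam", years, "s"))

# Map() walks both lists in parallel and merges each pair on HHNUM
merged <- Map(function(e, f) merge(e, f, by = "HHNUM"), framesEXP, framesFAM)
names(merged) <- paste0("data", years)

# Write each merged data frame to its own file, as in the original post
out_dir <- tempdir()
for (nm in names(merged)) {
  dput(merged[[nm]], file = file.path(out_dir, paste0(nm, ".txt")))
}

# dget() restores an object written by dput()
data2004 <- dget(file.path(out_dir, "data2004.txt"))
```

Keeping the merged data frames in one named list means later steps can again
use lapply() or Map() instead of repeating a line per year.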

HTH
Ulrik

On Wed, 18 Jan 2017 at 16:10 PIKAL Petr  wrote:

> Hi
>
> Let me make some assumptions. Your data are stored as *txt files somewhere.
>
> I would start with vector of file names
>
> fil<-paste0("data",2004:2014,".txt")
> > fil
>  [1] "data2004.txt" "data2005.txt" "data2006.txt" "data2007.txt"
>  [5] "data2008.txt" "data2009.txt" "data2010.txt" "data2011.txt"
>  [9] "data2012.txt" "data2013.txt" "data2014.txt"
>
> then I would read my files into a predefined list
>
> mylist <- vector(length = 11, mode = "list")
>
> You can add names to this list with
> names(mylist) <- fil
>
> Then a simple loop
>
> for (i in seq_along(fil)) {
>   mylist[[i]] <- read.table(fil[i])
>   # ... you could also do some polishing of single files here
> }
>
> Now you have a list of data frames and you can do many operations on the
> whole list instead of on separate data frames.
>
> Further operations depend on the structure of your data frames and on what
> exactly you want to do with them.
>
> I wonder why merge is appropriate if you have separate data for each year.
> Why not use rbind followed by aggregate?
>
> But here I am fishing in really murky water.
>
> Cheers
> Petr
>
>
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Brandon
> > Payne
> > Sent: Wednesday, January 18, 2017 10:57 AM
> > To: r-help@r-project.org
> > Subject: [R] can this be simplified with purrr:map or mapply?
> >
> > I can't get my head around the *apply family of functions,
> > surely there's a better way than what I have.
> > I've tried using paste0 on the years, but then I have lists of strings,
> > not data frames.
> >
> > `realList<-paste0("exp",as.character(2004:2014),"s")`
> >
> > I can't seem to convert these lists of strings to references to data
> > frames.
> >
> > years<-c(2004:2014)
> > framesEXP<-
> > list(exp2004s,exp2005s,exp2006s,exp2007s,exp2008s,exp2009s,exp2011s,
> > exp2010s,exp2012s,exp2013s,exp2014s)
> >
> > framesFAM<-
> > list(fam2004s,fam2005s,fam2006s,fam2007s,fam2008s,fam2009s,fam2011s,
> > fam2010s,fam2012s,fam2013s,fam2014s)
> >
> > data2004<<-merge(exp2004s,fam2004s, by="HHNUM")
> > data2005<<-merge(exp2005s,fam2005s, by="HHNUM")
> > data2006<<-merge(exp2006s,fam2006s, by="HHNUM")
> > data2007<<-merge(exp2007s,fam2007s, by="HHNUM")
> > data2008<<-merge(exp2008s,fam2008s, by="HHNUM")
> > data2009<<-merge(exp2009s,fam2009s, by="HHNUM")
> > data2010<<-merge(exp2010s,fam2010s, by="HHNUM")
> > data2011<<-merge(exp2011s,fam2011s, by="HHNUM")
> > data2012<<-merge(exp2012s,fam2012s, by="HHNUM")
> > data2013<<-merge(exp2013s,fam2013s, by="HHNUM")
> > data2014<<-merge(exp2014s,fam2014s, by="HHNUM")
> >
> > dput(data2004, file="../dataframes/data2004.txt")
> > dput(data2005, file="../dataframes/data2005.txt")
> > dput(data2006, file="../dataframes/data2006.txt")
> > dput(data2007, file="../dataframes/data2007.txt")
> > dput(data2008, file="../dataframes/data2008.txt")
> > dput(data2009, file="../dataframes/data2009.txt")
> > dput(data2010, file="../dataframes/data2010.txt")
> > dput(data2011, file="../dataframes/data2011.txt")
> > dput(data2012, file="../dataframes/data2012.txt")
> > dput(data2013, file="../dataframes/data2013.txt")
> > dput(data2014, file="../dataframes/data2014.txt")
> >

Re: [R] Error in doc_parse_file

2017-01-07 Thread Ulrik Stervbo

Hi Maicel,

Please keep the list in CC.

I can't help with read_xml but perhaps someone on the list can.
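
For what it is worth, libxml2 usually reports "xmlParseEntityRef: no name" when
a file contains a bare "&" that is not part of an entity such as &amp;. A
minimal sketch, assuming that is what happens here: wrap the reader in
tryCatch() so that one bad file does not stop the whole run, and collect the
paths that fail. The two temporary files and the helper read_keywords() are
made up for illustration.

```r
library(xml2)

# Two hypothetical stand-in files: one well-formed, one with a bare "&"
good <- tempfile(fileext = ".xml")
bad <- tempfile(fileext = ".xml")
writeLines("<article><kwd>asthma</kwd><kwd>R &amp; D</kwd></article>", good)
writeLines("<article><kwd>R & D</kwd></article>", bad)

# Return the keywords of one file, or NA if the file cannot be parsed
read_keywords <- function(path) {
  tryCatch(
    xml_text(xml_find_all(read_xml(path), ".//kwd"), trim = TRUE),
    error = function(e) {
      message("Could not parse ", path, ": ", conditionMessage(e))
      NA_character_
    }
  )
}

paths <- c(good, bad)
keyword <- lapply(paths, read_keywords)
names(keyword) <- paths

# Paths of the files that failed to parse
failed <- paths[vapply(keyword, anyNA, logical(1))]
```

The files listed in failed can then be inspected by hand, or read with
read_xml(path, options = "RECOVER") if a best-effort parse is acceptable.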

Best,
Ulrik

On Fri, 6 Jan 2017, 19:03 Maicel Monzon,  wrote:

> Hi Ulrik,
>
> I meant that 'read_xmlmap' was a mistake. I did what you told me with the
> whole set and the error message is:
>
>  “Error in doc_parse_file(con, encoding = encoding, as_html = as_html,
> options = options) : xmlParseEntityRef: no name [68]”
>
> keyword <-
>   muestra %>%
>   select(path) %>%  # I am attaching all the xml files
>   unlist() %>%
>   map(.f = function(x) { read_xml(x) %>%
>     xml_find_all(".//kwd") %>%
>     xml_text(trim = TRUE) })
>
> I am attaching the xml files.
>
> Thank you
>
> Best regards
>
> Maicel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] purrr::map and xml2:: read_xml

2017-01-06 Thread Ulrik Stervbo

Hi Maicel,

I'm guessing that B works on 50 files, and that A fails because there is no
function called 'read_xmlmap'. If the function that you map works well,
removing 'dplyr::sample_n(50)' from 'B' should solve the problem.

If that is not the case, we need a bit more information.

HTH
Ulrik

On Fri, 6 Jan 2017 at 17:08  wrote:

> Hi List, I am trying to extract the keywords from 1403 papers in xml
> format. The code I wrote (A) does not work; it only works with the
> modification shown in B. But that variation is not the one I need,
> because the 1403 xml files do not match those in my folder.
> Could you please tell me where the mistakes are in the code (A or B),
> to help me correct them? The data frame columns are an id and
> the paths.
>
> A - Does not work, but it is the one I need.
>
> keyword <-
>   muestra %>%
>   select(path) %>%
>   read_xmlmap(.f = function(x) { read_xml(x) %>%
>     xml_find_all(".//kwd") %>%
>     xml_text(trim = TRUE) })
>
> B - It works, but only with a small number of papers.
>
> keyword <-
>   muestra %>%
>   select(path) %>%
>   dplyr::sample_n(50) %>%
>   unlist() %>%
>   map(.f = function(x) { read_xml(x) %>%
>     xml_find_all(".//kwd") %>%
>     xml_text(trim = TRUE) })
>
> Thank you,
> Maicel Monzon MD, PHD
