Thanks to Jim's prompting, I think I came up with a fairly painless way to
parse the HTML without having to write any parsing code myself using the
function getHTMLExternalFiles in the XML package. A working version of the
code follows:
## Code to process USGS peak flow data
Dear all,
I would appreciate your suggestions on subsetting a data frame. Please
consider an example data frame df:
dd <- c(1,2,3)
rows <- c("A1","A2","A3")
columns <- c("B1","B2","B3")
numbers <- c(400, 500, 600)
df <- data.frame(dd, rows, columns, numbers)
and a vector: test_rows <- c("A1","A3");
how could I subset
Hi, how is it possible to load a very big .RData file? It can't be loaded
because it is very big, and the following error message is displayed:
load(".RData")
Error: error reading from connection
Thanks
Carol
Hello,
I want to subset specific rows of data from 80 .csv files and write those
subsets into new .csv files. The data I want to subset starts on a
different row for each original .csv file. I've created variables that
identify which row the subset should start and end on, but I want to loop
Use is.element(elements,set), or its equivalent, elements %in% set:
df <- data.frame(dd = c(1, 2, 3),
                 rows = c("A1", "A2", "A3"),
                 columns = c("B1", "B2", "B3"),
                 numbers = c(400, 500, 600))
test_rows <- c("A1", "A3")
df[ is.element(df$rows, test_rows), ]
# dd rows
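A self-contained sketch of the same idea using %in%, assuming the column values are the quoted strings shown above. The name subset_df is introduced here for illustration only.

```r
# Example data frame, with the character values quoted
df <- data.frame(dd = c(1, 2, 3),
                 rows = c("A1", "A2", "A3"),
                 columns = c("B1", "B2", "B3"),
                 numbers = c(400, 500, 600))
test_rows <- c("A1", "A3")

# Keep only the rows whose 'rows' value appears in test_rows
subset_df <- df[df$rows %in% test_rows, ]
print(subset_df)
```

Both is.element(x, y) and x %in% y return a logical vector the same length as x, which is why either can be used directly as a row index.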
Thank you Don.
I've incorporated your suggestions which have helped me to understand how
loops work better than previously. However, the loop gets stuck trying to
read the current file:
mig.processed.data <- read.csv("/Users/cdanyluck/Documents/Studies/MIG -
Dissertation/Data
participant.files <- list.files("/Users/cdanyluck/Documents/Studies/MIG -
Dissertation/Data Syntax/MIG_RAW DATA TXT Files/Plain Text Files")
Try adding the argument full.names=TRUE to that call to list.files().
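A minimal sketch of why full.names matters, assuming a throwaway directory and a hypothetical file p01.csv (neither is from the original thread). Without full.names = TRUE, list.files() returns bare file names, which read.csv() can only find if the working directory happens to be that folder.

```r
# Hypothetical demo directory and file, created only for this illustration
demo_dir <- tempfile("participants_")
dir.create(demo_dir)
write.csv(data.frame(x = 1:3), file.path(demo_dir, "p01.csv"), row.names = FALSE)

short_names <- list.files(demo_dir)                     # file names only
full_paths  <- list.files(demo_dir, full.names = TRUE)  # directory + file name

# Reading via the full path works regardless of the working directory
dat <- read.csv(full_paths[1])
```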
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Mon, Jun 8, 2015 at 7:15 PM, Chad
So you have 80 files, one for each participant?
It appears that from each of the 80 files you want to extract three
subsets of rows,
one set for baseline
one set for audio
one set for free
What I think I would do, if the above is correct, is create one master
file. This file will have
Hi All,
I have a data set with 11000 rows and 19 columns.
I have 2 columns on which I need to summarize the data: Date and Weight.
Snapshot is:
Date        CHG_WT
13/03/2015  0
31/03/2015  0
15/03/2015  0
17/03/2015  770
17/03/2015  3,730
11/3/2015   70
11/3/2015   10
19/03/2015  500
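One way the summary asked about here might be computed is sketched below, under the assumptions that the dates are day/month/year text and that the weights contain thousands separators. The data frame name test and the result name daily are illustrative.

```r
# The eight rows from the snapshot, stored as text
test <- data.frame(
  Date   = c("13/03/2015", "31/03/2015", "15/03/2015", "17/03/2015",
             "17/03/2015", "11/3/2015", "11/3/2015", "19/03/2015"),
  CHG_WT = c("0", "0", "0", "770", "3,730", "70", "10", "500")
)

# Parse day/month/year text into Date objects
test$Date <- as.Date(as.character(test$Date), format = "%d/%m/%Y")

# Strip the thousands separator before converting to numeric
test$CHG_WT <- as.numeric(gsub(",", "", as.character(test$CHG_WT), fixed = TRUE))

# Total weight per day
daily <- aggregate(CHG_WT ~ Date, data = test, FUN = sum)
print(daily)
```

The as.character() calls are harmless for text columns and make the same code safe if the columns were read in as factors.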
Dear list,
I found an odd behavior of the mean function; it is allowed to do
something that you probably shouldn't:
If you calculate mean() of a sequence of numbers (without declaring them
as vector), mean() then just computes mean() of the first element. Is
there a reason why there is no
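A small sketch of the behavior described, for illustration only. The default method has the signature mean(x, trim = 0, na.rm = FALSE, ...), so additional positional arguments are matched to trim and na.rm rather than being treated as data.

```r
# 2 is matched to 'trim' and 3 to 'na.rm'; only x = 1 is treated as data
wrong <- mean(1, 2, 3)

# Wrapping the numbers in c() passes them as a single vector x
right <- mean(c(1, 2, 3))

print(c(wrong = wrong, right = right))
```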
Hi,
I have been doing some text mining. I created the DTM matrix using the
following steps.
corpus1 <- VCorpus(VectorSource(resume1$Dat1))
corpus1 <- tm_map(corpus1, content_transformer(tolower))
dtm <- DocumentTermMatrix(corpus1,
control = list(removePunctuation = TRUE,
On Mon, 8 Jun 2015, Christian Brandstätter wrote:
Dear list,
I found an odd behavior of the mean function; it is allowed to do something
that you probably shouldn't:
If you calculate mean() of a sequence of numbers (without declaring them as
vector), mean() then just computes mean() of the
Thanks, all of you ;)
I think I got it.
I was saving the workspace and loading it, but after that I wasn’t calling my
data ;)
Really naive.
Thanks very much.
Best,
RO
Kind regards,
Rosa Oliveira
--
Rosa
Hi,
Thanks a lot. I downloaded the tar.gz file and I found the C code.
I would really appreciate it if you could field another question:
I have to use sql, and I have to perform various statistical calculations -
like integrate, dbeta etc. Sql does not have these functions, plus they are
very
Hi
Is your Date really a Date or is it character? What is the result of
str(Date)
If you want to get summaries for dates you can use
?aggregate
However, in this case I strongly recommend showing us your data with
dput(yourdata)
and explaining, using that example, what summary you want.
I can be
Dear Jim Lemon,
Thank you very much Jim for your help.
Regards,
Pijush
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the
Aehm, adding on this: I incorrectly *assumed* without testing that rounding
would help; it doesn't:
ecdf(round(test2,0)) # a rounding that is way too rough for my application...
#Error in xy.coords(x, y) : 'x' and 'y' lengths differ
Digging deeper: The initially mentioned call to unique() is
Hi Petr,
Thanks for the explanation below.
I tried the code you supplied; however, it seems my date is a factor, hence
it is not working.
The error I got from the code was:
Error: unexpected symbol in:
final <- aggregate(test$CHG_WT,list(format(test$CR_DT,"%d"),sum)
final
str(test$CR_DT)- gives
All,
I encountered the following issue with ecdf which was originally on a vector of
length 10,000, but I have been able to reduce it to a minimal reproducible
example (just to avoid questions why I'd want to do this for a vector of length
2...):
test2 = structure(list(X817 =
Thank you for the explanation.
But if you take for instance plot.default(), being another generic
function, it would not work like that:
plot(1,2,3,4), only plot(1,2) is accepted.
From R-help (Usage):
## Default S3 method:
mean(x, trim = 0, na.rm = FALSE, ...)
What is puzzling, is that
Hi All,
Kindly see the code below that I have used:
maxorder <- ddply(test, ~ ORIGIN, summarize, Weight = sum(CHG_WT))
Here I have written the code to summarize values based on origin and total
weight; however, I am getting the error below:
Error: ‘sum’ not meaningful for factors
Please advise. I need CHG_WT
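A hedged sketch of what typically lies behind this error: a numeric-looking column that was read in as a factor, for example because some values contain a thousands separator. The sample values and the origin codes below are hypothetical, and base R's tapply() is used in place of ddply() so the example has no package dependency.

```r
# Hypothetical weights stored as a factor, as happens when "3,730" is present
chg_wt <- factor(c("0", "770", "3,730", "70"))
origin <- c("DEL", "DEL", "BOM", "BOM")   # hypothetical origin codes

# Pitfall: as.numeric() on a factor returns the internal level codes
codes <- as.numeric(chg_wt)

# Convert via character, removing the separator, to get the real values
weights <- as.numeric(gsub(",", "", as.character(chg_wt), fixed = TRUE))

# Total weight per origin
by_origin <- tapply(weights, origin, sum)
print(by_origin)
```

Once the column is numeric, the original ddply() call should no longer raise the factor error.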
On 07/06/2015 11:05 PM, Varun Sinha wrote:
Hi,
Thanks a lot. I downloaded the tar.gz file and I found the C code.
I would really appreciate it if you could field another question:
I have to use sql, and I have to perform various statistical
calculations - like integrate, dbeta etc. Sql
David, I appreciate your suggestion, but it won't work for me. I need to replace
the space for a period at the time the data are read, not afterward. My
variable names have periods I want to keep; if I use your suggestion I will
replace the period inserted when the data are read, as well as the
John,
I like using stringr or stringi for this type of thing. stringi is written in C
and faster so I now typically use it. You can also use base functions. The main
trick is the handy names() function.
example <- data.frame("Col 1 A" = 1:3, "Col 1 B" = letters[1:3])
example
Col.1.A Col.1.B
1
You can use gsub() to change the names:
dat <- data.frame("Var 1"=rnorm(5, 10), "Var 2"=rnorm(5, 15))
dat
      Var.1    Var.2
1  9.627122 14.15376
2 10.741617 16.92937
3  8.492926 15.23767
4 12.226146 15.19834
5  8.829982 14.46957
names(dat) <- gsub("\\.", "_", names(dat))
dat
      Var_1    Var_2
1
Hi
-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Shivi82
Sent: Monday, June 08, 2015 11:23 AM
To: r-help@r-project.org
Subject: Re: [R] Summarizing data based on Date
Hi Petr,
Thanks for the explanation below.
I tried the code you supplied
I am reading a csv file. The column headers have spaces in them. The spaces are
replaced by a period. I want to replace the space by another character (e.g.
the underscore) rather than the period. Can someone tell me how to accomplish
this?
Thank you,
John
John David Sorkin M.D., Ph.D.
Professor
Hi
-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Shivi82
Sent: Monday, June 08, 2015 12:48 PM
To: r-help@r-project.org
Subject: [R] Cannot Sum with DDPLY
Hi All,
Kindly see the below code I have used:
maxorder <- ddply(test, ~
Easiest? Use sub() to replace the periods after the fact.
You can also use the check.names or the col.names arguments to
read.table() to customize your import.
Sarah
On Mon, Jun 8, 2015 at 10:15 AM, John Sorkin
jsor...@grecc.umaryland.edu wrote:
I am reading a csv file. The column headers have
On 08/06/2015 6:04 AM, Christian Brandstätter wrote:
Thank you for the explanation.
But if you take for instance plot.default(), being another generic
function, it would not work like that:
plot(1,2,3,4), only plot(1,2) is accepted.
From R-help (Usage):
## Default S3 method:
mean(x,
Thank you very much, I didn't know that.
On 08/06/2015 6:04 AM, Christian Brandstätter wrote:
Thank you for the explanation.
But if you take for instance plot.default(), being another generic
function, it would not work like that:
plot(1,2,3,4), only plot(1,2) is accepted.
From R-help
Then using Sarah's suggestion something like?
dat <- read.table(text="
+ 'Var 1' Var.2
+ 1 6
+ 2 7
+ 3 8
+ 4 9
+ 5 10", header=TRUE, col.names=c("Var_1", "Var.2"))
dat
  Var_1 Var.2
1     1     6
2     2     7
3     3     8
4     4     9
5     5    10
David C
From: John Sorkin
I've taken the liberty of copying this back to the list, so that others can
participate in or benefit from the discussion.
On Mon, Jun 8, 2015 at 10:49 AM, John Sorkin jsor...@grecc.umaryland.edu
wrote:
Sarah,
I am not sure how I use check.names to replace every space in the names of
my
On 08/06/2015 10:23 AM, Sarah Goslee wrote:
Easiest? Use sub() to replace the periods after the fact.
You can also use the check.names or the col.names arguments to
read.table() to customize your import.
Yes, check.names is the right idea. Use check.names = FALSE, then use
sub() or gsub()
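A minimal sketch of that suggestion, assuming an inline CSV with hypothetical headers. With check.names = FALSE the headers are kept exactly as written, so periods that were already part of a name are not confused with periods substituted for spaces.

```r
# Hypothetical CSV: one header has both a genuine period and a space
csv_text <- "Var.1 A,Var 2\n1,6\n2,7\n"

# Keep the headers verbatim instead of letting make.names() alter them
dat <- read.csv(text = csv_text, check.names = FALSE)

# Replace only the spaces; existing periods are left untouched
names(dat) <- gsub(" ", "_", names(dat), fixed = TRUE)
print(names(dat))
```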
Dear R users,
It is my pleasure to announce the availability of package stepR (1.0-2)
on CRAN.
The main purpose of the package is to fit piecewise constant functions
(a.k.a. step-functions or block signals) to serial data in a fully
data-driven manner under certain (Gaussian or
Dear R users,
It is a pleasure for me to announce the availability of the new package
kwb.hantush (0.2.1) on CRAN. Its objective is the calculation of groundwater
mounding beneath a (stormwater) infiltration basin by solving the Hantush
(1967) equation. For checking the correct
Dear UseRs,
wikipediatrend - a package to retrieve Wikipedia page access statistics -
has jumped from version 0.2 to 1.1.3 and now is more streamlined, feature
richer, more tested and comes with a vignette as well as a lot of fun.
package information:
On 08 Jun 2015, at 17:03 , Sarah Goslee sarah.gos...@gmail.com wrote:
I'd import them with check.names=FALSE, then modify them explicitly:
mynames <- c("x y", "x y", "x y", "x y")
mynames
[1] "x y" "x y" "x y" "x y"
mynames <- sub(" ", ".", mynames)
mynames
[1] "x.y" "x.y" "x.y" "x.y"
mynames <- paste(mynames,
Sarah,
Many, many thanks.
John
John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and
Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
mynames
[1] "x.y" "x.y" "x.y" "x.y"
mynames <- paste(mynames, seq_along(mynames), sep="_")
In addition, if there were a variety of names in mynames and you
wanted to number each unique name separately you could use ave():
origNames <- c("X", "Y", "Y", "X", "Z", "X")
ave(origNames, origNames,
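The call above is cut off in the archive, so the following is not a reconstruction of it. It is one possible sketch of numbering each unique name separately with ave(), under the assumption that a running count within each group of identical names is wanted. The names idx and numbered are introduced here for illustration.

```r
origNames <- c("X", "Y", "Y", "X", "Z", "X")

# Running count within each group of identical names
idx <- ave(seq_along(origNames), origNames, FUN = seq_along)

# Append the per-name counter to each name
numbered <- paste(origNames, idx, sep = "_")
print(numbered)
```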
Dear Javier Villacampa González,
It has been a long time since I used awk or gawk, but I remember cygwin, and
personally I had no problems with awk.
I don't know the state of that technology today, but I did evaluate using R
with awk; on that subject there is an integration in
In the end it turned out to be easier than expected. You have to install cygwin
and use the commands as follows:
system('C:/cygwin/bin/wc -l var_risco_2012.csv')
This works, in principle.
On 8 June 2015 at 17:41, Carlos Ortega c...@qualityexcellence.es
wrote:
Hello,
Look at this:
Hello,
I don't know whether this Stack Overflow answer can help you:
http://stackoverflow.com/questions/26938118/check-for-dialog-box-in-rselenium
Regards,
Carlos Ortega
www.qualityexcellence.es
On 8 June 2015 at 9:09, José Luis Cañadas Reche canadasre...@gmail.com
wrote:
Thanks
Hello,
yes, I know that [] is used to work with a data.table, but in this line I am
creating the data.table; that is, data.table is the function I am calling and
PesosParam is the data.table I am creating.
Regards,
MªLuz Morales
On 8 June 2015 at 15:17, Carlos J. Gil Bellosta c...@datanalytics.com
Hello,
I want to build a data.table in which one column (Parametros) is character
and another is the result of the function information.gain, which returns
a data.frame. This is the code I used, but it gives me an error:
PesosParam <- data.table(,.(Parametros, Peso:=
Hello, how are you?
data.table works with square brackets, not parentheses. Have you read the
vignette/tutorial?
Regards,
Carlos J. Gil Bellosta
http://www.datanaytics.com
On 8 June 2015 at 15:14, MªLuz Morales mlzm...@gmail.com wrote:
Hello,
I want to build a data.table in which one column
Solved.
It was indeed a notation problem. This works:
PesosParam <- data.table(Param = Parametros, Peso =
information.gain(In.hospital_death~., ParamCol))
Note: Parametros is a character vector.
Thanks,
Regards,
MªLuz
On 8 June 2015 at 16:08, Carlos Ortega
Hello,
I sometimes use Unix shells from R. Is there a way to use these shells, or
the awk language, from Windows?
The idea is always to do it from R; perhaps it is possible by invoking cygwin
from Windows, but it is not clear to me.
Best wishes and thanks in advance,
Javier
Hello,
Look at this:
http://stackoverflow.com/questions/18603984/using-system-with-windows
Regards,
Carlos Ortega
www.qualityexcellence.es
On 8 June 2015 at 17:14, Javier Villacampa González
javier.villacampa.gonza...@gmail.com wrote:
Hello,
I sometimes use Unix shells from
50 matches