[R] Restructuring Star Wars data from rwars package

2017-08-03 Thread Matt Van Scoyoc
I'm having trouble restructuring data from the rwars package into a
dataframe. Can someone help me?

Here's what I have...

library("rwars")
library("tidyverse")

# The API returns JSON, which rwars parses into a nested list
people <- get_all_people(parse_result = TRUE)
# Fetch the next page of results
people <- get_all_people(getElement(people, "next"), parse_result = TRUE)

# Look at Anakin Skywalker's data
people$results[[1]]
people$results[[1]][1] # print his name

# To use them in R, I need to restructure them into a dataframe like the
# starwars data in dplyr
data("starwars")
glimpse(starwars)

Thanks for the help.

Cheers,
MVS
=
Matthew Van Scoyoc
=
Think SNOW!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
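
One way to approach this kind of restructuring, sketched in base R. The `results` list below is a small hypothetical stand-in for `people$results` (the field names and values are assumptions, not output from rwars), so the code runs without calling the API. Each record becomes one row, and a variable-length field is kept as a list column.

```r
# Hypothetical stand-in for people$results: one list per character.
# Field names and values are illustrative assumptions, not real API output.
results <- list(
  list(name = "Character A", height = "188", films = list("film/5", "film/6")),
  list(name = "Character B", height = "180", films = list("film/1"))
)

# Turn the scalar fields of each record into a one-row data frame, then stack the rows
people_df <- do.call(rbind, lapply(results, function(p) {
  data.frame(
    name = p$name,
    height = as.numeric(p$height),
    stringsAsFactors = FALSE
  )
}))

# Keep the variable-length field as a list column, one character vector per row
people_df$films <- lapply(results, function(p) unlist(p$films))

str(people_df)
```

The same pattern should carry over to the real list as long as every record has the scalar fields being extracted; missing fields would need to be replaced with NA first.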


Re: [R] find similar words in text

2017-08-03 Thread Patrick Casimir
Use the tm package: create a corpus, build a term-document matrix (TDM) from 
it, and apply as.matrix() to the TDM to display the terms' occurrences. The tm 
documentation on CRAN has the details.

From: R-help  on behalf of Boris Steipe 

Sent: Thursday, August 3, 2017 6:40:09 PM
To: Riaan Van Der Walt
Cc: R lists
Subject: Re: [R] find similar words in text

Please keep messages on the list so others can pitch in.

_Which_ words do you want to consider identical for the purpose of frequency 
count?
_What_ do you want to plot?



B.



> On Aug 3, 2017, at 4:36 PM, Riaan Van Der Walt  
> wrote:
>
> Hallo Boris,
> I've loaded the Rstem, Snowball.
> But I am clueless how to get a list eg. whal* (whale, whales, whaling, 
> whaler, whalers, whaleman, whalemen, whale-ship, whale-boat, whale's)
> in the book Moby Dick and the frequency of each of the different words.
> I'm using this script:
>
> whales1.v <- grep("^whal.*", moby.word.v)
> whales1.v
>
> The total occurrence for whal* is 1699.
> But I can't display it or plot it.
>
> I am new to R and the learning curve is steep!!
>
> Thx!
> Riaan
>
>
> Riaan van der Walt
> Tel / Phone / Mogala : 27+72+2172429
> Email / Epos / Emeile: riaan.vanderw...@nwu.ac.za
> Url: http://www.nwu.ac.za/
>
> >>> Boris Steipe  31 Jul 2017 23:37 >>>
> You need a stemming algorithm. See here:
>   https://cran.r-project.org/web/views/NaturalLanguageProcessing.html
>
> Myself, I've had good experience with Rstem.
>
> B.
>
>
>
>
>
> > On Jul 31, 2017, at 4:47 PM, Riaan Van Der Walt 
> >  wrote:
> >
> > I am new to R.
> > Busy with Text Analysis.
> >
> > Need a script to find e.g
> >
> > whale, whales, whale's, whaler, whalers, whaling,... in Moby Dick
> >
> > Riaan
>
> 



Re: [R] find similar words in text

2017-08-03 Thread Boris Steipe
Please keep messages on the list so others can pitch in.

_Which_ words do you want to consider identical for the purpose of frequency 
count?
_What_ do you want to plot?



B.



> On Aug 3, 2017, at 4:36 PM, Riaan Van Der Walt  
> wrote:
> 
> Hallo Boris,
> I've loaded the Rstem, Snowball.
> But I am clueless how to get a list eg. whal* (whale, whales, whaling, 
> whaler, whalers, whaleman, whalemen, whale-ship, whale-boat, whale's) 
> in the book Moby Dick and the frequency of each of the different words.
> I'm using this script:
>  
> whales1.v <- grep("^whal.*", moby.word.v) 
> whales1.v
>  
> The total occurrence for whal* is 1699.
> But I can't display it or plot it.
>  
> I am new to R and the learning curve is steep!!
>  
> Thx!
> Riaan
> 
> 
> Riaan van der Walt
> Tel / Phone / Mogala : 27+72+2172429
> Email / Epos / Emeile: riaan.vanderw...@nwu.ac.za 
> Url: http://www.nwu.ac.za/
>  
> >>> Boris Steipe  31 Jul 2017 23:37 >>>
> You need a stemming algorithm. See here:
>   https://cran.r-project.org/web/views/NaturalLanguageProcessing.html
> 
> Myself, I've had good experience with Rstem.
> 
> B.
> 
> 
> 
> 
> 
> > On Jul 31, 2017, at 4:47 PM, Riaan Van Der Walt 
> >  wrote:
> > 
> > I am new to R.
> > Busy with Text Analysis.
> > 
> > Need a script to find e.g 
> > 
> > whale, whales, whale's, whaler, whalers, whaling,... in Moby Dick
> > 
> > Riaan
> 
> 



Re: [R] Strange behaviour to download zip file using R

2017-08-03 Thread David Winsemius

> On Aug 3, 2017, at 11:38 AM, Christofer Bogaso  
> wrote:
> 
> Hi again,
> 
> I was trying to download stock market data from below link :
> 
> https://www.nseindia.com/products/content/equities/equities/archieve_eq.htm
> 
> Input choice :
> 
> Select Report: Bhavcopy
> Date(DD-MM-): 03-03-2010
> 
> If you put manual input as above, then we will get option for manual
> download of file :
> 
> cm03MAR2010bhav.csv.zip
> 
> However I then tried to use R to have some automatic download :
> 
>> download.file('https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip',
>>  'aa.zip')
> 
> trying URL 
> 'https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip'
> 
> Error in 
> download.file("https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip",
> :
> 
>  cannot open URL
> 'https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip'
> 
> In addition: Warning message:
> 
> In 
> download.file("https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip",
> :
> 
>  cannot open URL
> 'https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip':
> HTTP status was '403 Forbidden'
> 
> Of course, if I place the direct link below
> 'https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip'
> in address-bar of my Chrome, I am denied permission
> 
> Do you have any idea what is going on here?

Yes. They are trying to prevent you from accessing their files against their 
terms of service: Item # 12 in their TOS:

• You may not conduct any systematic or automated data collection 
activities (including scraping, data mining, data extraction and data 
harvesting) on or in relation to our website without our express written 
consent.

> Do I need to get some setting?

No, you just need to obey the law.

> 
> Any pointer will be highly appreciated.
> 
> 

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   
-Gehm's Corollary to Clarke's Third Law


[R] Strange behaviour to download zip file using R

2017-08-03 Thread Christofer Bogaso
Hi again,

I was trying to download stock market data from below link :

https://www.nseindia.com/products/content/equities/equities/archieve_eq.htm

Input choice :

Select Report: Bhavcopy
Date(DD-MM-): 03-03-2010

If you put manual input as above, then we will get option for manual
download of file :

cm03MAR2010bhav.csv.zip

However I then tried to use R to have some automatic download :

> download.file('https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip',
>  'aa.zip')

trying URL 
'https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip'

Error in 
download.file("https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip",
 :

  cannot open URL
'https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip'

In addition: Warning message:

In 
download.file("https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip",
 :

  cannot open URL
'https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip':
HTTP status was '403 Forbidden'

Of course, if I place the direct link below
'https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip'
in address-bar of my Chrome, I am denied permission

Do you have any idea what is going on here? Do I need to get some setting?

Any pointer will be highly appreciated.



Re: [R] find similar words in text

2017-08-03 Thread Bert Gunter
You really need to spend some time learning R if you wish to use R.

See ?grep and note the "value" argument. So you want:

whales.v <- grep("^whal.*", moby.word.v, value = TRUE)

-- Bert
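
A short sketch of how the matched words could then be counted and plotted. The `moby.word.v` vector here is a tiny made-up sample standing in for the real word vector from the book, so the counts are illustrative only.

```r
# Made-up sample standing in for the full Moby Dick word vector
moby.word.v <- c("whale", "ship", "whales", "whaling", "whale",
                 "sea", "whaler", "whale's")

# Without value = TRUE, grep returns positions in the vector
whale.idx <- grep("^whal", moby.word.v)

# With value = TRUE, grep returns the matching words themselves
whale.words <- grep("^whal", moby.word.v, value = TRUE)

# Frequency of each distinct matching word, most frequent first
whale.freq <- sort(table(whale.words), decreasing = TRUE)
print(whale.freq)

# Bar chart of the counts (written to a file when run non-interactively)
barplot(whale.freq, las = 2)
```

Here length(whale.words) gives the total number of matches, which corresponds to the single total reported earlier in the thread.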




On Thu, Aug 3, 2017 at 5:14 AM, Riaan Van Der Walt
 wrote:
> I received this from Matt Jockers and it worked!
> I missed something.
> How can I now see(display) this list?
> Hi Riann,
>
> There are a couple of ways that you could do this. . . the best
> approach would probably be to use *grep* instead of *which*, but let me
> show you both ways.
>
> On page 30, replace
> whales.v <- which(moby.word.v == "whale")
> with
> whale_words <- c("whale", "whales", "whale's", "whaler", "whalers",
> "whaling")
> whales.v <- which(moby.word.v %in% whale_words)
>
> the alternative (better) way to do this, with grep, looks like this
>
> whales.v <- grep("^whal.*", moby.word.v)
>
> grep uses the regular expression ^whal.* to find all words starting (^)
> with *whal* followed by any number of other characters (.*)
>
> All best,
>
> Matt
>
> --
> Matthew L. Jockers
> Associate Dean for Research and Partnerships
> College of Arts & Sciences
> Susan J. Rosowski Associate Professor of English
> University of Nebraska-Lincoln
> 1223 Oldfather Hall
> P.O. Box 880312
> Lincoln, NE  68588-0312
> 402.472.2891
> www.matthewjockers.net
>
> I am new to R.
> Busy with Text Analysis.
>
> Need a script to find e.g
>
> whale, whales, whale's, whaler, whalers, whaling,... in Moby Dick
>
> Riaan


Re: [R] test for proportion or concordance

2017-08-03 Thread Bert Gunter
This list is about R programming, not statistics, although admittedly
there is a nonempty intersection. However, I think you would do better
posting this on a statistics list like stats.stackexchange.com.

-- Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Aug 3, 2017 at 7:19 AM, Adrian Johnson
 wrote:
> Hello group,
>
> my question is deciding what test would be appropriate for following question.
>
> An experiment 'A' yielded 3200 observations of which 431 are
> significant. Similarly, using same method, another experiment 'B' on a
> different population yielded 2541 observations of which 260 are
> significant.
>
> There are 180 observations that are common between significant
> observations of A and B.
> (180 are common between 431 and 260).
>
> 251 observations are specific to A (431 - 180).
> 80 observations are specific to B (260 - 180).
>
> The question is whether the 180 observations that are common between A and B
> could be occurring by chance.
>
> What test would be appropriate for this scenario? (If the total number of
> observations were the same in experiments A and B, I could use
> Cohen's kappa for concordance, chi-square, etc.)
> Since the total observations differ between experiments A and B, I
> don't know what test would be appropriate. I appreciate your help.
>
> thanks
>


[R] find similar words in text

2017-08-03 Thread Riaan Van Der Walt
I received this from Matt Jockers and it worked!
I missed something.
How can I now see(display) this list?
Hi Riann, 

There are a couple of ways that you could do this. . . the best
approach would probably be to use *grep* instead of *which*, but let me
show you both ways.

On page 30, replace
whales.v <- which(moby.word.v == "whale")
with
whale_words <- c("whale", "whales", "whale's", "whaler", "whalers",
"whaling")
whales.v <- which(moby.word.v %in% whale_words)

the alternative (better) way to do this, with grep, looks like this

whales.v <- grep("^whal.*", moby.word.v)

grep uses the regular expression ^whal.* to find all words starting (^)
with *whal* followed by any number of other characters (.*)

All best,

Matt

--
Matthew L. Jockers
Associate Dean for Research and Partnerships
College of Arts & Sciences
Susan J. Rosowski Associate Professor of English
University of Nebraska-Lincoln
1223 Oldfather Hall
P.O. Box 880312
Lincoln, NE  68588-0312
402.472.2891
www.matthewjockers.net
 
I am new to R.
Busy with Text Analysis.
 
Need a script to find e.g 
 
whale, whales, whale's, whaler, whalers, whaling,... in Moby Dick
 
Riaan


[R] test for proportion or concordance

2017-08-03 Thread Adrian Johnson
Hello group,

My question is which test would be appropriate for the following problem.

An experiment 'A' yielded 3200 observations of which 431 are
significant. Similarly, using same method, another experiment 'B' on a
different population yielded 2541 observations of which 260 are
significant.

There are 180 observations that are common between significant
observations of A and B.
(180 are common between 431 and 260).

251 observations are specific to A (431 - 180).
80 observations are specific to B (260 - 180).

The question is whether the 180 observations that are common between A and B
could be occurring by chance.

What test would be appropriate for this scenario? (If the total number of
observations were the same in experiments A and B, I could use
Cohen's kappa for concordance, chi-square, etc.)
Since the total observations differ between experiments A and B, I
don't know what test would be appropriate. I appreciate your help.

thanks



Re: [R-es] problem converting a "factor" column to "numeric" in data.table

2017-08-03 Thread eric

  
  
Thanks to everyone for the suggestions ...

Carlos: I was only doing it inside data.table because the
  data file already had that class :) ...

Inside data.table it worked fine by converting first to
  character and then to numeric, as Fernando suggests.

Thanks again.

eric.






On 08/03/2017 09:34 AM, Javier Marcuzzi
  wrote:


  Dear all,

My approach, as an example:

Datos$Columna <- as.factor(Datos$Columna)

Or as.numeric

Javier Rubén Marcuzzi

From: Fernando Macedo
Sent: Wednesday, 2 August 2017 21:42
To: Carlos Ortega
Cc: Lista R
Subject: Re: [R-es] problem converting a "factor" column to "numeric" in data.table

I think the problem is that when you convert it directly to numeric, R 
takes the internal level codes and converts those. Factors have levels 
with labels, so to speak, and the labels are what we see. For example, 
we may see males and females, while internally the levels are 
1 and 2.
If that is the problem, I solve it by converting first to 
character and then to numeric.
It would look like this:

datos$coltipofactor = as.numeric(as.character(datos$coltipofactor))

Try that and see whether that was it.

--
Fernando Macedo



On 02/08/17 at 14:49, Carlos Ortega wrote:

  
Hello,

Doing it inside data.table doesn't really offer many advantages either...

datos$coltipofactor <- as.factor(datos$coltipofactor)

Regards,
Carlos Ortega
www.qualityexcellence.es


On 2 August 2017, 19:16, eric wrote:



  Dear community, I'd like to ask for your help with a problem that seems
simple but that I don't know how to solve. I want to convert a
"factor" column to "numeric" in a data.table, but when I do it
like this:

datos[, coltipofactor:=as.numeric(coltipofactor)]

it takes the values in "coltipofactor" and changes them to values that run
consecutively after the ones that were in the column. Let me explain:
"coltipofactor" contains numbers from 1 to 12, which represent months. When I
convert the column to numeric, 1 becomes 13, 2 becomes 14, 3 becomes 15, and so on ...

What am I doing wrong? How is it done properly? Or can it not be done?

This had happened to me before and I fixed it by hand, but now
there is a lot of data and surely there is a correct way to do it.

Many thanks,

Eric.

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es






  
  




-- 
Forest Engineer
Master in Environmental and Natural Resource Economics
Ph.D. student in Sciences of Natural Resources at La Frontera University
Member in AguaDeTemu2030, citizen movement for Temuco with green city standards for living

Note: Accents have been omitted to ensure compatibility with some mail readers.

  



Re: [R-es] problem converting a "factor" column to "numeric" in data.table

2017-08-03 Thread Javier Marcuzzi
Dear all,

My approach, as an example:

Datos$Columna <- as.factor(Datos$Columna)

Or as.numeric

Javier Rubén Marcuzzi

From: Fernando Macedo
Sent: Wednesday, 2 August 2017 21:42
To: Carlos Ortega
Cc: Lista R
Subject: Re: [R-es] problem converting a "factor" column to "numeric" in data.table

I think the problem is that when you convert it directly to numeric, R 
takes the internal level codes and converts those. Factors have levels 
with labels, so to speak, and the labels are what we see. For example, 
we may see males and females, while internally the levels are 
1 and 2.
If that is the problem, I solve it by converting first to 
character and then to numeric.
It would look like this:

datos$coltipofactor = as.numeric(as.character(datos$coltipofactor))

Try that and see whether that was it.
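
A small illustration of the behaviour described above, using a made-up factor (the values are assumptions chosen to make the difference visible). Converting a factor directly with as.numeric() returns the internal level codes, while going through as.character() recovers the numbers shown in the labels.

```r
# Made-up factor whose labels are numbers stored as text
f <- factor(c("10", "2", "33", "2"))

# Levels are sorted as strings: "10", "2", "33"
levels(f)

# Direct conversion returns the internal codes of those levels: 1 2 3 2
codes <- as.numeric(f)

# Converting to character first recovers the labelled values: 10 2 33 2
values <- as.numeric(as.character(f))

# An equivalent form that converts each level only once
values2 <- as.numeric(levels(f))[f]

print(codes)
print(values)
```

Inside a data.table the same conversion would be written with `:=`, for example datos[, coltipofactor := as.numeric(as.character(coltipofactor))].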

--
Fernando Macedo



On 02/08/17 at 14:49, Carlos Ortega wrote:
> Hello,
>
> Doing it inside data.table doesn't really offer many advantages either...
>
> datos$coltipofactor <- as.factor(datos$coltipofactor)
>
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
>
>
> On 2 August 2017, 19:16, eric wrote:
>
>> Dear community, I'd like to ask for your help with a problem that seems
>> simple but that I don't know how to solve. I want to convert a
>> "factor" column to "numeric" in a data.table, but when I do it
>> like this:
>>
>> datos[, coltipofactor:=as.numeric(coltipofactor)]
>>
>> it takes the values in "coltipofactor" and changes them to values that run
>> consecutively after the ones that were in the column. Let me explain:
>> "coltipofactor" contains numbers from 1 to 12, which represent months. When I
>> convert the column to numeric, 1 becomes 13, 2 becomes 14, 3 becomes 15, and so on ...
>>
>> What am I doing wrong? How is it done properly? Or can it not be done?
>>
>> This had happened to me before and I fixed it by hand, but now
>> there is a lot of data and surely there is a correct way to do it.
>>
>> Many thanks,
>>
>> Eric.
>>
>>
>
>



Re: [R] climate data-set; aggregate date (day)

2017-08-03 Thread Thierry Onkelinx
library(dplyr)
library(lubridate)
# Timestamp must be a date-time (e.g. POSIXct) for floor_date() to work
data %>%
  group_by(day = floor_date(Timestamp, unit = "day")) %>%
  summarise(rain = sum(Rain_mm_Tot))
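
For comparison, a base R sketch of the same daily sum using aggregate(). The `rain` data frame is a small illustrative sample with the column names from the question; the values are assumptions, and the time zone is fixed to UTC so the day boundaries are unambiguous.

```r
# Illustrative sample with the column names from the question
rain <- data.frame(
  Timestamp = as.POSIXct(c("2017-05-29 23:40:00", "2017-05-29 23:50:00",
                           "2017-05-30 00:10:00", "2017-05-30 00:20:00"),
                         tz = "UTC"),
  Rain_mm_Tot = c(4.78, 1.2, 2.58, 1.2)
)

# Reduce each timestamp to its calendar day
rain$day <- as.Date(rain$Timestamp, tz = "UTC")

# Sum the rainfall within each day
daily <- aggregate(Rain_mm_Tot ~ day, data = rain, FUN = sum)
print(daily)
```

If Timestamp is read in as text, it has to be converted with as.POSIXct() first, as in the sample above.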

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2017-08-03 12:31 GMT+02:00 T Wan :

> Hi there,
>
> I am trying to get the sum of rain per day.
>
> That is what the data-set looks like:
>
>
> Timestamp           Rain_mm_Tot
> 2017-05-29 23:40:00 4.780
> 2017-05-29 23:50:00 1.200
> 2017-05-30 00:10:00 2.580
> 2017-05-30 00:20:00 1.2009600
> 2017-05-30 00:30:00 1.206
> 2017-05-30 00:40:00 2.5002480
>
> First I tried to define the column as date format with:
> data_date <-data[as.Date(data_date$Timestamp)<=as.Date("2017-06-15"),]
>
>
> Then I tried to aggregate as followed:
> data_date_sum <- aggregate(data_date$Timestamp, on = "days", k=1,
> by=list(cut(as.Date(data_date$Date))), FUN=sum)
>
>
> R responds:
> Error in as.Date.default(data_date$Date) :
>do not know how to convert 'data_date$Date' to class “Date”
>
>
> I read several posts, but none fits my format.
> Because I am a beginner, I am also open to completely new ideas for
> summing the daily rain.
>
> Thanks a lot and all the best
> Torben
>
>

[R] climate data-set; aggregate date (day)

2017-08-03 Thread T Wan
Hi there,

I am trying to get the sum of rain per day.

That is what the data-set looks like:


Timestamp           Rain_mm_Tot
2017-05-29 23:40:00 4.780
2017-05-29 23:50:00 1.200
2017-05-30 00:10:00 2.580
2017-05-30 00:20:00 1.2009600
2017-05-30 00:30:00 1.206
2017-05-30 00:40:00 2.5002480

First I tried to define the column as date format with:
data_date <-data[as.Date(data_date$Timestamp)<=as.Date("2017-06-15"),]


Then I tried to aggregate as followed:
data_date_sum <- aggregate(data_date$Timestamp, on = "days", k=1, 
by=list(cut(as.Date(data_date$Date))), FUN=sum)


R responds:
Error in as.Date.default(data_date$Date) :
   do not know how to convert 'data_date$Date' to class “Date”


I read several posts, but none fits my format.
Because I am a beginner, I am also open to completely new ideas for 
summing the daily rain.

Thanks a lot and all the best
Torben



[R] Results of vcovCL (sandwich) and of cluster() in Stata

2017-08-03 Thread Igor Sosa Mayor
Hi,

I'm trying to reproduce with R the results of this study:
https://learn.gold.ac.uk/mod/resource/view.php?id=262406

More precisely, I want to reproduce the results of Table 6 (p. 280),
which can also be seen here: 
http://picpaste.de/pics/table-robin-llKCOeWV.1501745645.png

Let's take the first column: we have a coeff. of 0.097 and a SE of
0.026, which represents clustered robust standard errors. If I try to
reproduce in R the analysis, I get the same coefficient, but I'm not
able to get the same SE.

The author made the stata file available here:
https://drive.google.com/file/d/0B_QoCd-1jkVXTmNDWmViWkJFdmM/edit?usp=sharing
(see: http://www.jaredcrubin.com/research)

To make the regression, he uses (as far as I can understand the stata
code) the following command:

local conditions "city != "Mainz" & city != "Wittenberg" & city != "Zürich""
reg prot1530 press if `conditions' & pop1500 != ., noconstant robust cluster(territory)

I'm trying to translate this into R-code doing the following: 

library(foreign)
library(dplyr)
library(lmtest)
library(sandwich)

# the data are here:
# https://drive.google.com/file/d/0B_QoCd-1jkVXRGdUMTlkYTNiNGc/edit?usp=sharing
cities <- read.dta("data/Printing_and_Protestants_Data-ReStat.dta")

# we filter the data
cities <- filter(cities, !is.na(pop1500))
cities <- filter(cities, city != "Zürich" & city != "Mainz" &
city != "Wittenberg")

# the model
m1 <- lm(prot1530 ~ press - 1, data = cities)
# the clustered standard errors
coeftest(m1, vcov. = vcovCL(m1, cluster=cities$territory))

I tried different types (HC1, HC2, etc.), but the SE never matches the
value in the table.

Any ideas?

Many thanks in advance.

-- 
:: Igor Sosa Mayor :: joseleopoldo1...@gmail.com ::
:: GnuPG: 0x1C1E2890   :: http://www.gnupg.org/  ::
:: jabberid: rogorido  ::::
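
One way to narrow down such a discrepancy is to compute the clustered standard error by hand and compare it with both programs. The sketch below uses simulated data (all variable names and values are assumptions, not the study's data) and a no-intercept model like the one above. The finite-sample factor G/(G-1) * (N-1)/(N-K) is, as far as I know, the one Stata applies for regress with cluster(), but that should be checked against the Stata documentation.

```r
set.seed(1)

# Simulated data: 200 observations in up to 20 clusters (illustrative only)
n <- 200
d <- data.frame(g = sample(1:20, n, replace = TRUE),
                x = rbinom(n, 1, 0.4))
d$y <- 0.1 * d$x + rnorm(n)

# No-intercept model, mirroring lm(prot1530 ~ press - 1)
m <- lm(y ~ x - 1, data = d)

X <- model.matrix(m)   # N x K design matrix
u <- resid(m)          # residuals

# Sandwich pieces: bread = (X'X)^-1, meat = sum over clusters of s_g s_g'
bread  <- solve(crossprod(X))
scores <- rowsum(X * u, d$g)   # per-cluster sums of x_i * u_i
meat   <- crossprod(scores)

# Finite-sample adjustment (assumed to match Stata's clustered default)
G <- nrow(scores); N <- nrow(X); K <- ncol(X)
adj <- (G / (G - 1)) * ((N - 1) / (N - K))

V  <- adj * bread %*% meat %*% bread
se <- sqrt(diag(V))
print(se)
```

If the hand-computed value agrees with vcovCL() but not with the published table, the difference is more likely to come from the sample (filters, missing values, cluster coding) than from the variance formula.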



Re: [R] switch of cex adjustment with mfrow?

2017-08-03 Thread S Ellison
> use
> 
> par(mfrow=c(2,2), cex = 1)

This does work as written. But when I first checked single-call setting, an  
mfrow change to cex in the same call superseded cex=1; hence my suggestion to 
use separate calls to par().
Further checking confirms that the result of a call to par is dependent on 
argument specification order in the call:

par(mfrow=c(2,2), cex = 1)
par("cex")
# 1

par(cex=1, mfrow=c(2,2))
par("cex")
# 0.83

This obviously isn't a problem, though as an aside I didn't immediately find 
any comment on the effect of argument order in ?par. It just means you have to 
be careful exactly what you specify.
It may also mean that for future-proofing against possible changes in par 
execution order, you might want to use separate calls anyway.
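
A minimal sketch of the separate-calls approach, run on a null PDF device so that it does not open a window or write a file. The variable names are only for illustration.

```r
# Open a PDF device that writes no file, so par() has a device to act on
pdf(NULL)

# Setting mfrow to a 2 x 2 layout reduces the base cex
par(mfrow = c(2, 2))
cex_after_mfrow <- par("cex")

# A separate, later call restores cex regardless of argument order
par(cex = 1)
cex_final <- par("cex")

dev.off()

print(c(after_mfrow = cex_after_mfrow, final = cex_final))
```

Because the second call runs after the layout has been set, the result does not depend on how par() orders its arguments internally.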

Steve Ellison




Re: [R] Use case for HDF5 dataspace interface

2017-08-03 Thread Koustav Pal
1. This relates to the package *rhdf5* and its implementation of the HDF5
dataspace interface. I am asking for an example of how other people who use
this package make use of the HDF5 data space interface exposed by the
library.

Longer answer:

As per my understanding, the dataspace interface exposes data locations
within a dataspace, but even while retrieving data from an hdf5 file using
methods implemented within rhdf5, the methods need to convert the requested
data retrieval call to dataspace locations. So what is the usefulness in
using the dataspace interface, and can I see this usefulness in a code
example using the rhdf5 dataspace interface.


2. Not related to comp bio.



---
Koustav Pal,
PhD student in Computational Biology,
Francesco Ferrari's group,
IFOM - The FIRC Institute of Molecular Oncology
Milan, Italy.

On 1 August 2017 at 16:30, Bert Gunter  wrote:

> 1. What does this have to do with R?
>
> 2. If it concerns computational biology, the Bioconductor Help list
> may be a better place to post.
>
>
> Cheers,
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Tue, Aug 1, 2017 at 2:28 AM, Koustav Pal 
> wrote:
> > This question is a clone of my stackoverflow question which never got
> > answered (o_O). Therefore I am posting it here. I would really like some
> > inputs if possible.
> >
> > I am currently building some applications which make use of HDF5 files.
> >
> > I have already taken a look at the hdfgroup website with regards to
> > dataspace
> > and I think I understand the concept. But I am very much unable to
> > understand its real-world use.
> >
> > Can someone please tell me what the dataspace interface is supposed to be
> > used for?
> >
> > Currently, I think if I load a matrix of size 1.5M x 1.5M I may be able to
> > store dataspace coordinates and then retrieve that piece of data much
> > faster. Is this correct?
> >
> > It would be great if you can provide some example use cases.
> >
> >
> > Link to original question:
> > https://stackoverflow.com/questions/44697599/hdf5-dataspace-interface-what-does-it-do-and-what-is-its-real-world-applicati
> >
> > 
> ---
> > Koustav Pal,
> > PhD student in Computational Biology,
> > Francesco Ferrari's group,
> > IFOM - The FIRC Institute of Molecular Oncology
> > Milan, Italy.
> >
and provide commented, minimal, self-contained, reproducible code.