Re: [R] Regexp pattern but fixed replacement?

2024-05-23 Thread Enrico Schumann
On Thu, 11 Apr 2024, Duncan Murdoch writes:

> I noticed this issue in stringr::str_replace, but it
> also affects sub() in base R.
>
> If the pattern in a call to one of these needs to be a
> regular expression, then backslashes in the replacement
> text are treated specially.
>
> For example,
>
>   gsub("a|b", "\\", "abcdef")
>
> gives "cdef", not "\\\\cdef" as I wanted.  To get the
> latter, I need to escape the replacement backslashes,
> e.g.
>
>   gsub("a|b", "\\\\", "abcdef")
>
> which gives "\\\\cdef".
>
> I have two questions:
>
> 1.  Is there a variant on sub or str_replace which
> allows the pattern to be declared as a regular
> expression, but the replacement to be declared as
> fixed?

I realize that this reply is late, but you can use raw
strings for the replacement:

   gsub("a|b", r"(\\)", "abcdef")
   ## [1] "\\\\cdef"

which might be easier to read, sometimes.
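
If the replacement text is only known at run time, one option is to
escape it before passing it on.  A minimal sketch, assuming a small
helper (the name escape_replacement is made up here, it is not part of
base R) that doubles every backslash, so that sub()/gsub() insert the
text verbatim:

```r
## Hypothetical helper (not part of base R): double every backslash,
## so that the regex engine treats the replacement as literal text.
escape_replacement <- function(s)
    gsub("\\", "\\\\", s, fixed = TRUE)

repl <- "\\"    # a single literal backslash
res <- gsub("a|b", escape_replacement(repl), "abcdef")
cat(res, "\n")  # shows the actual characters, without R's escaping
```

Backreferences such as "\\1" all start with a backslash, so escaping
that one character should be enough for the replacement side.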

[...]

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] refresh.console() function?

2024-03-13 Thread Enrico Schumann
On Wed, 13 Mar 2024, Christofer Bogaso writes:

> Hi,
>
> I run a lengthy for loop and I want to display loop status for each
> step in my loop.
>
> I previously heard of a R function namely refresh.console() which
> would print the status within the loop as it progresses.
>
> However I see this
>
>> help.search("refresh.console")
>
> No vignettes or demos or help files found with alias or concept or
> title matching ‘refresh.console’ using regular expression matching.
>
> Could you please help me find the correct function name?
>

?flush.console

... but you'll need to print/message/cat/... explicitly in the
loop, or the output won't be shown.  [Also, options(warn=1) might
be useful.]
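
A minimal sketch of such a loop; Sys.sleep() is only a stand-in for
the real work, and the variable names are made up for illustration:

```r
n_steps <- 5
done <- 0
for (i in seq_len(n_steps)) {
    Sys.sleep(0.1)                       # stand-in for the real work
    cat("step", i, "of", n_steps, "\n")  # print explicitly ...
    flush.console()                      # ... and push it to the console
    done <- done + 1
}
```

On consoles that buffer output (e.g. the Windows GUI), the status
lines should then appear while the loop is still running.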


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Basic astronomy package recommendation wanted.

2024-01-30 Thread Enrico Schumann
On Tue, 30 Jan 2024, Richard O'Keefe writes:

> Given
>  - UTC timestamp
>  - a location (latitude,longitude,elevation)
> I want to know
>  - the sun angles
>  - the moon angles
>  - the phase of the moon.
> I looked on CRAN for astronomy, but didn't notice anything that seems
> to offer what I want.  I could try coding these functions myself, but
> "if you didn't write it you didn't wrong it".
>

A quick search showed several candidate packages:

  https://cran.r-project.org/package=suntools
  https://cran.r-project.org/package=suncalc

(but I don't use any of those packages)

Perhaps also worth asking at:
  https://stat.ethz.ch/mailman/listinfo/R-SIG-Geo/


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] replace character by numeric value

2023-09-28 Thread Enrico Schumann
On Wed, 27 Sep 2023, arnaud gaboury writes:

> I have two data.frames:
>
> mydf1 <- structure(list(symbol = "ETHUSDT", cummulative_quote_qty =
> 1999.9122, side = "BUY", time = structure(1695656875.805, tzone = "", class
> = c("POSIXct", "POSIXt"))), row.names = c(NA, -1L), class = c("data.table",
> "data.frame"))
>
> mydf2 <- structure(list(symbol = c("ETHUSDT", "ETHUSDT", "ETHUSDT"),
> cummulative_quote_qty = c(1999.119408,
> 0, 2999.890985), side = c("SELL", "BUY", "BUY"), time =
> structure(c(1695712848.487,
> 1695744226.993, 1695744509.082), class = c("POSIXct", "POSIXt"
> ), tzone = "")), row.names = c(NA, -3L), class = c("data.table",
> "data.frame"))
>
> I use this line to replace 'BUY' by numeric 1 and 'SELL' by numeric -1 in
> mydf1 and mydf2:
> mynewdf <- mydf |> dplyr::mutate(side = ifelse(side == 'BUY', 1,
> ifelse(side == 'SELL', -1, side)))
>
> This does the job but I am left with an issue: 1 and -1 are characters for
> mynewdf2 when it is numeric for mynewdf1. The result I am expecting is
> getting numeric values.
> I can't solve this issue (using as.numeric(1) doesn't work) and don't
> understand why I am left with num for mynewdf1 and characters for mynewdf2.
>
>> mynewdf1 <- mydf1 |> dplyr::mutate(side = ifelse(side == 'BUY', 1,
> ifelse(side == 'SELL', -1, side)))
>> str(mynewdf1)
> Classes ‘data.table’ and 'data.frame': 1 obs. of  4 variables:
>  $ symbol   : chr "ETHUSDT"
>  $ cummulative_quote_qty: num 2000
>  $ side : num 1  <<<--
>  $ time : POSIXct, format: "2023-09-25 17:47:55"
>  - attr(*, ".internal.selfref")=
>
>> mynewdf2 <- mydf2 |> dplyr::mutate(side = ifelse(side == 'BUY', 1,
> ifelse(side == 'SELL', -1, side)))
>>  str(mynewdf2)
> Classes ‘data.table’ and 'data.frame': 3 obs. of  4 variables:
>  $ symbol   : chr  "ETHUSDT" "ETHUSDT" "ETHUSDT"
>  $ cummulative_quote_qty: num  1999 0 3000
>  $ side : chr  "-1" "1" "1"   <<<--
>  $ time : POSIXct, format: "2023-09-26 09:20:48"
> "2023-09-26 18:03:46" "2023-09-26 18:08:29"
>  - attr(*, ".internal.selfref")=
>
> Thank you for help
>

I'd use something like this:

map <- c(BUY = 1, SELL = -1)
mydf1$side <- map[mydf1$side]
str(mydf1)
## Classes ‘data.table’ and 'data.frame':   1 obs. of  4 variables:
##  $ symbol   : chr "ETHUSDT"
##  $ cummulative_quote_qty: num 2000
##  $ side : num 1

mydf2$side <- map[mydf2$side]
str(mydf2)
## Classes ‘data.table’ and 'data.frame':   3 obs. of  4 variables:
##  $ symbol   : chr  "ETHUSDT" "ETHUSDT" "ETHUSDT"
##  $ cummulative_quote_qty: num  1999 0 3000
##  $ side : num  -1 1 1
##  $ time : POSIXct, format: "2023-09-26 09:20:48" ...
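
As to why the nested ifelse() gave characters: the innermost fallback
is 'side' itself, a character vector, and as soon as one element takes
that branch, the whole result is coerced to character.  A small sketch
with a made-up vector (not the original data.tables):

```r
side <- c("SELL", "BUY", "BUY")

## fallback 'side' is character, so the result becomes character
chr <- ifelse(side == "BUY", 1, ifelse(side == "SELL", -1, side))

## a numeric fallback keeps the result numeric
num1 <- ifelse(side == "BUY", 1, ifelse(side == "SELL", -1, NA_real_))

## lookup by name; unname() drops the names "SELL", "BUY", ...
map <- c(BUY = 1, SELL = -1)
num2 <- unname(map[side])
```

With the lookup, any value of 'side' that is not in 'map' ends up as NA.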



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Noisy objective functions

2023-08-14 Thread Enrico Schumann
On Sun, 13 Aug 2023, Hans W writes:

> While working on 'random walk' applications, I got interested in
> optimizing noisy objective functions. As an (artificial) example, the
> following is the Rosenbrock function, where Gaussian noise of standard
> deviation `sd = 0.01` is added to the function value.
>
> fn <- function(x)
>   (1+rnorm(1, sd=0.01)) * adagio::fnRosenbrock(x)
>
> To smooth out the noise, define another function `fnk(x, k = 1)` that
> calls `fn` k times and returns the mean value of those k function
> applications.
>
> fnk <- function(x, k = 1) { # fnk(x) same as fn(x)
>     rv <- 0.0
>     for (i in 1:k) rv <- rv + fn(x)
>     return(rv/k)   # mean of the k evaluations
> }
>
> When we apply several optimization solvers to these noisy and
> smoothed functions, we get, for instance, the following results:
> (Starting point is always `rep(0.1, 5)`, maximal number of iterations 5000,
>  relative tolerance 1e-12, and the optimization is successful if the
> function value at the minimum is below 1e-06.)
>
>   k    nmk   anms  neldermead  ucminf  optim_BFGS
>  -------------------------------------------------
>   1   0.21   0.32        0.13    0.00        0.00
>   3   0.52   0.63        0.50    0.00        0.00
>  10   0.81   0.91        0.87    0.00        0.00
>
> Solvers: nmk = dfoptim::nmk, anms = pracma::anms [both Nelder-Mead codes]
>  neldermead = nloptr::neldermead,
>  ucminf = ucminf::ucminf, optim_BFGS = optim with method "BFGS"
>
> Read the table as follows: `nmk` will be successful in 21% of the
> trials, while for example `optim` will never come close to the true
> minimum.
>
> I think it is reasonable to assume that gradient-based methods do
> poorly with noisy objectives, though I did not expect to see them fail
> so clearly. On the other hand, Nelder-Mead implementations do quite
> well if there is not too much noise.
>
> In real-world applications, it will often not be possible to do the
> same measurement several times. That is, we will then have to live
> with `k = 1`. In my applications with long 'random walks', doing the
> calculations several times in a row will really need some time.
>
> QUESTION: What could be other approaches to minimize noisy functions?
>
> I looked through some "Stochastic Programming" tutorials and did not
> find them very helpful in this situation. Of course, I might have
> looked into these works too superficially.
>

Since Nelder--Mead worked /relatively/ well: in my
experience, its results can sometimes be improved by
restarting it, i.e. re-initializing the simplex.  See this
note here: http://enricoschumann.net/R/restartnm.htm ,
which incidentally also uses the Rosenbrock function.

Just for the record, I think Differential Evolution (DE)
can handle such problems well, though it would usually
need more iterations.  As computational "proof" ;-) , I
run DE 50 times for the k == 1 case, and each time store
whether the resulting objective-function value is below
1e-6.  I let DE take way more function evaluations
(population of 100 times 500 generations = 50,000 function
evaluations); but it gets a value below 1e-6 in all 50
cases.


library("NMOF")
ofv.below.threshold <- logical(50)
for (i in seq_along(ofv.below.threshold)) {
    sol <- DEopt(fnk,
                 algo = list(
                     nP = 100, nG = 500,
                     min = rep( 5, 5),
                     max = rep(10, 5)))

    ofv.below.threshold[i] <- sol$OFvalue < 1e-6
}
sum(ofv.below.threshold)/length(ofv.below.threshold)


(These 50 runs take less than half a minute on my
machine.) Note that I have on purpose initialized the
population in the range 5 to 10, i.e. way off the optimum.
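
The restart idea mentioned above can be sketched with base R's
optim().  This is only an illustration under stated assumptions, not
the code from the linked note: the Rosenbrock function is written out
by hand (so that 'adagio' is not needed), and restart_nm is a made-up
helper name.

```r
set.seed(1)  # only to make this sketch reproducible

## plain Rosenbrock function (assumption: standard chained form)
rosenbrock <- function(x) {
    n <- length(x)
    sum(100 * (x[-1] - x[-n]^2)^2 + (1 - x[-n])^2)
}

## noisy version, as in the original question
fn <- function(x)
    (1 + rnorm(1, sd = 0.01)) * rosenbrock(x)

## hypothetical helper: run Nelder-Mead, then restart it 'restarts'
## times, each time from the solution of the previous run (which
## makes optim() build a fresh simplex around that point)
restart_nm <- function(fn, x0, restarts = 10) {
    sol <- optim(x0, fn, method = "Nelder-Mead",
                 control = list(maxit = 5000))
    for (r in seq_len(restarts))
        sol <- optim(sol$par, fn, method = "Nelder-Mead",
                     control = list(maxit = 5000))
    sol
}

sol <- restart_nm(fn, rep(0.1, 5))
rosenbrock(sol$par)  # noise-free function value at the solution
```

Whether the restarts help will depend on the noise level; comparing
the success rate with and without restarts over many runs would be
the fair test.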




-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Merge with closest (not equal) time stamps

2023-08-08 Thread Enrico Schumann
On Mon, 07 Aug 2023, Naresh Gurbuxani writes:

> I have two dataframes, each with a column for timestamp.  I want to
> merge the two dataframes such that each row from first dataframe
> is matched with the row in the second dataframe with most recent but
> preceding timestamp. Here is an example.
>
> option.trades <- data.frame(timestamp = as.POSIXct(c("2023-08-07 10:23:22", 
> "2023-08-07 10:25:33", "2023-08-07 10:28:41")), option.price = c(2.5, 2.7, 
> 1.8))
>
> stock.trades <- data.frame(timestamp =
> as.POSIXct(c("2023-08-07 10:23:21", "2023-08-07 10:23:34",
> "2023-08-07 10:24:57", "2023-08-07 10:28:37",
> "2023-08-07 10:29:01")), stock.price =
> c(102.2, 102.9, 103.1, 101.8, 101.7))
>
> stock.trades <- stock.trades[order(stock.trades$timestamp),]
>
> library(plyr)
> mystock.prices <- ldply(option.trades$timestamp, function(tstamp) 
> tail(subset(stock.trades, timestamp <= tstamp), 1))
> names(mystock.prices)[1] <- "stock.timestamp"
> myres <- cbind(option.trades, mystock.prices)
>
> This method works. But for large dataframes, it is very slow.  Is there
> a way to speed up the merge?
>
> Thanks,
> Naresh
>

If the timestamps are sorted (or you can sort them),
function ?findInterval might be helpful:

i <- findInterval(option.trades$timestamp, stock.trades$timestamp)
cbind(option.trades, stock.trades[i, ])
##             timestamp option.price           timestamp stock.price
## 1 2023-08-07 10:23:22          2.5 2023-08-07 10:23:21       102.2
## 3 2023-08-07 10:25:33          2.7 2023-08-07 10:24:57       103.1
## 4 2023-08-07 10:28:41          1.8 2023-08-07 10:28:37       101.8
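
One caveat: findInterval() returns 0 for an option trade that has no
preceding stock trade, and indexing with 0 silently drops that row.
A sketch that maps such cases to NA instead; the first option trade
(10:20:00, price 2.4) is made up to trigger the case, and tz = "UTC"
is set only to make the example independent of the local time zone:

```r
option.trades <- data.frame(
    timestamp = as.POSIXct(c("2023-08-07 10:20:00",  # made-up early trade
                             "2023-08-07 10:23:22",
                             "2023-08-07 10:25:33",
                             "2023-08-07 10:28:41"), tz = "UTC"),
    option.price = c(2.4, 2.5, 2.7, 1.8))

stock.trades <- data.frame(
    timestamp = as.POSIXct(c("2023-08-07 10:23:21",
                             "2023-08-07 10:23:34",
                             "2023-08-07 10:24:57",
                             "2023-08-07 10:28:37",
                             "2023-08-07 10:29:01"), tz = "UTC"),
    stock.price = c(102.2, 102.9, 103.1, 101.8, 101.7))

i <- findInterval(option.trades$timestamp, stock.trades$timestamp)
i[i == 0L] <- NA   # no preceding stock trade: keep the row, fill with NA

res <- data.frame(option.trades,
                  stock.timestamp = stock.trades$timestamp[i],
                  stock.price     = stock.trades$stock.price[i])
res
```

Giving the stock columns their own names also avoids two columns
called 'timestamp' in the result.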



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Cryptic error from stargazer

2023-06-08 Thread Enrico Schumann
On Thu, 08 Jun 2023, Ashim Kapoor writes:

> Dear All,
>
> I had done an automatic upgrade of my Debian 10  system which had also
> upgraded R.
>
> I reinstalled the stargazer package and the error went away.
>
> Query : Do I need to reinstall all packages with each upgrade of R ?
>
> Best,
> Ashim

Your session info says "stargazer 5.2.2".  What version do
you have now?  CRAN has 5.2.3. It seems likely to me that

  i) with R 4.3.0, certain warnings in the package have
     become errors (the NEWS file for R 4.3.0,
     https://cloud.r-project.org/doc/manuals/r-release/NEWS.html ,
     has a prominent first item about using && and errors.)

and

 ii) the package maintainer fixed those warnings/errors in 5.2.3.
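
To illustrate the kind of code that is affected (a sketch with made-up
values, not the actual stargazer source): '&&' expects conditions of
length one, and with longer vectors R >= 4.3.0 signals an error of the
form shown in the original post.

```r
stats <- c("p25", "median", "p75")   # made-up values for illustration

## '&&' with a condition of length > 1: an error in R >= 4.3.0
## (R 4.2.x only gave a warning); tryCatch() keeps the sketch running
res <- tryCatch((stats != "p25") && (stats != "p75"),
                error = function(e) conditionMessage(e))
res

## spelling out the intended reduction works in all versions
none_is_quartile <- all(stats != "p25" & stats != "p75")
none_is_quartile
```

Whether all() or any() is the right reduction depends on what the
package code intends; the point is only that the choice has to be
made explicitly.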

kind regards
Enrico


> On Thu, Jun 8, 2023 at 11:11 AM Ashim Kapoor  wrote:
>>
>> Dear All,
>>
>> Here is my reproducible example:
>>
>> > library(stargazer)
>>
>> Please cite as:
>>
>>  Hlavac, Marek (2018). stargazer: Well-Formatted Regression and
>> Summary Statistics Tables.
>>  R package version 5.2.2. https://CRAN.R-project.org/package=stargazer
>>
>> > x1=1:1000 ; y = 2 * x1+ rnorm(1000)
>> > stargazer(lm(y~x1))
>> Error in (.format.s.statistics.list != "p25") &&
>> (.format.s.statistics.list !=  :
>>   'length = 7' in coercion to 'logical(1)'
>>
>> The error returned is cryptic and I am not able to google and find the 
>> solution.
>>
>> Here is my sessionInfo:
>>
>> > sessionInfo()
>> R version 4.3.0 (2023-04-21)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> Running under: Debian GNU/Linux 10 (buster)
>>
>> Matrix products: default
>> BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
>> LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so;  LAPACK version 
>> 3.8.0
>>
>> locale:
>>  [1] LC_CTYPE=en_IN        LC_NUMERIC=C          LC_TIME=en_IN
>>  [4] LC_COLLATE=en_IN      LC_MONETARY=en_IN     LC_MESSAGES=en_IN
>>  [7] LC_PAPER=en_IN        LC_NAME=C             LC_ADDRESS=C
>> [10] LC_TELEPHONE=C        LC_MEASUREMENT=en_IN  LC_IDENTIFICATION=C
>>
>> time zone: Asia/Kolkata
>> tzcode source: system (glibc)
>>
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>>
>> other attached packages:
>> [1] stargazer_5.2.2
>>
>> loaded via a namespace (and not attached):
>> [1] compiler_4.3.0
>>
>> Any help will be appreciated.
>>
>> Thank you,
>> Ashim
>

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] wininet deprecation

2023-02-21 Thread Enrico Schumann
On Tue, 21 Feb 2023, Selke, Gisbert W. writes:

> On Mon, 20 Feb 2023 15:58:33 +, Stadler Thomas 
>  wrote:
>
>> as the download method 'wininet' is deprecated,  I'm looking into 
>> alternative ways to install packages from within R.
>> Unfortunately, curl, libcurl and wget refuse to cooperate with Kerberos on 
>> our corporate setup.
>> When exactly will the 'wininet' method stop working? R release 4.3?
> Same problem here. It has been reported before to be a major nuisance if 
> you're working in an environment with a need for security and data protection.
> Unfortunately, it seems it has been decided to go for this regression of 
> functionality.
>
> In our case we managed to circumnavigate the problem by
> building a local CRAN mirror inside our firewall using
> the open source Nexus repository. Seems a bit overkill
> (given that wininet has done the trick nicely up to
> now), but at least it works.
>
> HTH.
>
> \Gisbert

There is also the miniCRAN package
(https://cran.r-project.org/package=miniCRAN).


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] gmp::bigq vs. MASS::fractions

2023-01-09 Thread Enrico Schumann
On Sat, 07 Jan 2023, Sigbert Klinke writes:

> Hi,
>
> has someone experience which routine should be used for
> creating fractional numbers? The two conversion
> routines deliver different results
>
>> x <- (0:7)/7
>
>> MASS::fractions(x)
>
> [1]   0 1/7 2/7 3/7 4/7 5/7 6/7   1
>
>> gmp::as.bigq(x)
>
> Big Rational ('bigq') object of length 8:
>
> [1] 0 2573485501354569/18014398509481984
> 2573485501354569/9007199254740992
>
> [4] 7720456504063707/18014398509481984
> 2573485501354569/4503599627370496
> 6433713753386423/9007199254740992
>
> [7] 7720456504063707/9007199254740992  1
>
> Following the example I would compute my fractional
> numbers with MASS::fractions and store them for further
> processing as Big Rational.
>
> Thanks Sigbert
>

'gmp' allows you to create the fractions directly:

gmp::as.bigq(n = 0:7, d = 7)
## Big Rational ('bigq') object of length 8:
## [1] 0   1/7 2/7 3/7 4/7 5/7 6/7 1  
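
The large fractions in the original post are not wrong, by the way:
(0:7)/7 is computed in floating point first, and as.bigq() then
converts each double exactly.  A base-R check, using the numerator and
denominator that gmp reported for 1/7 (the denominator is 2^54):

```r
x <- 1/7                  # the double nearest to one seventh
num <- 2573485501354569   # numerator reported by gmp::as.bigq(1/7)
den <- 2^54               # 18014398509481984, the reported denominator

sprintf("%.20f", x)       # the stored value, printed with more digits
num / den == x            # the fraction equals the double itself
```

So the exact rational should be constructed from integers, as above,
and not recovered from the rounded double.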



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Rate of Reading into R from net

2022-11-14 Thread Enrico Schumann
On Mon, 14 Nov 2022, Nick Wray writes:

> Hello I am trying read  nc rainfall  files directly from the UK centre for
> hydrology and ecology website -
> hourly:about 300Mb
> https://catalogue.ceh.ac.uk/datastore/eidchub/fc9423d6-3d54-467f-bb2b-fc7357a3941f/
> and daily about 90Mb
> https://catalogue.ceh.ac.uk/datastore/eidchub/2ab15bf0-ad08-415c-ba64-831168be7293/precip/
>
> I can download them "by hand" no problem using the ncdf4 package but
> there's lots and I wanted to streamline the process
>
> code I've used (which works for "by hand" downloading) is (for example)
>
> nc_data<-nc_open("
> https://catalogue.ceh.ac.uk/datastore/eidchub/2ab15bf0-ad08-415c-ba64-831168be7293/precip/chess-met_precip_gb_1km_daily_19610301-19610331.nc
> ")
>
> However, when I run this I don't get an error message but R just sits there
> with the little red circle (at least 30 minutes for the 90Mb files)
>
> What I'm wondering is two things - a)is the data in fact downloading but
> it's just taking ages and I need to let it run and go and have a coffee
>b)is there (I can't find
> anything) within R which allows me to monitor the progress of the download,
> if in fact it is taking place?
> Thanks Nick Wray
>

Have you tried 'download.file'?  It should provide a
progress bar if the file size is known.  I'd first
download the file, and then read it.
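
A sketch of that two-step approach; fetch_file is a made-up helper
name, and the commented usage assumes network access and the 'ncdf4'
package:

```r
## hypothetical helper: download 'url' into 'destdir' (unless the
## file is already there) and return the local path
fetch_file <- function(url, destdir = tempdir()) {
    dest <- file.path(destdir, basename(url))
    if (!file.exists(dest))
        download.file(url, dest, mode = "wb")  # "wb": binary file
    dest
}

## usage (not run here):
## options(timeout = 600)  # the default of 60 seconds may be too short
## f <- fetch_file(paste0(
##     "https://catalogue.ceh.ac.uk/datastore/eidchub/",
##     "2ab15bf0-ad08-415c-ba64-831168be7293/precip/",
##     "chess-met_precip_gb_1km_daily_19610301-19610331.nc"))
## nc_data <- ncdf4::nc_open(f)
```

Keeping the downloaded files also means that a failed or interrupted
run does not have to fetch everything again.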


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Robust standard error

2022-10-02 Thread Enrico Schumann
On Sun, 02 Oct 2022, Bert Gunter writes:

> On Sun, Oct 2, 2022 at 6:42 AM Simone Mascia 
> wrote:
>
>> Is there a way to estimate Robust standard errors when using a nls()
>> function? I'm trying to fit some data to a complicated model and everything
>> works fine with nls() but I also wanted to obtain a robust estimate of my
>> errors.
>>
>> I tried "coeftest(m, vcov=sandwich)" and it seems to work, but so does
>> "coeftest(m, vcov = NeweyWest(m, lag = 4))" or "coeftest(m, vcov =
>> kernHAC(m, kernel = "Bartlett", bw = 5, prewhite = FALSE, adjust =
>> FALSE))". They return different error estimates so I wanted you to help me
>> understand what I should do, if I'm doing something wrong and other stuff.
>>
>> Thank you
>>
>
> You may get a helpful response here, but generally speaking, this list is
> about R **programming**, and statistical issues/tutorials are off topic.
> You might try
> https://stackoverflow.com/questions/tagged/statistics
> if you don't get adequate help here.
>
> -- Bert
>

There is also
https://stat.ethz.ch/mailman/listinfo/R-sig-Robust .

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Fwd: Reading very large text files into R

2022-09-29 Thread Enrico Schumann
On Thu, 29 Sep 2022, Nick Wray writes:

> -- Forwarded message -
> From: Nick Wray 
> Date: Thu, 29 Sept 2022 at 15:32
> Subject: Re: [R] Reading very large text files into R
> To: Ben Tupper 
>
>
> Hi Ben
> Beneath is an example of the text (also in an attachment) and it's the "B",
> of which there are quite a few scattered throughout the text doc which
> causes the reading in error message (btw I don't need the "RAIN" column or
> the 1's after it or the last four elements).   I have also attached the
> snippet as text file
>
> 1980-01-01 10:00, 225620, RAIN, 1, 1, WAHRAIN, 5091, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 226918, RAIN, 1, 1, WAHRAIN, 5124, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 228562, RAIN, 1, 1, WAHRAIN, 491, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 231581, RAIN, 1, 1, WAHRAIN, 5213, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 232671, RAIN, 1, 1, WAHRAIN, 487, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 232913, RAIN, 1, 1, WAHRAIN, 5243, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 234362, RAIN, 1, 1, WAHRAIN, 5265, 1001, 0, , 10009, 0, ,
> , B
> 1980-01-01 10:00, 234682, RAIN, 1, 1, WAHRAIN, 5271, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 235389, RAIN, 1, 1, WAHRAIN, 5279, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 236466, RAIN, 1, 1, WAHRAIN, 497, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 243350, RAIN, 1, 1, SREW, 484, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 243350, RAIN, 1, 1, WAHRAIN, 484, 1001, 0, 0, 9, 9, , ,
>
> Thanks Nick
>
> On Thu, 29 Sept 2022 at 15:12, Ben Tupper  wrote:
>
>> Hi Nick,
>>
>> It's hard to know without seeing at least a snippet of the data.
>> Could you do the following and paste the result into a plain text
>> email?  If you don't set your email client to plain text (from rich
>> text or html) then we are apt to see a jumble of output on our email
>> clients.
>>
>>
>> ## start
>> x <- readLines(filename, n = 20)
>> cat(x, sep = "\n")
>> ## end
>>
>> Cheers,
>> Ben
>>
>>
>> On Thu, Sep 29, 2022 at 9:54 AM Nick Wray  wrote:
>> >
>> > Hello   I may be offending the R purists with this question but it is
>> > linked to R, as will become clear.  I have very large data sets from the
>> UK
>> > Met Office in notepad form.  Unfortunately,  I can’t read them directly
>> > into R because, for some reason, although most lines in the text doc
>> > consist of 15 elements, every so often there is a sixteenth one and R
>> > doesn’t like this and gives me an error message because it has assumed
>> that
>> > every line has 15 elements and doesn’t like finding one with more.  I
>> have
>> > tried playing around with the text document, inserting an extra element
>> > into the top line etc, but to no avail.
>> >
>> > Also unfortunately you need access permission from the Met Office to get
>> > the files in question so this link probably won’t work:
>> >
>> > https://catalogue.ceda.ac.uk/uuid/bbd6916225e7475514e17fdbf11141c1
>> >
>> > So what I have done is simply to copy and paste the text docs into excel
>> > csv and then read them in, which is time-consuming but works.  However
>> the
>> > later datasets are over the excel limit of 1048576 lines.  I can paste in
>> > the first 1048576 lines but then trying to isolate the remainder of the
>> > text doc to paste it into a second csv doc is proving v difficult – the
>> > only way I have found is to scroll down by hand and that’s taking ages.
>> I
>> > cannot find another way of editing the notepad text doc to get rid of the
>> > part which I have already copied and pasted.
>> >
>> > Can anyone help with a)ideally being able to simply read the text tables
>> > into R  or b)suggest a way of editing out the bits of the text file I
>> have
>> > already pasted in without laborious scrolling?
>> >
>> > Thanks Nick Wray
>> >

[...]

>>
>> --
>> Ben Tupper (he/him)
>> Bigelow Laboratory for Ocean Science
>> East Boothbay, Maine
>> http://www.bigelow.org/
>> https://eco.bigelow.org
>>
>

Maybe I have missed it, but could you please show how
you tried to read the table?

When I use your file with 

read.table("sample text.txt", header = FALSE, sep = ",")

I get

##                  V1     V2    V3 V4 V5       V6   V7   V8 V9 V10   V11 V12 V13 V14 V15
## 1  1980-01-01 10:00 225620  RAIN  1  1  WAHRAIN 5091 1001  0  NA     9   0  NA  NA
## 2  1980-01-01 10:00 226918  RAIN  1  1  WAHRAIN 5124 1001  0  NA     9   0  NA  NA
## ...
## 7  1980-01-01 10:00 234362  RAIN  1  1  WAHRAIN 5265 1001  0  NA 10009   0  NA  NA   B
## 8  1980-01-01 10:00 234682  RAIN  1  1  WAHRAIN 5271 1001  0  NA     9   0  NA  NA
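
If the full file still fails, it may help to check the number of
fields per line first.  A sketch, using three lines copied from your
sample and written to a temporary file; the column names V1, V2, ...
are just the read.table defaults spelled out:

```r
lines <- c(
    "1980-01-01 10:00, 232913, RAIN, 1, 1, WAHRAIN, 5243, 1001, 0, , 9, 0, , ,",
    "1980-01-01 10:00, 234362, RAIN, 1, 1, WAHRAIN, 5265, 1001, 0, , 10009, 0, , , B",
    "1980-01-01 10:00, 234682, RAIN, 1, 1, WAHRAIN, 5271, 1001, 0, , 9, 0, , ,")
f <- tempfile(fileext = ".txt")
writeLines(lines, f)

## how many fields does each line have?
n <- count.fields(f, sep = ",")
table(n)

## fix the number of columns up front and pad short lines
dat <- read.table(f, header = FALSE, sep = ",",
                  strip.white = TRUE, fill = TRUE,
                  col.names = paste0("V", seq_len(max(n))))
str(dat)
```

read.table() looks only at the first few lines to decide on the number
of columns, so passing col.names (with fill = TRUE) is one way to
handle lines that turn out longer further down in the file.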



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Question about Line Ending Choice

2022-09-29 Thread Enrico Schumann
On Tue, 27 Sep 2022, Stephen H. Dawson, DSL via R-help writes:

> Hi All,
>
>
> I am writing with a question about choosing the line
> ending aspect of a file, please.
>
> I use write.csv and write.table to export work to CSV
> files and TXT files. I am planning now on how to share
> my work with the Windows crowd beyond only sharing with
> the Linux crowd. I use my text editor to flip the line
> ending option from Linux to Windows after
> exporting. This is inefficient for me to accomplish if
> I ramp up production as I expect will occur.
>
> Staying with the character encoding of UTF-8 seems fine
> for now from what I understand I need to deliver to my
> customers.
>
> What seems more efficient to me is to learn how to use
> R to define the line ending aspect of the exported
> file. I have not found if this is an option within R.
>
> QUESTION
> Is it possible within R to define the line ending aspect of file output?
>
>
> Kindest Regards,

Just a remark: there is a "standard" for CSV,
https://datatracker.ietf.org/doc/html/rfc4180.
It always requires CRLF as the line ending.
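
As for the actual question: write.table() and write.csv() have an
argument 'eol'.  A minimal sketch on a Unix-alike, with a made-up
data frame and a temporary file:

```r
f <- tempfile(fileext = ".csv")
df <- data.frame(a = 1:2, b = c("x", "y"))   # made-up data

## on a Unix-alike, eol = "\r\n" gives Windows-style line endings
write.csv(df, f, row.names = FALSE, eol = "\r\n")

## inspect the raw bytes: 0d 0a (CR LF) should end every line
bytes <- readBin(f, what = "raw", n = file.size(f))
bytes
```

On Windows, a file opened in text mode already gets CRLF with the
default eol, so there the argument should be left alone (or the file
written through a binary-mode connection).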

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Unicode chars

2022-08-25 Thread Enrico Schumann
On Thu, 25 Aug 2022, dulcalma dulcalma writes:

> Dear All
>
>
> I was trying the supplementary file GS_main.R from
> https://esajournals.onlinelibrary.wiley.com/doi/abs/10.1002/ecy.3475
>
> I have tried to prevent latex compilation from failing using Sweave 
> after trying all the online fixes I could find including using Rterm 
> I could fix it if it was in the input but not in the output 
> I am using R version 4.2 on windows 11 with 64 GB memory
>
>
> Sweave code
>
> \begin{small}
> <<>>=
> library(emdbook) # version 1.3.12
> library(bbmle) # version 1.0.23.1
> library(pbmcapply) # version 1.5.0 
> library(tidyverse) # version 1.3.0
> library(ggpubr) # version 0.4.0
> @ %%
>
>
> <<>>=
> summaryTable <-
> tibble(model = m.names,
>        dim = m.dims[model],                 
>        score = m.loo[model],                
>        delScore = score - min(score),       
>        se_ose = se_ose[model],              
>        se_mod = se_mod[model]) %>% arrange(dim) %>%  mutate(index = 
> 1:length(dim))
> summaryTable
> @ %%
>
>
> Output
> \begin{Schunk}
> \begin{Sinput}
>   summaryTable <-
>   tibble(model = m.names,
>          dim = m.dims[model],                 
>          score = m.loo[model],                
>          delScore = score - min(score),       
>          se_ose = se_ose[model],              
>          se_mod = se_mod[model]) %>% arrange(dim) %>%  mutate(index = 
> 1:length(dim))
>   summaryTable
> \end{Sinput}
> \begin{Soutput}
> # A tibble: 10 × 7
>    model   dim score delScore se_ose se_mod index
>               
>  1 zero      2  908.    5.84    40.1   4.14     1
>  2 d         3  904.    1.71    40.6   2.52     2
>  3 q         3  907.    4.92    40.2   3.80     3
>  4 qd        4  902.    0       40.7   0        4
>  5 qdi       5  903.    0.632   40.5   1.60     5
>  6 x         6  908.    5.58    40.2   5.53     6
>  7 xq        7  907.    4.81    40.3   5.36     7
>  8 xd        7  905.    2.96    40.5   5.04     8
>  9 xqd       8  903.    0.908   40.5   4.52     9
> 10 xqdi      9  904.    1.89    40.4   4.70    10
> \end{Soutput}
> \end{Schunk}
>
>
> The problem is the output from tibble 
> # A tibble: 10 × 7
>
>
> the \times character is Unicode U+00D7 or hex \xd7 and pdflatex lualatex 
> etc fail where this occurs
> Is there a way of adding "sanitizing" code in the output before 
> compiling 
> Or do I have to change it manually before compiling
>
>
> I do not want to switch to knitr. 
>
>
> Regards
>
>
> Duncan Mackay
>

You could try to automatically clean the code, by using
?iconv, say. But the results may not be satisfactory,
depending on what characters were used.
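
A sketch of such a clean-up, applied to lines as they might be read
from the generated TeX file; the two lines below are made up to mimic
the tibble header.  The multiplication sign is replaced explicitly
first, since iconv() with 'sub' substitutes every byte of a
multi-byte character:

```r
out <- c("# A tibble: 10 \u00d7 7",
         "   model   dim score")

## targeted replacement of U+00D7 by a plain "x"
clean <- gsub("\u00d7", "x", out, fixed = TRUE)

## anything else that is not ASCII becomes "?" (one per byte)
clean <- iconv(clean, from = "UTF-8", to = "ASCII", sub = "?")
clean
```

For a real file, the same could be done with readLines() and
writeLines() on the .tex file before compiling.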

Sweave itself does not compile the LaTeX code. If you
run (in R) 

Sweave(, encoding = "utf8")

then it will produce the TeX file, which you can then
compile via LuaLaTeX or XeLaTeX [see
e.g. https://www.ctan.org/pkg/lualatex-doc].

For instance, on the command line, just say

lualatex 

or another programme (such as latexmk) that your TeX
distribution provides.


If this is a vignette, you can specify a Makefile, see
https://cran.r-project.org/doc/manuals/R-exts.html#Writing-package-vignettes




-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net


Re: [R] What are the pros and cons of the various R functions and methods for conducting least median of squares regression analysis?

2022-04-17 Thread Enrico Schumann
On Sat, 16 Apr 2022, Kelly Thompson writes:

> What are the pros and cons of the various R functions and methods for
> conducting least median of squares regression analysis?
>
> I know about these:
>
> lqs, with method = "lms" and lmsreg, which as I understand it are equivalent
>
> Mentions:
> https://www.rdocumentation.org/packages/MASS/versions/7.3-56/topics/lqs
>
> https://stat.ethz.ch/pipermail/r-help/2006-October/115681.html
>
> https://stat.ethz.ch/pipermail/r-help/2007-March/126564.html
> -
>
> ltsReg
>
> https://www.rdocumentation.org/packages/robustbase/versions/0.1-2/topics/ltsReg
> -
> nl.lmsNM
> https://rdrr.io/cran/nlr/man/nl.lmsNM.html
>

There is also a special mailing list for robust statistics with R:
https://stat.ethz.ch/mailman/listinfo/R-sig-Robust

As others have already suggested: you'd probably get
better answers if you ask more specific questions,
e.g. why/for what application you want to use LMS.


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] R vs Numpy

2021-10-28 Thread Enrico Schumann
On Thu, 28 Oct 2021, Catherine Walt writes:

> Hello members,
>
> I am familiar with python's Numpy.
> Now I am looking into R language.
> What is the main difference between these two languages? including advantages 
> or disadvantages.
>
> Thanks.
>

Perhaps also of interest:
https://github.com/matloff/R-vs.-Python-for-Data-Science


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



[R] [R-pkgs] NMOF 2.5-0 (Numerical Methods and Optimization in Finance)

2021-10-21 Thread Enrico Schumann
Dear all,

version 2.5-0 of package NMOF is on CRAN now.
(Incidentally, today is 20 October 2021, which marks the
10th anniversary of NMOF on CRAN.)

NMOF stands for 'Numerical Methods and Optimization in
Finance', and it accompanies the book with the same name,
written by Manfred Gilli, Dietmar Maringer and Enrico
Schumann.[1]

Since the last announcement on this list, functionality has
been added to the package, e.g. for computing minimum-CVaR
and tracking portfolios, or downloading IPO data.  See the
NEWS file [2] for all changes.  The documentation has also
been expanded.

Comments/corrections/remarks/suggestions are -- as always --
very welcome; please send them to the maintainer (me)
directly.

Kind regards
  Enrico

[1] http://enricoschumann.net/NMOF.htm
[2] https://gitlab.com/NMOF/NMOF/-/blob/master/NEWS


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Do not show a "message d'avis" that qbeta reports

2021-10-20 Thread Enrico Schumann
On Wed, 20 Oct 2021, Marc Girondot via R-help writes:

> Dear R-helpers
>
> Do you know how to not show a "message d'avis" that
> qbeta reports. suppressMessages and capture.output are
> not working.
>
> (PS I Know that qbeta with shape2 being 1e-7 is
> "strange"... this is obtained during an optimization
> process. Just I don't want see the messages).
>
> Thanks
>
> Marc Girondot
>
> q <- qbeta(p=c(0.025, 0.975), shape1 = 3.3108797, shape2 = 0.001)
> Message d'avis :
> Dans qbeta(p = c(0.025, 0.975), shape1 = 3.3108797, shape2 = 1e-07) :
>   qbeta(a, *) =: x0 with |pbeta(x0,*) - alpha| = 0.024997 is not accurate
>
> suppressMessages(q <- qbeta(p=c(0.025, 0.975), shape1 =
> 3.3108797, shape2 = 0.001))
> Message d'avis :
> Dans qbeta(p = c(0.025, 0.975), shape1 = 3.3108797, shape2 = 1e-07) :
>   qbeta(a, *) =: x0 with |pbeta(x0,*) - alpha| = 0.024997 is not accurate
>
> capture.output(q <- qbeta(p=c(0.025, 0.975), shape1 =
> 3.3108797, shape2 = 0.001), type = "message")
> character(0)
> Message d'avis :
> Dans qbeta(p = c(0.025, 0.975), shape1 = 3.3108797, shape2 = 1e-07) :
>   qbeta(a, *) =: x0 with |pbeta(x0,*) - alpha| = 0.024997 is not accurate
>
> capture.output(q <- qbeta(p=c(0.025, 0.975), shape1 =
> 3.3108797, shape2 = 0.001), type = "output")
> character(0)
> Message d'avis :
> Dans qbeta(p = c(0.025, 0.975), shape1 = 3.3108797, shape2 = 1e-07) :
>   qbeta(a, *) =: x0 with |pbeta(x0,*) - alpha| = 0.024997 is not accurate
>

Try 'suppressWarnings'.
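
A small sketch of the difference, with a made-up function
'noisy' as a stand-in for the qbeta call: the "Message
d'avis" is a warning (it is the French translation of
"Warning message"), not a message, so suppressMessages does
not catch it.

```r
## stand-in for a computation that signals a warning
noisy <- function(x) {
    warning("result is not accurate")
    x
}

q <- suppressWarnings(noisy(0.5))   ## no warning shown
q
## [1] 0.5

## to silence only specific warnings, a calling handler
## can inspect the text of the warning
q2 <- withCallingHandlers(
    noisy(0.5),
    warning = function(w) {
        if (grepl("not accurate", conditionMessage(w)))
            invokeRestart("muffleWarning")
    })
```

The second variant lets all other warnings through, which
may be safer during an optimization run.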


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Extracting Comments from Functions/Packages

2021-10-07 Thread Enrico Schumann
On Thu, 07 Oct 2021, Leonard Mada via R-help writes:

> Dear R Users,
>
>
> I wrote a minimal parser to extract strings and
> comments from the function definitions.
>
>
> The string extraction works fine. But there are no comments:
>
> a.) Are the comments stripped from the compiled packages?
>
> b.) Alternatively: Is the deparse() not suited for this task?
>
> b.2.) Is deparse() parsing the function/expression itself?
>
> [see code for extract.str.fun() function below]
>
>
> ### All strings in "base"
> extract.str.pkg("base")
> # type = 2 for Comments:
> extract.str.pkg("base", type=2)
> extract.str.pkg("sp", type=2)
> extract.str.pkg("NetLogoR", type=2)
>
> The code for the 2 functions (extract.str.pkg &
> extract.str.fun) and the code for the parse.simple()
> parser are below.
>
>
> Sincerely,
>
>
> Leonard
>
> ===
>
> The latest code is on GitHub:
>
> https://github.com/discoleo/R/blob/master/Stat/Tools.Formulas.R
>
>
> ### Code to process functions in packages:
> extract.str.fun = function(fn, pkg, type=1, strip=TRUE) {
>     fn = as.symbol(fn); pkg = as.symbol(pkg);
>     fn = list(substitute(pkg ::: fn));
>     # deparse
>     s = paste0(do.call(deparse, fn), collapse="");
>     npos = parse.simple(s);
>     extract.str(s, npos[[type]], strip=strip)
> }
> extract.str.pkg = function(pkg, type=1, exclude.z = TRUE, strip=TRUE) {
>     nms = ls(getNamespace(pkg));
>     l = lapply(nms, function(fn) extract.str.fun(fn,
> pkg, type=type, strip=strip));
>     if(exclude.z) {
>         hasStr = sapply(l, function(s) length(s) >= 1);
>         nms = nms[hasStr];
>         l = l[hasStr];
>     }
>     names(l) = nms;
>     return(l);
> }
>
> ### minimal Parser:
> # - proof of concept;
> # - may be useful to process non-conformant R "code", e.g.:
> #   "{\"abc\" + \"bcd\"} {FUN}"; (still TODO)
> # Warning:
> # - not thoroughly checked &
> #   may be a little buggy!
>
> parse.simple = function(x, eol="\n") {
>     len = nchar(x);
>     n.comm = list(integer(0), integer(0));
>     n.str  = list(integer(0), integer(0));
>     is.hex = function(ch) {
>         # Note: only for 1 character!
>         return((ch >= "0" && ch <= "9") ||
>             (ch >= "A" && ch <= "F") ||
>             (ch >= "a" && ch <= "f"));
>     }
>     npos = 1;
>     while(npos <= len) {
>         s = substr(x, npos, npos);
>         # State: COMMENT
>         if(s == "#") {
>             n.comm[[1]] = c(n.comm[[1]], npos);
>             while(npos < len) {
>                 npos = npos + 1;
>                 if(substr(x, npos, npos) == eol) break;
>             }
>             n.comm[[2]] = c(n.comm[[2]], npos);
>             npos = npos + 1; next;
>         }
>         # State: STRING
>         if(s == "\"" || s == "'") {
>             n.str[[1]] = c(n.str[[1]], npos);
>             while(npos < len) {
>                 npos = npos + 1;
>                 se = substr(x, npos, npos);
>                 if(se == "\\") {
>                     npos = npos + 1;
>                     # simple escape vs Unicode:
>                     if(substr(x, npos, npos) != "u") next;
>                     len.end = min(len, npos + 4);
>                     npos = npos + 1;
>                     isAllHex = TRUE;
>                     while(npos <= len.end) {
>                         se = substr(x, npos, npos);
>                         if( ! is.hex(se)) { isAllHex = FALSE; break; }
>                         npos = npos + 1;
>                     }
>                     if(isAllHex) next;
>                 }
>                 if(se == s) break;
>             }
>             n.str[[2]] = c(n.str[[2]], npos);
>             npos = npos + 1; next;
>         }
>         npos = npos + 1;
>     }
>     return(list(str = n.str, comm = n.comm));
> }
>
>
> extract.str = function(s, npos, strip=FALSE) {
>     if(length(npos[[1]]) == 0) return(character(0));
>     strip.FUN = if(strip) {
>             function(id) {
>                 if(npos[[1]][[id]] + 1 < npos[[2]][[id]]) {
>                     nStart = npos[[1]][[id]] + 1;
>                     nEnd = npos[[2]][[id]] - 1; # TODO: Error with malformed string
>                     return(substr(s, nStart, nEnd));
>                 } else {
>                     return("");
>                 }
>       

Re: [R] read.csv() error

2021-09-02 Thread Enrico Schumann
On Thu, 02 Sep 2021, Rich Shepard writes:

> The first three commands in the script are:
> stage <- read.csv('../data/water/gauge-ht.dat', header
> = TRUE, sep = ',', stringsAsFactors = FALSE)
> stage$sampdate <- as.Date(stage$sampdate)
> stage$ht <- as.numeric(stage$ht, length = 6)
>
> Running the script produces this error:
>> source('stage.R')
> Error in `$<-.data.frame`(`*tmp*`, ht, value = numeric(0)) :
>   replacement has 0 rows, data has 486336
>
> Sample lines from the data file:
> sampdate,samptime,elev
> 2007-10-01,01:00,2.80
> 2007-10-01,01:15,2.71
> 2007-10-01,01:30,2.63
> 2007-10-01,01:45,2.53
> 2007-10-01,02:00,2.45
> 2007-10-01,02:15,2.36
> 2007-10-01,02:30,2.27
> 2007-10-01,02:45,2.17
> 2007-10-01,03:00,2.07
>
> Maximum value for elev is about 11.00, 5 digits.
>
> I don't understand this error because the equivalent commands for another
> data source file completes without error.
>
> What is that error message telling me?
>
> TIA,
>
> Rich
>

(Sorry, sent too early.)

There is no column 'ht'.

df <- data.frame(a = 1:5)
df$b <- as.numeric(df$b)
## Error in `$<-.data.frame`(`*tmp*`, b, value = numeric(0)) : 
##   replacement has 0 rows, data has 5
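
With the sample lines from the post (read from a string
here instead of the file), a quick look at the column names
shows the problem. It looks as if the column is called
'elev', so that is presumably what was meant:

```r
txt <- "sampdate,samptime,elev
2007-10-01,01:00,2.80
2007-10-01,01:15,2.71
2007-10-01,01:30,2.63"

stage <- read.csv(text = txt, header = TRUE, stringsAsFactors = FALSE)

names(stage)
## [1] "sampdate" "samptime" "elev"

"ht" %in% names(stage)
## [1] FALSE

## use the column that actually exists
stage$sampdate <- as.Date(stage$sampdate)
stage$elev <- as.numeric(stage$elev)
```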

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] read.csv() error

2021-09-02 Thread Enrico Schumann
On Thu, 02 Sep 2021, Rich Shepard writes:

> The first three commands in the script are:
> stage <- read.csv('../data/water/gauge-ht.dat', header
> = TRUE, sep = ',', stringsAsFactors = FALSE)
> stage$sampdate <- as.Date(stage$sampdate)
> stage$ht <- as.numeric(stage$ht, length = 6)
>
> Running the script produces this error:
>> source('stage.R')
> Error in `$<-.data.frame`(`*tmp*`, ht, value = numeric(0)) :
>   replacement has 0 rows, data has 486336
>
> Sample lines from the data file:
> sampdate,samptime,elev
> 2007-10-01,01:00,2.80
> 2007-10-01,01:15,2.71
> 2007-10-01,01:30,2.63
> 2007-10-01,01:45,2.53
> 2007-10-01,02:00,2.45
> 2007-10-01,02:15,2.36
> 2007-10-01,02:30,2.27
> 2007-10-01,02:45,2.17
> 2007-10-01,03:00,2.07
>
> Maximum value for elev is about 11.00, 5 digits.
>
> I don't understand this error because the equivalent commands for another
> data source file completes without error.
>
> What is that error message telling me?
>
> TIA,
>
> Rich
>

There is no column 'ht'.

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Show only header of str() function

2021-09-02 Thread Enrico Schumann
On Thu, 02 Sep 2021, Luigi Marongiu writes:

> Hello, is it possible to show only the header (that is: `'data.frame':
> x obs. of  y variables:` part) of the str function?
> Thank you

Perhaps one more solution. You could limit the number
of list components to be printed, though it will leave
a "truncated" message.

str(iris, list.len = 0)
## 'data.frame':   150 obs. of  5 variables:
##   [list output truncated]

Since 'str' is a generic function, you could also
define a new 'str' method. Perhaps something along
these lines:

str.data.frame.oneline <- function (object, ...) {
    cat("'data.frame':\t", nrow(object), " obs. of  ",
        (p <- length(object)),
        " variable", if (p != 1) "s", "\n", sep = "")
    invisible(NULL)
}

(which is essentially taken from 'str.data.frame').

Then:

class(iris) <- c("data.frame.oneline", class(iris))

str(iris)
## 'data.frame':  150 obs. of  5 variables

str(list(a = 1,
         list(b = 2,
              c = iris)))
## List of 2
##  $ a: num 1
##  $  :List of 2
##   ..$ b: num 2
##   ..$ c:'data.frame':   150 obs. of  5 variables
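
If changing the class of the object is not wanted, another
option (a sketch only) is to capture what 'str' prints and
keep only the first line:

```r
## capture the printed output of 'str' as a character
## vector and keep only its first element
header <- capture.output(str(iris, list.len = 0))[1]
cat(header, "\n")
```

This leaves both the data frame and 'str' itself untouched.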




-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Converting characters back to Date and Time

2021-08-31 Thread Enrico Schumann
On Tue, 31 Aug 2021, Eliza Botto writes:

> DeaR useR,
>
> I read an excel column in R having Date and time (written in the same cell) 
> as follow,
>
> 06/18/18 10:00
>
> 06/18/18 11:00
>
> 06/18/18 12:00
>
> In R environment, they are read as
>
> 43269.42
>
> 43269.46
>
> 43269.50
>
> Is there a way to covert these characters back to the original format?
>
> Thank-you very much in advance.
>
>
> Eliza Botto
>

If using a package is an option:

  library("datetimeutils")
  convert_date(c(43269.42, 43269.46, 43269.50), "excel")
  ## [1] "2018-06-18" "2018-06-18" "2018-06-18"

  convert_date(c(43269.42, 43269.46, 43269.50), "excel", fraction = TRUE)
  ## [1] "2018-06-18 10:04:48 CEST" "2018-06-18 11:02:24 CEST"
  ## [3] "2018-06-18 12:00:00 CEST"

Note that the times differ: the numbers are probably
not /displayed/ to full precision in R.
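
Without an extra package, the conversion could be sketched
in base R as follows. This assumes that the file uses
Excel's 1900 date system, in which day 0 corresponds to
1899-12-30, and it presents the times in UTC:

```r
x <- c(43269.42, 43269.46, 43269.50)

## dates only
as.Date(floor(x), origin = "1899-12-30")
## [1] "2018-06-18" "2018-06-18" "2018-06-18"

## date-times: convert days to seconds; 25569 is the
## number of days between 1899-12-30 and 1970-01-01
secs <- round((x - 25569) * 86400)
as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
## [1] "2018-06-18 10:04:48 UTC" "2018-06-18 11:02:24 UTC"
## [3] "2018-06-18 12:00:00 UTC"

## if the times are known to be full hours, round
## to hours before converting
secs.h <- round((x - 25569) * 24) * 3600
format(as.POSIXct(secs.h, origin = "1970-01-01", tz = "UTC"),
       "%Y-%m-%d %H:%M")
## [1] "2018-06-18 10:00" "2018-06-18 11:00" "2018-06-18 12:00"
```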

You may also want to search the archives of this list,
as this question has been discussed before.


-- 
Enrico Schumann (maintainer of package datetimeutils)
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] coercion to an object...

2021-08-29 Thread Enrico Schumann
On Sun, 29 Aug 2021, akshay kulkarni writes:

> Dear members,
>  I think the following question is rudimentary, 
> but I couldn't find an answer in the Internet.
>
> Suppose there is an object A, and ls() lists it as "A". How do you convert 
> this character object to the object A. i.e I want a function f such that 
> class(f("A")) = class(A) (of course, class("A") = "character") . f should 
> just coerce the character to the object represented by it.


Perhaps ?get is what you're looking for:

  A <- 42
  get("A")
  ## [1] 42
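
A few related calls, as a sketch ('B' is just a second
made-up object): 'mget' fetches several objects at once,
and 'exists' checks whether a name is defined at all.

```r
A <- 42
B <- "text"

class(get("A")) == class(A)
## [1] TRUE

## several objects at once, returned as a named list
objs <- mget(c("A", "B"))
names(objs)
## [1] "A" "B"

## check first whether an object of that name exists
exists("A")
## [1] TRUE
```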


> Thank you,
> Yours sincerely,
> AKSHAY M KULKARNI
>

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [ESS] [OT] Enabled a hook and no longer find which

2021-08-03 Thread Enrico Schumann via ESS-help
On Fri, 30 Jul 2021, Dirk Eddelbuettel via ESS-help writes:

> tl;dr: I enabled an action 'on save' and I no longer find where :-/
>
>
> Longer story: I settled upon GNU global and ggtags at some point for a
> (mostly multilingual) system of tags in R and C++ (and some more). So a few
> of source directories have these (ugly names) files GPATH, GTAGS, GRTAGS.
> And at one point I became too clever by half and somehow enabled updating of
> these files for at least some repos. But I no longer remember _how_ I did
> that and cannot, for the life of me, find any trace in my .emacs (and, older
> story, .elisp/*el) file(s). It is definitely active in ~/git/rcpp and
> ~/git/tiledb-r -- which may be the two repos I work in the most.
>
> This list has been very kind and helpful in the past when I puzzled myself with
> this one true editor.  I suspect I used an ESS hook somewhere.  How would I
> find or debug this?
>
> The net effect seems to be that as soon I save (C-x s) the files GTAGS and
> GRTAGS get updated.  Which I otherwise do via a one-line script setting the
> proper options (as GNU global at some point in the past needed a patch, I
> think that is no longer needed)
> gtags --gtagsconf=/home/edd/.globalrc --gtagslabel=pygments --verbose 
> --statistics "$@"
>
> I have been half-amused by this for a few weeks but I am approaching "white
> flag" territory here on my own workstation: how do I find out how I enabled
> this?  I looked into git commit hooks (nope), GNU global config (nope), and,
> by looking more closely at the timestamps, have to suspect that is from
> Emacs.  But nothing in the Emacs config gives it away. I have a vague memory
> that it went via one of the XDG standard directories
> (~/.config/share/SOMETHING ?) but no mas.  
>
> Dirk, only moderately amuzed by now

More wild guesses:

- maybe some Emacs minor mode does it? Any
  configuration for gtags-mode?

- do you happen to use an Emacs package named ggtags
  [https://github.com/leoliu/ggtags]? I think it
  automatically updates tag files per default (see
  'ggtags-update-on-save')

Good luck.

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
ESS-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/ess-help


Re: [R] Unexpected date format coercion

2021-07-01 Thread Enrico Schumann
On Thu, 01 Jul 2021, Jeremie Juste writes:

> Hello 
>
> On Thursday,  1 Jul 2021 at 08:25, PIKAL Petr wrote:
>> Hm.
>>
>> Seems to me, that both your codes are wrong but printing in Linux is
>> different from Windows.
>>
>> With
>> as.Date("20-12-2020","%Y-%m-%d")
>> you say that 20 is year (actually year 20) and 2020 is day and only first
>> two values are taken (but with some values the result is NA)
>>
>> I can confirm 4.0.3 in Windows behaves this way too.
>>> as.Date("20-12-2020","%Y-%m-%d")
>> [1] "0020-12-20"
>
> Many thanks for confirming this.
>
>
> On Thursday,  1 Jul 2021 at 18:22, Jim Lemon wrote:
>> Hi Jeremie,
>> Try:
>>
>> as.Date("20-12-2020","%y-%m-%d")
>> [1] "2020-12-20"
>
> Thanks for this info. I'm looking for something that produce NA if the
> date is not exactly in the specified format so that it can be
> corrected. I was relying on the format parameter of the date for that.
>
> The issue is that there can be so many variations in date format that for the
> time being I still find it easier to delegate the correction to the user. A
> particularly nasty case is when there are multiple date formats in the same
> column.
>
>
> Best regards,
> Jeremie
>

You could explicitly test whether the specified format
is as expected, perhaps with a regex such as

s <- c("2020-01-20", "20-12-2020")
grepl("^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$", s)

and/or by checking the components of the dates:

valid_Date <- function(s) {
    tmp <- strsplit(s, "[-]")

    year <- as.numeric(sapply(tmp, `[[`, 1))
    valid.year <- year < 2500 & year > 1800

    month <- as.numeric(sapply(tmp, `[[`, 2))
    valid.month <- month >= 1 & month <= 12

    day <- as.numeric(sapply(tmp, `[[`, 3))
    valid.day <- day >= 1 & day <= 31

    ## explicit format, so that unparseable strings give NA
    ans <- as.Date(s, format = "%Y-%m-%d")
    ans[!(valid.year & valid.month & valid.day)] <- NA
    ans
}
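
The two checks could also be combined. A sketch (the
function name 'strict_date' and its defaults are made up
for illustration): strings that do not match the pattern
become NA, and so do strings that match the pattern but are
not valid dates.

```r
strict_date <- function(s,
                        format = "%Y-%m-%d",
                        pattern = "^[0-9]{4}-[0-9]{2}-[0-9]{2}$") {
    ans <- rep(as.Date(NA), length(s))
    ok <- grepl(pattern, s)
    ## only strings with the expected layout are parsed;
    ## impossible dates such as 30 Feb become NA in as.Date
    ans[ok] <- as.Date(s[ok], format = format)
    ans
}

res <- strict_date(c("2020-01-20", "20-12-2020", "2020-02-30"))
is.na(res)
## [1] FALSE  TRUE  TRUE
```

The NA positions can then be handed back to the user for
correction.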



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Finding the package providing funtion "%du%"

2021-06-30 Thread Enrico Schumann
On Wed, 30 Jun 2021, Federico Calboli writes:

> Hello All,
>
> I am playing with igraph (which seems to work for what I have used it for).
> Nevertheless:
>
> demo('community', package="igraph")
>
>
>
>   demo(community)
>    ~
>
> Type  <Return>   to start :
>
>> pause <- function() {}
>
>> ### A modular graph has dense subgraphs
>> mod <- make_full_graph(10) %du% make_full_graph(10) %du% make_full_graph(10)
> Error in make_full_graph(10) %du% make_full_graph(10) %du% 
> make_full_graph(10) : 
>  could not find function "%du%”
>
>
> For the life of me I cannot find where %du% is meant
> to come from.  Any clues?  Also, any suggestion (other
> than Google) to *efficiently* find whatever
> dependencies I might be missing?  I did install igraph
> with dependencies = T… but one never knows.
>
> My: 
>> sessionInfo()
> R version 4.1.0 (2021-05-18)
> Platform: x86_64-apple-darwin17.0 (64-bit)
> Running under: macOS Mojave 10.14.6
>
> Matrix products: default
> BLAS:   
> /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
> LAPACK: 
> /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
>
> locale:
> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base 
>
> other attached packages:
> [1] dplyr_1.0.7magrittr_2.0.1
>
> loaded via a namespace (and not attached):
> [1] fansi_0.5.0  utf8_1.2.1   crayon_1.4.1 R6_2.5.0 
> lifecycle_1.0.0 pillar_1.6.1 rlang_0.4.11 vctrs_0.3.8  
> generics_0.1.0  
> [10] ellipsis_0.3.2   tools_4.1.0  glue_1.4.2   purrr_0.3.4  
> compiler_4.1.0   pkgconfig_2.0.3  tidyselect_1.1.1 tibble_3.1.2  
>
>
>
>
>
>
>
> --
> Federico Calboli
> LBEG - Laboratory of Biodiversity and Evolutionary Genomics
> Charles Deberiotstraat 32 box 2439
> 3000 Leuven
> +32 16 32 87 67
>

You need to quote the name:

   ?`%du%`
   igraph::`%du%`


You may need to say

   library("igraph")

before you run the demo.
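
As for finding where a function comes from: a sketch,
with `%in%` as a stand-in, so that the example runs
without igraph. Both functions only look at packages that
are attached or loaded in the current session.

```r
## which attached package provides the function?
find("%in%")
## [1] "package:base"

## also looks into loaded namespaces
getAnywhere("%in%")$where

## for packages that are installed but not loaded,
## searching the help system may work:
##   help.search("%du%")
```

If 'find' returns character(0), the package is probably
not attached, which is what the error message in the demo
suggests.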



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Date Time, as.POSIXct used locale, strange plot behavior

2021-04-30 Thread Enrico Schumann
On Fri, 30 Apr 2021, Tilmann Faul writes:

> Dear Jeff,
>
> Thanks for your answer.
> Sys.timezone()  gives
> [1] "Europe/Berlin"
> I tried "Europe/Berlin" as tz argument, giving the same result as using
> "CEST" (Central European Summer Time).
> It seems to me that using as.POSIXct without tz argument defaults to tz
> UTC and with tz argument, either "CEST" or "Europ/Berlin" uses the
> European tz, regarding the plot.
> Nevertheless, I do not understand why all of them have the same time
> printout on my system.
>
> as.POSIXct("2021-04-21 00:00:00", tz="CEST")
> # [1] "2021-04-21 CEST"
> as.POSIXct("2021-04-21 00:00:00", tz="Europ/Berlin")
> # [1] "2021-04-21 Europ"
> as.POSIXct("2021-04-21 00:00:00")
> # [1] "2021-04-21 CEST"
>
> Can someone comment on that, please?
>
> best Regards
> Tilmann

Timezone names in general are not portable (i.e. not
safe to use).  You may always specify a timezone name,
but it may be ignored:

as.POSIXct("2021-04-21 00:00:00", tz = "PartyTime")
## "2021-04-21 PartyTime"

So just because your system prints "CEST", it does not
mean it has recognised it as a timezone. (To make it
even more complicated: your system may display a time
as "CEST", meaning a time in Central European Summer
Time, but still not accept "CEST" as _input_ for a
timezone name.)

'POSIXct' represents time as a number (seconds since
1970). To compare the results of different calls, it is
easier to compare those numbers.

dput(as.POSIXct("2021-04-21 00:00:00", tz="CEST"))
## structure(1618963200, class = c("POSIXct", "POSIXt"), tzone = "CEST")
dput(as.POSIXct("2021-04-21 00:00:00", tz="PartyTime"))
## structure(1618963200, class = c("POSIXct", "POSIXt"), tzone = "PartyTime")

dput(as.POSIXct("2021-04-21 00:00:00", tz="Europe/Berlin"))
## structure(1618956000, class = c("POSIXct", "POSIXt"), tzone = "Europe/Berlin")
dput(as.POSIXct("2021-04-21 00:00:00"))
## structure(1618956000, class = c("POSIXct", "POSIXt"), tzone = "")

So in your case, "CEST" is (most likely) not recognised
as a valid input for a timezone name, and hence it is
ignored, but still displayed.
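
One way to check a name before using it, as a sketch (the
helper 'valid_tz' is made up): compare it with the names
that the system's timezone database knows about.

```r
## hypothetical helper: is 'tz' a known timezone name?
valid_tz <- function(tz)
    tz %in% OlsonNames()

## the result depends on the system; "PartyTime" should
## be FALSE everywhere
valid_tz(c("Europe/Berlin", "CEST", "PartyTime"))

t.utc    <- as.POSIXct("2021-04-21 00:00:00", tz = "UTC")
t.berlin <- as.POSIXct("2021-04-21 00:00:00", tz = "Europe/Berlin")

as.numeric(t.utc)
## [1] 1618963200

## 7200 (two hours) where "Europe/Berlin" is recognised
as.numeric(t.utc) - as.numeric(t.berlin)
```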

HTH
Enrico


> On 29.04.21 23:19, Jeff Newmiller wrote:
>> What is your TZ environment variable set to? That's what time conversion 
>> defaults to  ?DateTimeClasses
>> 
>> Also, I am not sure CEST is a valid timezone designation... it can be system 
>> dependent, but using one of the elements listed in ?OlsonNames.
>> 
>> On April 29, 2021 12:22:44 PM PDT, Tilmann Faul  
>> wrote:
>>> Hy,
>>>
>>> stumbled over the following problem while plotting DateTime Objects.
>>>
>>> plot(as.POSIXct(c("2021-04-21 00:00:00", "2021-04-21 23:59:59")), c(0,
>>> 1), type='l')
>>>
>>> arrows(as.POSIXct("2021-04-21 00:00:00", tz="CEST"),
>>>   0.3,
>>>   as.POSIXct("2021-04-21 00:00:00", tz="CEST"),
>>>   0.2,
>>>   length=0.07, angle=15)
>>>
>>> # arrow at 02:00, why?
>>>
>>> arrows(as.POSIXct("2021-04-21 00:00:00"),
>>>   0.3,
>>>   as.POSIXct("2021-04-21 00:00:00"),
>>>   0.2,
>>>   length=0.07, angle=15, col='red')
>>>
>>> # arrow at 00:00 as expected
>>>
>>> as.POSIXct(c("2021-04-21 00:00:00", "2021-04-21 23:59:59"))[1]
>>> # [1] "2021-04-21 CEST"
>>> as.POSIXct("2021-04-21 00:00:00", tz="CEST")
>>> # [1] "2021-04-21 CEST"
>>> as.POSIXct("2021-04-21 00:00:00")
>>> # [1] "2021-04-21 CEST"
>>>
>>> all representations on my system are the same, why is the plot location
>>> of the arrows different??
>>> I am located in Germany, my locale:
>>> Sys.getlocale()
>>> [1]
>>> "LC_CTYPE=de_DE.UTF-8;LC_NUMERIC=C;LC_TIME=de_DE.UTF-8;LC_COLLATE=de_DE.UTF-8;LC_MONETARY=de_DE.UTF-8;LC_MESSAGES=de_DE.UTF-8;LC_PAPER=de_DE.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=de_DE.UTF-8;LC_IDENTIFICATION=C"
>>>
>>> Any Idea?
>>>
>>> Best regards
>>> Tilmann
>>>
>>
>

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] union of two sets are smaller than one set?

2021-01-31 Thread Enrico Schumann
On Sun, 31 Jan 2021, Martin Møller Skarbiniks Pedersen writes:

> This is really puzzling me and when I try to make a small example
> everything works like expected.
>
> The problem:
>
> I got these two large vectors of strings.
>
>> str(s1)
>  chr [1:766608] "0.dk" ...
>> str(s2)
>  chr [1:59387] "043.dk" "0606.dk" "0618.dk" "0888.dk" "0iq.dk" "0it.dk" ...
>
> And I need to create the union-set of s1 and s2.
> I expect the size of the union-set to be between 766608 and 766608+59387.
> However it is 681193 which is less than the number of elements in s1!
>
>> length(base::union(s1, s2))
> [1] 681193
>
> Any hints?
>
> Regards
> Martin
>

Duplicates?
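
A tiny made-up example: 'union' drops duplicated elements,
so the result can be shorter than one of the inputs.

```r
s1 <- c("a.dk", "a.dk", "a.dk", "b.dk")
s2 <- "c.dk"

length(s1)
## [1] 4
length(union(s1, s2))
## [1] 3

## check for duplicates
anyDuplicated(s1) > 0
## [1] TRUE
length(unique(s1))
## [1] 2
```

For the real data, comparing length(unique(s1)) with
length(s1) should show whether this is the explanation.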

kind regards
Enrico

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] FREDR and R 3.6

2020-10-30 Thread Enrico Schumann
On Thu, 29 Oct 2020, H writes:

> I tried to install the fredr package yesterday to
> access the data series hosted by the St. Louis Fed but
> my installation of R, version 3.6, tells me it is not
> available from a cran repository.
>
> I could not find any information on this on the fredr information package and 
> was wondering if anyone here might know?
>

Just for completeness: there is also the 'alfred' package
(https://cran.r-project.org/package=alfred), with which
you can also access data of the St. Louis Fed.

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] How to convert column from millisecond epoch time to yyyy-mm-dd GMT

2020-06-23 Thread Enrico Schumann
On Tue, 23 Jun 2020, Gregg via R-help writes:

> Hello to all the smart people out there
>
> I have a data.frame labeled itsm_service_type_field. I need to convert
> the Timestamp field which is epoch time in milliseconds to a
> yyyy-mm-dd GMT Date.
>
> Data.frame format is below.
>
> I've attempted to use the lapply and as.POSIXct
> functions to convert the time field in the original
> data.frame to a new data.frame I've labeled
> "itsm_service_type_field_adjusted_time",
> but I've got the order and syntax wrong.
>
> Help would be so much appreciated.
>
> Thanks in advance.
> Gregg
> Arizona
>
> Details - See Below:
>
> itsm_service_type_field <- fread("itsm_service_type_2018-2019_CONUS.csv")
>
>>>>>>>>>>>>>>???itsm_service_type_field_adjusted_time <- 
>>>>>>>>>>>>>>lapply(itsm_service_type_field[ , Timestamp], 
>>>>>>>>>>>>>>as.POSIXct(Timestamp, origin="1970-01-01", tz="GMT"))
>
> head(itsm_service_type_field)
>                 Id     Timestamp   Data Type Visibility TYPE_SERVICE
> 1: INCBR0005072277 1577059200000 itsm-ticket          U            0
> 2: INCBR0005073034 1577059200000 itsm-ticket          U            1
> 3: INCBR0005073131 1577059200000 itsm-ticket          U            0
> 4: INCBR0005074186 1577059200000 itsm-ticket          U            0
> 5: INCBR0005074188 1577059200000 itsm-ticket          U            0
> 6: INCBR0005074546 1577059200000 itsm-ticket          U            0
>

You should divide by 1000:

.POSIXct(1577059200000/1000, tz = "GMT")
## [1] "2019-12-23 GMT"

as.POSIXct(1577059200000/1000,
           origin = as.POSIXct("1970-01-01 00:00:00", tz = "GMT"),
           tz = "GMT")
## [1] "2019-12-23 GMT"
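
Applied to a whole column, this could look as follows. A
sketch with made-up data in a plain data.frame; only the
column name 'Timestamp' is taken from the post:

```r
## hypothetical data, timestamps in milliseconds
df <- data.frame(Id = c("INC1", "INC2"),
                 Timestamp = c(1577059200000, 1577145600000))

## as.POSIXct and friends are vectorised, so lapply
## is not needed
df$Date <- as.Date(.POSIXct(df$Timestamp / 1000, tz = "GMT"),
                   tz = "GMT")
format(df$Date)
## [1] "2019-12-23" "2019-12-24"
```

If fread has stored the column as integer64, converting it
with as.numeric first may be necessary.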


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Output multiple sheets to Excel files with openxlsx::write.xlsx

2020-05-27 Thread Enrico Schumann
On Wed, 27 May 2020, John writes:

> Hi,
>
>This is my code a few years ago. I was able to output multiple sheet to
> an excel file. Nevertheless, the "append" argument appears to be obsolete.
> Now I see only one sheet, the latest added sheet, in the output. Is there
> any other way to do it with openxlsx::write.xlsx or other
> functions/packages?
>
>
> openxlsx::write.xlsx(df1, file=fl_out, sheetName="a",
>  col.names=TRUE, row.names=FALSE, append=TRUE, showNA=FALSE)
>
> openxlsx::write.xlsx(df2, file=fl_out, sheetName="b",
>  col.names=TRUE, row.names=FALSE,
> append=TRUE, showNA=FALSE)
>
> Thanks!!
>

I think you need to create a workbook first, then add
the sheets, and finally write the workbook to a file.
Something like this:

df <- data.frame(a = 1:3,
                 b = 4:6)

library("openxlsx")
wb <- createWorkbook()

sheet <- "sheet1"
addWorksheet(wb, sheet)
writeData(wb, sheet = sheet, x = df)

sheet <- "sheet2"
addWorksheet(wb, sheet)
writeData(wb, sheet = sheet, x = df + 1)

saveWorkbook(wb, file = "~/Desktop/two_sheets.xlsx")
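
As a shorter variant (a sketch, assuming a reasonably recent openxlsx): write.xlsx also accepts a named list of data frames and then writes one sheet per list element, with the list names used as sheet names. The data frames 'df1' and 'df2' and the temporary output file are made up for illustration.

```r
df1 <- data.frame(a = 1:3, b = 4:6)
df2 <- df1 + 1
sheets <- list(a = df1, b = df2)   ## list names become sheet names

out <- tempfile(fileext = ".xlsx") ## hypothetical output location
sheet_names <- NULL
if (requireNamespace("openxlsx", quietly = TRUE)) {
    openxlsx::write.xlsx(sheets, file = out)
    ## read the sheet names back to check that both sheets were written
    sheet_names <- openxlsx::getSheetNames(out)
}
```

The workbook approach above gives more control (styles, column widths); the list approach is convenient when the sheets are plain tables.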



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Date from text

2020-05-15 Thread Enrico Schumann
On Fri, 15 May 2020, Poizot Emmanuel writes:

> Dear all,
>
> I've a data frame with a column "Date":
>
> [1] 11-1993 11-1993 11-1993 11-1993 11-1993 11-1993 11-1996 11-1996 11-1996
> [10] 11-1996 11-1996 11-1996 02-1998 02-1998 02-1998 02-1998 02-1998 02-1998
> [19] 11-1998 11-1998 11-1998 11-1998 11-1998 11-1998 10-2001 10-2001 10-2001
> [28] 10-2001 10-2001 10-2001 02-2003 02-2003 02-2003 02-2003 02-2003 02-2003
> [37] 11-2004 11-2004 11-2004 11-2004 11-2004 11-2004 11-2005 11-2005 11-2005
> [46] 11-2005 11-2005 11-2005 11-2007 11-2007 11-2007 11-2007 11-2007 11-2007
> [55] 10-2008 10-2008 10-2008 10-2008 10-2008 10-2008 03-2009 03-2009 03-2009
> [64] 03-2009 03-2009 03-2009 10-2012 10-2012 10-2012 10-2012 10-2012 10-2012
> [73] 12-2017 12-2017 12-2017 12-2017 12-2017 12-2017 12-2018 12-2018 12-2018
> [82] 12-2018 12-2018 12-2018
>
> I want to convert that into real dates:
> as.POSIXct(Date, format="%m-%Y") always return "NA" values.
> Where am I wrong ?
>
> regards

If you really want a date, I'd suggest 'as.Date'. The help
for ?as.Date says:

  "If the date string does not specify the date completely,
   the returned answer may be system-specific."

So perhaps try something like

as.Date(paste0("01-", "11-1993"), format = "%d-%m-%Y")
## [1] "1993-11-01"

Or look at 'yearmon' in package 'zoo':

library("zoo")
as.yearmon("11-1993", format = "%m-%Y")
## [1] "Nov 1993"
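
Applied to a whole column, the first suggestion might look like the sketch below. The short vector 'Date' only stands in for the column shown in the question; if the column is a factor, paste0 uses its labels, so the same call should work.

```r
Date <- c("11-1993", "02-1998", "12-2018")

## prepend an arbitrary day (here the 1st) so the date is complete
d <- as.Date(paste0("01-", Date), format = "%d-%m-%Y")
format(d)
## [1] "1993-11-01" "1998-02-01" "2018-12-01"
```

The choice of the first day of the month is an assumption; any fixed day would do, as long as it exists in every month.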

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sometimes commands do not terminate after upgrading to R 4.0 and Ubuntu 20.04

2020-05-10 Thread Enrico Schumann
>>>>> "Adrien" == Adrien FABRE  writes:

  Adrien> I have upgraded R (from 3.6 to 4.0) and RStudio (from 1.1 to 1.2.5) a few
  Adrien> days ago, and Ubuntu from 18.04 to 20.04 yesterday.

  Adrien> Since then, R sometimes never terminates when executing certain commands:
  Adrien> ivreg (from package AER), summary (of a logit regression) and logitmfx
  Adrien> (from package mfx). Sometimes these commands run fine, but most of the time
  Adrien> I have to kill the process because R won't terminate the execution, even
  Adrien> when pressing the red Stop button in RStudio.

  Adrien> When I tried example('AER'), it worked fine. Then I re-installed the
  Adrien> package AER. It threw 10 warnings of type In readLines(file, skipNul =
  Adrien> TRUE) :  cannot open compressed file
  Adrien> '/usr/lib/R/site-library/[package]/DESCRIPTION', probable reason 'No such
  Adrien> file or directory' where [package] is abind, colorspace, dichromat... (but
  Adrien> not AER).

  Adrien> Since then example('AER') throws a warning: no help found for ‘AER’.

  Adrien> I've removed and reinstalled R 4.0: it didn't help. Besides, the apt purge
  Adrien> r-base* r-recommended r-cran-* threw a warning: dpkg: warning: while
  Adrien> removing r-base-core, directory '/usr/lib/R/site-library' not empty so not
  Adrien> removed. Also, there was a bunch of Package [package] is not installed, so
  Adrien> not removed, including for [package] equal to r-cran-abind and the other
  Adrien> listed above (this purge also returned a bunch of Note, selecting [package]
  Adrien> for glob 'r-cran-*').

  Adrien> I have the same bug when using R from the terminal. For the record, I was
  Adrien> probably working on RStudio during the upgrade to Ubuntu 20.04. Also, I
  Adrien> can't recall if this issue started after I upgraded R and RStudio (which
  Adrien> would be my best guess) or after I upgraded Ubuntu (a day or two later).

  Adrien> I hope someone can help.

There has been a discussion on R-SIG-Debian recently,
and /perhaps/ it is related to your troubles.

See https://stat.ethz.ch/pipermail/r-sig-debian/2020-April/003159.html
and in particular
https://stat.ethz.ch/pipermail/r-sig-debian/2020-April/003166.html
.


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Convert Long DOY time format to yyyymmdd hh format

2020-01-23 Thread Enrico Schumann



Quoting Ogbos Okike :


Dear Experts,
I have a data spanning 56 years from 1963 to 2018.
The datetime format is in DOY hour:
1963 335 0
1963 335 1
1963 335 2
1963 335 3
1963 335 4
1963 335 5
1963 335 6
1963 335 7
1963 335 8
1963 335 9
1996 202 20
1996 202 21
1996 202 22
1996 202 23
1996 203 0
1996 203 1
1996 203 2
1996 203 3
2018 365 20
2018 365 21
2018 365 22
2018 365 23
When I used:
as.Date(335,origin="1963-01-01"), for the first row, I got:
[1] "1963-12-02"
This is the format I want, though it is not yet complete. Time is missing.

Again, I can't be doing this one after the other. I guess you have a
better way of handling this. I have spent some time trying to get it
right but I am really stuck. I would be most glad if you could spare
your busy time to help me again.

Thank you very much for your usual kind assistance.

Best regards
Ogbos



Perhaps something like this:

read.table(text="
1963 335 0
1963 335 1
1963 335 2
1963 335 3
1963 335 4
1963 335 5
1963 335 6
1963 335 7
1963 335 8
1963 335 9
1996 202 20
1996 202 21
1996 202 22
1996 202 23
1996 203 0
1996 203 1
1996 203 2
1996 203 3
2018 365 20
2018 365 21
2018 365 22
2018 365 23
", header = FALSE, sep = " ") -> data

as.POSIXct(paste(as.Date(paste0(data[[1]], "-1-1")) + data[[2]] - 1,
                 data[[3]]),
           format = "%Y-%m-%d %H")

You might want to specify a different timezone, and also check for
"off-by-one" errors when it comes to day of year.
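
An alternative sketch that lets the parser handle the day of year directly, via the "%j" conversion of strptime. The three rows are taken from the sample data; the column names and the choice of UTC as timezone are assumptions for illustration.

```r
data <- data.frame(year = c(1963L, 1996L, 2018L),
                   doy  = c(335L, 202L, 365L),
                   hour = c(0L, 20L, 23L))

## zero-pad so that every field has a fixed width
stamp <- sprintf("%04d %03d %02d", data$year, data$doy, data$hour)

## %j = day of year (001-366), %H = hour
x <- as.POSIXct(strptime(stamp, format = "%Y %j %H", tz = "UTC"))
format(x, "%Y-%m-%d %H:%M")
## [1] "1963-12-01 00:00" "1996-07-20 20:00" "2018-12-31 23:00"
```

Note that day 335 of 1963 is 1 December, not 2 December: adding 335 to an origin of 1 January counts one day too many, which is the off-by-one issue mentioned above.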

--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reporting missing dates

2020-01-16 Thread Enrico Schumann



Quoting Duncan Murdoch :


On 15/01/2020 4:28 p.m., Jeff Reichman wrote:

R-help Forum

I have a 20 year data set and I am looking for a way to find missing dates.
I wrote this and its works, but am wounding if there is a better way?

d <- c('2020-01-01', '2020-01-02', '2020-01-04', '2020-01-05')
d <- as.Date(d)
date_range <- seq(min(d), max(d), by = 1)
date_range[!date_range %in% d]


Another approach would be based on diff(d) - 1.  That will count the  
number of missing dates between any pair of dates that are present:


diff(d) - 1
# Time differences in days
# [1] 0 1 0

That shows that the second date is followed by one missing day.

Duncan Murdoch


But you might want to check if the dates in 'd' are really sorted.
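
A small sketch of what can go wrong, and one way to guard against it. The unsorted vector below is a made-up variation of the example in the question.

```r
d <- as.Date(c("2020-01-05", "2020-01-01", "2020-01-04", "2020-01-02"))

is.unsorted(d)   ## TRUE: diff(d) would give misleading (negative) gaps

## sort (and drop duplicates) before looking at the gaps
d <- sort(unique(d))
gaps <- as.integer(diff(d)) - 1L
gaps
## [1] 0 1 0

## the missing dates themselves, as in the original approach
all_days <- seq(min(d), max(d), by = "day")
missing <- all_days[!all_days %in% d]
format(missing)
## [1] "2020-01-03"
```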

--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Incorrect Conversion of Datetime

2020-01-08 Thread Enrico Schumann
>>>>> On Wed, 8 Jan 2020 17:43:29 +0100, Ogbos Okike  
>>>>> writes:

  Ogbos> Dear Enrico,
  Ogbos> Thanks for your time.
  Ogbos> I have tried to learn how to use dput in R. I have not yet made much
  Ogbos> progress.

  Ogbos> I have succeeded in using dput to store my data frame. I first
  Ogbos> converted my data into a data frame and then used:
  Ogbos> dput(dd,file="Ogbos2",control = c("keepNA", "keepInteger",
  Ogbos> "showAttributes")) to output the dput file. dd is my data frame.

  Ogbos> When I opened the file, I didn't like its content as it differs very
  Ogbos> much from my data frame. But I don't know whether that makes sense to
  Ogbos> you. I am attaching the file.
  Ogbos> I am thanking you in advance for additional suggestions.
  Ogbos> Best wishes
  Ogbos> Ogbos

Hello Ogbos

your attempt worked fine: this is the data you sent


dta <- structure(list(dta.year = c(98L, 98L, 98L, 98L, 98L, 98L, 98L, 
  98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 
  98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L), dta.month = c(1L, 
  1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
  1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), dta.day = c(5L, 
  5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
  5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L), dta.hour = c(2L, 
  3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 
  17L, 18L, 19L, 20L, 21L, 22L, 23L, 0L, 1L, 2L, 3L, 4L, 5L, 6L
  ), dta.counts = c(6462L, 6450L, 6423L, 6467L, 6480L, 6457L, 6417L, 
  6467L, 6467L, 6468L, 6500L, 6482L, 6465L, 6465L, 6475L, 6452L, 
  6440L, 6478L, 6470L, 6422L, 6448L, 6462L, 6485L, 6462L, 6485L, 
  6470L, 6487L, 6515L, 6488L)), .Names = c("dta.year", "dta.month", 
  "dta.day", "dta.hour", "dta.counts"), row.names = c(NA, -29L),
  class = "data.frame")

But then:

  head(dta)
  ##   dta.year dta.month dta.day dta.hour dta.counts
  ## 1       98         1       5        2       6462
  ## 2       98         1       5        3       6450
  ## 3       98         1       5        4       6423
  ## 4       98         1       5        5       6467
  ## 5       98         1       5        6       6480
  ## 6       98         1       5        7       6457

The data that you read in has January (1) as month.
So whatever goes wrong seems to go wrong when you read
the data.  Are you quite sure that the file you read
is the file you have shown?


kind regards
Enrico

  Ogbos> On Wed, Jan 8, 2020 at 1:07 PM Enrico Schumann 
 wrote:
  >> 
  >> 
  >> Quoting Ogbos Okike :
  >> 
  >> > Dear Friends,
  >> > A sample of my data is:
  >> > 98 05 01 02 8541
  >> > 98 05 01 03 8548
  >> > 98 05 01 04 8512
  >> > 98 05 01 05 8541
  >> > 98 05 01 06 8509
  >> > 98 05 01 07 8472
  >> > 98 05 01 08 8454
  >> > 98 05 01 09 8461
  >> > 98 05 01 10 8462
  >> > 98 05 01 11 8475
  >> > 98 05 01 12 8433
  >> > 98 05 01 13 8479
  >> > 98 05 01 14 8417
  >> > 98 05 01 15 8463
  >> > 98 05 01 16 8473
  >> > 98 05 01 17 8450
  >> > 98 05 01 18 8433
  >> > 98 05 01 19 8437
  >> > 98 05 01 20 8437
  >> > 98 05 01 21 8438
  >> > 98 05 01 22 8421
  >> > 98 05 01 23 8420
  >> > 98 05 02 00 8371
  >> > 98 05 02 01 8338
  >> > 98 05 02 02 8251
  >> > 98 05 02 03 8204
  >> > 98 05 02 04 8183
  >> > 98 05 02 05 8231
  >> > 98 05 02 06 8242
  >> > Columns 1, 2, 3, 4 and 5 stands for year, month, day , hour and count.
  >> >
  >> > Using:
  >> > Sys.setenv( TZ="GMT" )
  >> >
  >> >
  >> > dta <- read.table("Ohr1may98", col.names = c("year", "month", "day",
  >> > "hour", "counts"))
  >> > dta$year <- with( dta, ifelse(year < 50, year + 2000, year + 1900))
  >> > dta$datetime <- with( dta, as.POSIXct(ISOdatetime(year, month,day,hour,0,0)))
  >> > a =  dta$datetime
  >> > I converted the datetime and plotted the graph of count vs a. The plot
  >> > was great but I have issues with the date.
  >> >
  >> > The raw data is for some hours for Ist and second day of may 1998 as
  >> > is evident from the sample data. But the result of date stored in "a"
  >> > above shows:
  >> >> a
  >> >  [1] "1998-01-05 02:00:00 GMT" "19

Re: [R] Incorrect Conversion of Datetime

2020-01-08 Thread Enrico Schumann



Quoting Ogbos Okike :


Dear Friends,
A sample of my data is:
98 05 01 02 8541
98 05 01 03 8548
98 05 01 04 8512
98 05 01 05 8541
98 05 01 06 8509
98 05 01 07 8472
98 05 01 08 8454
98 05 01 09 8461
98 05 01 10 8462
98 05 01 11 8475
98 05 01 12 8433
98 05 01 13 8479
98 05 01 14 8417
98 05 01 15 8463
98 05 01 16 8473
98 05 01 17 8450
98 05 01 18 8433
98 05 01 19 8437
98 05 01 20 8437
98 05 01 21 8438
98 05 01 22 8421
98 05 01 23 8420
98 05 02 00 8371
98 05 02 01 8338
98 05 02 02 8251
98 05 02 03 8204
98 05 02 04 8183
98 05 02 05 8231
98 05 02 06 8242
Columns 1, 2, 3, 4 and 5 stands for year, month, day , hour and count.

Using:
Sys.setenv( TZ="GMT" )


dta <- read.table("Ohr1may98", col.names = c("year", "month", "day",
"hour", "counts"))
dta$year <- with( dta, ifelse(year < 50, year + 2000, year + 1900))
dta$datetime <- with( dta, as.POSIXct(ISOdatetime(year, month,day,hour,0,0)))
a =  dta$datetime
I converted the datetime and plotted the graph of count vs a. The plot
was great but I have issues with the date.

The raw data is for some hours for Ist and second day of may 1998 as
is evident from the sample data. But the result of date stored in "a"
above shows:

a

 [1] "1998-01-05 02:00:00 GMT" "1998-01-05 03:00:00 GMT"
 [3] "1998-01-05 04:00:00 GMT" "1998-01-05 05:00:00 GMT"
 [5] "1998-01-05 06:00:00 GMT" "1998-01-05 07:00:00 GMT"
 [7] "1998-01-05 08:00:00 GMT" "1998-01-05 09:00:00 GMT"
 [9] "1998-01-05 10:00:00 GMT" "1998-01-05 11:00:00 GMT"
[11] "1998-01-05 12:00:00 GMT" "1998-01-05 13:00:00 GMT"
[13] "1998-01-05 14:00:00 GMT" "1998-01-05 15:00:00 GMT"
[15] "1998-01-05 16:00:00 GMT" "1998-01-05 17:00:00 GMT"
[17] "1998-01-05 18:00:00 GMT" "1998-01-05 19:00:00 GMT"
[19] "1998-01-05 20:00:00 GMT" "1998-01-05 21:00:00 GMT"
[21] "1998-01-05 22:00:00 GMT" "1998-01-05 23:00:00 GMT"
[23] "1998-01-06 00:00:00 GMT" "1998-01-06 01:00:00 GMT"
[25] "1998-01-06 02:00:00 GMT" "1998-01-06 03:00:00 GMT"
[27] "1998-01-06 04:00:00 GMT" "1998-01-06 05:00:00 GMT"
[29] "1998-01-06 06:00:00 GMT"
This seems to suggest day 5 and 6 in January 1998 instead of day 1 and
2 in May of 1998.

I have spent some time trying to resolve this but I have not been successful.

I would be thankful if you could help me to check where I went astray.

Thank you.
Best wishes
Ogbos



I cannot reproduce these results. Could you please provide a fully
reproducible example, by providing a small example dataset via 'dput(dta)'?
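
For what it is worth, a sketch of such a check: three rows of the sample are read from text instead of from the file "Ohr1may98", on the assumption that the file has five whitespace-separated columns (year, month, day, hour, count) as described. With this input, the code from the question gives May 1998.

```r
txt <- "98 05 01 02 8541
98 05 01 03 8548
98 05 02 00 8371"

dta <- read.table(text = txt,
                  col.names = c("year", "month", "day", "hour", "counts"))
dta$year <- ifelse(dta$year < 50, dta$year + 2000, dta$year + 1900)

## ISOdatetime already returns POSIXct; the timezone is set explicitly
dta$datetime <- ISOdatetime(dta$year, dta$month, dta$day,
                            dta$hour, 0, 0, tz = "GMT")
format(dta$datetime, "%Y-%m-%d %H:%M")
## [1] "1998-05-01 02:00" "1998-05-01 03:00" "1998-05-02 00:00"
```

If the real file gives January instead, the columns in that file are probably not in the order shown in the sample.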


--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] outer join of xts's

2020-01-02 Thread Enrico Schumann



Quoting Eric Berger :


Hi,
I have a list L of about 2,600 xts's.
Each xts has a single numeric column. About 90% of the xts's have
approximately 500 rows, and the rest have fewer than 500 rows.
I create a single xts using the command

myXts <- Reduce( merge.xts, L )

By default, merge.xts() does an outer join (which is what I want).

The command takes about 80 seconds to complete.
I have plenty of RAM on my computer.

Are there faster ways to accomplish this task?

Thanks,
Eric



Since you already know the number of series and all possible timestamps,
you could preallocate a matrix (number of timestamps times number of series).
You could use the fastmatch package to match the timestamps against the rows.
This is what 'pricetable' in the PMwR package does.  Calling

library("PMwR")
do.call(pricetable, L)

should give you a matrix of the merged series, with an attribute 'timestamp',
from which you could create an xts object again.

I am not sure if it is the fastest way, but it's probably faster than calling
merge repeatedly.
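
The preallocate-and-match idea can be sketched in base R as follows. This is only an illustration and not the code of 'pricetable': plain lists with a timestamp vector 't' and values 'x' stand in for the xts objects, and base 'match' is used where fastmatch::fmatch could be substituted.

```r
## two toy series with partly overlapping timestamps
series <- list(
    a = list(t = as.Date("2020-01-01") + 0:2,     x = c(1, 2, 3)),
    b = list(t = as.Date("2020-01-01") + c(1, 3), x = c(10, 30)))

## union of all timestamps, sorted
all_t <- sort(unique(do.call(c, unname(lapply(series, `[[`, "t")))))

## preallocate: one row per timestamp, one column per series
M <- matrix(NA_real_, nrow = length(all_t), ncol = length(series),
            dimnames = list(format(all_t), names(series)))

## fill each column by matching its timestamps against the rows
for (j in seq_along(series)) {
    i <- match(series[[j]]$t, all_t)
    M[i, j] <- series[[j]]$x
}
M
```

Each series is touched exactly once, whereas Reduce with merge rebuilds an ever-growing object 2,600 times.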

kind regards
Enrico  (the maintainer of PMwR)

--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] date

2019-12-19 Thread Enrico Schumann



Quoting Eric Berger :


Martin  writes: "there's really no reason for going beyond base R"

I disagree. Lubridate is a fantastic package. I use it all the time. It
makes working with dates really easy, as evidenced by John Kane's
suggestion. I strongly recommend learning to work with it.

The bottom line: as is often the case, there are many different ways to
accomplish a task in R.


I apologise beforehand if this sparks an unnecessary discussion ;-)

But the important point is:
If you know the structure of the data you want to
parse, then it is best to tell R (or any other language)
this structure explicitly.
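
A small illustration in base R, using the sample data from the original question (quoted below): the input is month/day/two-digit year, separated by slashes, so the format string says exactly that.

```r
gs <- data.frame(ID = c("A1", "A2", "A3", "A4"),
                 date = c("09/27/03", "05/27/16", "01/25/13", "09/27/19"),
                 stringsAsFactors = FALSE)

## %m = month, %d = day, %y = two-digit year
gs$d1 <- as.Date(gs$date, format = "%m/%d/%y")
format(gs$d1)
## [1] "2003-09-27" "2016-05-27" "2013-01-25" "2019-09-27"
```

Input that does not follow the stated structure becomes NA, which makes it easy to spot.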



On Thu, Dec 19, 2019 at 10:31 AM Martin Maechler 
wrote:


>>>>> John Kane
>>>>> on Tue, 17 Dec 2019 20:28:17 -0500 writes:

> library(lubridate)
> gs$dat1  <-  mdy(gs$date)

there's really no reason for going beyond base R.

Using the proper format as per Patrick and Peter's advice
(below) is perfectly clear and actually
more robust (for the next data set etc)
than going via "good guessing" in extra packages.

> On Tue, 17 Dec 2019 at 18:38, peter dalgaard 
wrote:
>>
>> ...and switch the order, and use %y for 2-digit years.
>>
>> > On 17 Dec 2019, at 23:57 , Patrick (Malone Quantitative) <
mal...@malonequantitative.com> wrote:
>> >
>> > Try putting / instead of - in your format, to match the data.
>> >
>> > On Tue, Dec 17, 2019 at 5:52 PM Val  wrote:
>> >>
>> >> Hi All,
>> >>
>> >> I wanted to convert character date  mm/dd/yy  to yyyy-mm-dd
>> >> The sample data and my attempt is shown below
>> >>
>> >> gs <-read.table(text="ID date
>> >> A1   09/27/03
>> >> A2   05/27/16
>> >> A3   01/25/13
>> >> A4   09/27/19",header=TRUE,stringsAsFactors=F)
>> >>
>> >> Desired output
>> >>  ID date  d1
>> >> A1 09/27/03 2003-09-27
>> >> A2 05/27/16 2016-05-27
>> >> A3 01/25/13 2012-04-25
>> >> A4 09/27/19 2019-09-27
>> >>
>> >> I used this
>> >> gs$d1 = as.Date(as.character(gs$date), format = "%Y-%m-%d")
>> >>
>> >> but I got NA's.
>> >>
>> >> How do I get my desired result?
>> >> Thank you.
>> >>
>>
>> --
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Office: A 4.23
>> Email: pd@cbs.dk  Priv: pda...@gmail.com
>>

> --
> John Kane
> Kingston ON Canada




--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] class of 'try' if error is raised

2019-12-15 Thread Enrico Schumann
>>>>> "HW" == Hans W Borchers  writes:

  HW> I have been informed by CRAN administrators that the development
  HW> version of R issues warnings for my package(s). Some are easy to mend
  HW> (such as Internet links not working anymore), but this one I don't
  HW> know how to avoid:

  HW> Error in if (class(e) == "try-error") { : the condition has length > 1

  HW> I understand that `class` can return more than one value. But what
  HW> would be the appropriate way to catch an error in a construct like
  HW> this:

  HW> e <- try(b <- solve(a), silent=TRUE)
  HW> if (class(e) == "try-error") {
  HW> # ... do something
  HW> }

  HW> Should I instead compare the class with "matrix" or "array" (or
  HW> both)?. That is, in each case check with a correct result class
  HW> instead of an error?

  HW> Thanks, HW


You should probably use

   if (inherits(e, "try-error")) {
   # ... do something
   }
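
A short sketch of why the comparison fails and of two ways around it. In the development version (to become R 4.0.0), a matrix has class c("matrix", "array"), so when 'solve' succeeds, class(e) has length two. The singular matrix 'a' is made up to force an error.

```r
a <- matrix(0, 2, 2)   ## singular, so solve(a) fails

## variant 1: try() plus inherits()
e <- try(solve(a), silent = TRUE)
failed <- inherits(e, "try-error")
failed
## [1] TRUE

## variant 2: tryCatch() with an explicit fallback value
b <- tryCatch(solve(a), error = function(cond) NULL)
is.null(b)
## [1] TRUE
```

'inherits' works regardless of how many classes the object has, so it is safe for both the error and the regular result.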

kind regards
Enrico

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] NMOF 2.0-1 (Numerical Methods and Optimization in Finance)

2019-10-22 Thread Enrico Schumann

Dear all,

version 2.0-1 of package NMOF is on CRAN now.

NMOF stands for 'Numerical Methods and Optimization in Finance',
and it accompanies the book with the same name, written by
Manfred Gilli, Dietmar Maringer and Enrico Schumann.[1]

The new version of the package provides all R code and datasets
of the book's second edition, which has been published a few weeks ago.
All updates and new features are listed in the NEWS file.[2]
Sample materials of the book, on backtesting and on optimization
heuristics, are available as well.[3][4]

Kind regards
 Enrico

[1] http://enricoschumann.net/NMOF
[2] https://github.com/enricoschumann/NMOF/blob/master/NEWS
[3] optimisation heuristics: https://ssrn.com/abstract=3391756
[4] backtesting: https://ssrn.com/abstract=3374195



--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Strange behaviour of sapply function.

2019-09-12 Thread Enrico Schumann



Quoting bic...@math.usask.ca:


Here is are a few lines of my R session:


class(income)

[1] "integer"

class(sapply(1000*income-999,atv,sktaxb,sktax))

[1] "numeric"

class(sapply(1000*income-1001,atv,sktaxb,sktax))

[1] "list"

Although "income" is a numeric array, and sapply works as expected
returning an array (the function "atv" returns a single numeric argument),
if subtract a large enough number from the first argument, the sapply
function now wants to return a list?   Am I missing something?

I am running version 3.3.2 on Mac OS 10.9.9



You have not shown what 'income', 'atv', and so on are; so there is an
infinity of possible reasons why you get a list instead of a numeric vector.

One possible reason: what if 'atv' sometimes returns no value at all?

f <- function(x) x[x>0]
str(sapply(1:10, f))
## int [1:10] 1 2 3 4 5 6 7 8 9 10

str(sapply(-5:5, f))
## List of 11
##  $ : int(0)
##  $ : int(0)
##  $ : int(0)
##  $ : int(0)
##  $ : int(0)
##  $ : int(0)
##  $ : int 1
##  $ : int 2
##  $ : int 3
##  $ : int 4
##  $ : int 5
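
If the function is supposed to return exactly one number per input, 'vapply' can be used to enforce that; it fails with an error instead of silently returning a list. A sketch, reusing 'f' from above; the replacement function 'g' is only one possible fix.

```r
f <- function(x) x[x > 0]

ok <- vapply(1:10, f, integer(1))   ## fine: always one value

## with non-positive inputs, f returns nothing and vapply stops
res <- tryCatch(vapply(-5:5, f, integer(1)),
                error = function(e) conditionMessage(e))
is.character(res)                   ## an error message was caught
## [1] TRUE

## one possible fix: always return a value, NA when there is none
g <- function(x) if (x > 0) x else NA_integer_
vapply(-5:5, g, integer(1))
## [1] NA NA NA NA NA NA  1  2  3  4  5
```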

--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating a web site checker using R

2019-08-09 Thread Enrico Schumann
>>>>> "Chris" == Chris Evans  writes:

Chris> I use R a great deal but the huge web crawling power of
Chris> it isn't an area I've used. I don't want to reinvent a
Chris> cyberwheel and I suspect someone has done what I want.
Chris> That is a program that would run once a day (easy for
Chris> me to set up as a cron task) and would crawl a single
Chris> root of a web site (mine) and get the file size and a
Chris> CRC or some similar check value for each page as pulled
Chris> off the site (and, obviously, I'd want it not to follow
Chris> off site links). The other key thing would be for it to
Chris> store the values and URLs and be capable of being run
Chris> in "create/update database" mode or in "check pages"
Chris> mode and for the change mode run to Email me a warning
Chris> if a page changes.  The reason I want this is that two
Chris> of my sites have recently had content "disappear":
Chris> neither I nor the ISP can see what's happened and we
Chris> are lacking the very useful diagnostic of the date when
Chris> the change happened which might have mapped it some
Chris> component of WordPress, plugins or themes having
Chris> updated.

Chris> I am failing to find anything such and all the services
Chris> that offer site checking of this sort are prohibitively
Chris> expensive for me (my sites are zero income and either
Chris> personal or offering free utilities and information).

Chris> If anyone has done this, or something similar, I'd love
Chris> to hear if you were willing to share it.  Failing that,
Chris> I think I will have to create this but I know it will
Chris> take me days as this isn't my area of R expertise and
Chris> as, to be brutally honest, I'm a pretty poor
Chris> programmer.  If I go that way, I'm sure people may be
Chris> able to point me to things I may be (legitimately) able
Chris> to recycle in parts to help construct this.

Chris> Thanks in advance,

Chris> Chris

Chris> -- 
Chris> Chris Evans  Skype: chris-psyctc
Chris> Visiting Professor, University of Sheffield 

Chris> I do some consultation work for the University of Roehampton 
 and other places but this  
remains my main Email address.
Chris> I have "semigrated" to France, see: 
https://www.psyctc.org/pelerinage2016/semigrating-to-france/ if you want to 
book to talk, I am trying to keep that to Thursdays and my diary is now 
available at: https://www.psyctc.org/pelerinage2016/ecwd_calendar/calendar/
Chris> Beware: French time, generally an hour ahead of UK.  That page will 
also take you to my blog which started with earlier joys in France and Spain!

Not an answer, but perhaps two pointers/ideas:

1) Since you know cron, I suppose you work on a
   Unix-like system, and you likely have a programme
   called 'wget' either installed or can easily install
   it. 'wget' has an option 'mirror', which allows you
   to mirror a website.

2) There is tools::md5sum for computing checksums. You
   could store those to a file and check changes in the
   files content (e.g. via 'diff').
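
The second idea could be sketched as below. The helper functions 'checksums' and 'changed' are made up for illustration, and a temporary directory with two small files stands in for a mirrored copy of the website.

```r
## checksums of all files below a directory (names are the file paths)
checksums <- function(dir) {
    files <- list.files(dir, recursive = TRUE, full.names = TRUE)
    tools::md5sum(files)
}

## files that were modified, added or removed between two runs
changed <- function(old, new) {
    both <- intersect(names(old), names(new))
    c(both[old[both] != new[both]],
      setdiff(names(new), names(old)),
      setdiff(names(old), names(new)))
}

site <- tempfile("site_mirror")   ## stand-in for the mirrored site
dir.create(site)
writeLines("hello", file.path(site, "index.html"))
writeLines("about", file.path(site, "about.html"))

old <- checksums(site)            ## could be stored with saveRDS()
writeLines("hello, changed", file.path(site, "index.html"))
new <- checksums(site)

changed_files <- changed(old, new)
basename(changed_files)
## [1] "index.html"
```

In a daily cron job, the stored checksums would be read back with readRDS, compared with the fresh ones, and then overwritten.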


regards
Enrico
-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Open a file which name contains a tilde

2019-06-06 Thread Enrico Schumann



Quoting Frank Schwidom :


On 2019-06-05 20:32:07, Enrico Schumann wrote:

>>>>> "FS" == Frank Schwidom  writes:

FS> Hi,
FS> As I can see via path.expand a filename which contains a
FS> tilde anywhere gets automatically crippled.

FS> +> path.expand("a ~ b")
FS> [1] "a /home/user b"

FS> +> path.expand("a ~ b ~")
FS> [1] "a /home/user b /home/user"

FS> I want to open a file regardless whether its name contains
FS> any character unless 0.

FS> The unix filesystem allow the creation of such files, it
FS> should be possible to open these.

FS> How can I switch off any file crippling activity?

FS> Kind regards,
FS> Frank

Do you need 'path.expand'? For example,

readLines("~/Desktop/a ~ b")

reads just fine the content of a file named
'a ~ b' on my desktop.



Appendix:

I found out in the meantime that I can use 'R --no-readline' but I  
want to use readline and I found no possible readline configuration  
/etc/inputrc).


And maybe it works as Rscript.

But that should be more consistent because it is in fact very basic.

Kind regards,
Frank


You're right. I ran the example from Emacs/ESS.
There it worked, but only because ESS uses '--no-readline'
as a default (i.e. 'ess-R-readline' is set to nil).
With readline enabled, it fails (with R 3.6.0).


--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Open a file which name contains a tilde

2019-06-05 Thread Enrico Schumann
>>>>> "FS" == Frank Schwidom  writes:

FS> Hi,
FS> As I can see via path.expand a filename which contains a tilde anywhere
FS> gets automatically crippled.

FS> +> path.expand("a ~ b")
FS> [1] "a /home/user b"

FS> +> path.expand("a ~ b ~")
FS> [1] "a /home/user b /home/user"

FS> I want to open a file regardless whether its name contains any
FS> character unless 0.

FS> The unix filesystem allow the creation of such files, it should be
FS> possible to open these.

FS> How can I switch off any file crippling activity?

FS> Kind regards,
FS> Frank

Do you need 'path.expand'? For example,

    readLines("~/Desktop/a ~ b")

reads just fine the content of a file named
'a ~ b' on my desktop.


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to generate mutiple set of samples in R?

2019-06-05 Thread Enrico Schumann via R-help
>>>>> "MT" == Thevaraja, Mayooran  writes:

MT> Hello

MT> I am trying to generate samples from a bulk set of number for my
MT> research. So I need to get an output which contains various
MT> collection of samples, for example, sample1, sample2, sample3,
MT>  Does anyone suggest any ideas?

If you want people to help you, you need to provide
more information about what you want to do.

MT> [[alternative HTML version deleted]]

Please do not post in HTML, but in plain text.

MT> __
MT> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
MT> https://stat.ethz.ch/mailman/listinfo/r-help
MT> PLEASE do read the posting guide
MT> http://www.R-project.org/posting-guide.html

If you want people to help you, PLEASE DO read this
guide and follow its advice.

MT> and provide commented, minimal, self-contained, reproducible code.


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Complete month name from as.yearmon()

2019-04-13 Thread Enrico Schumann
>>>>> "CB" == Christofer Bogaso  writes:

CB> Hi,
CB> I am wondering if there is any way to get the full name from as.yearmon()
CB> function. Please consider below example:

CB> library(quantmod)
CB> as.yearmon(Sys.Date())

CB> This gives: [1] "Apr 2019".

CB> How can I extract the full name ie. 'April 2019'

CB> Appreciate your pointer. Thanks,


library("zoo")   ## where 'as.yearmon' comes from
format(as.yearmon(Sys.Date()), "%B %Y")
## [1] "April 2019"

Note that this will give you the full monthname in your
locale.
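
If a 'yearmon' object is not otherwise needed, the format method for plain Dates in base R accepts the same "%B %Y" specification. A sketch with a fixed date instead of Sys.Date(), so that the result is repeatable; the month name still depends on the locale, so no output is shown.

```r
d <- as.Date("2019-04-13")

## %B = full month name in the current locale, %Y = four-digit year
full <- format(d, "%B %Y")
full
```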


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] large number of scrollable histograms....

2019-01-22 Thread Enrico Schumann
>>>>> "akshay" == akshay kulkarni  writes:

akshay> dear members,

akshay> I am a day trader based in INDIA. I use R for my research.

akshay> I have about 200 vectors whose histograms I need to
akshay> inspect. I have to compare them simultaneously.

akshay> I know methods whereby you can plot multiple histograms on
akshay> one screen. However, you can clearly view only 4 to 5
akshay> histograms in one screen.

akshay> Is there a way to construct a long list of all the 100
akshay> histograms that can be scrollable (like you scroll up or
akshay> down the R console) both downwards and upwards? Any package
akshay> to that effect?

akshay> I would be highly grateful, also, if you can offer any
akshay> suggestions or "out of the box" ideas to simultaneously
akshay> compare all the 100 histograms.

akshay> very many thanks for your help and support..
akshay> yours sincerely,
akshay> AKSHAY M KULKARNI

Just two thoughts:

1) You could plot all histograms into one pdf and
   scroll the PDF.

2) Do you need histograms? Boxplots for instance need
   less space (and if you sort the input data by
   median, say, they often help much better to see
   differences between samples); or use similar plots such
   as quartile plots
   (e.g. https://cran.r-project.org/web/packages/NMOF/vignettes/qTableEx.pdf ).
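
Both thoughts could be sketched roughly as follows. The data
are simulated and the names 'samples' and 'pdf_file' are made
up for the illustration; with real data, 'samples' would be
the list of vectors to inspect.

```r
set.seed(1)
## simulated stand-in for the vectors to be compared
samples <- replicate(8, rnorm(100), simplify = FALSE)
names(samples) <- paste0("series", seq_along(samples))

pdf_file <- file.path(tempdir(), "histograms.pdf")
pdf(pdf_file, width = 7, height = 10)

## 1) four histograms per page; scroll through the pages
par(mfrow = c(4, 1))
for (nm in names(samples))
    hist(samples[[nm]], main = nm, xlab = "")

## 2) one page of boxplots, sorted by median
par(mfrow = c(1, 1))
med <- sapply(samples, median)
boxplot(samples[order(med)], horizontal = TRUE, las = 1)

dev.off()
```

With 200 vectors the boxplot page may need a taller device
(a larger 'height' in the call to pdf).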

kind regards
 Enrico


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] time mathematics

2018-11-20 Thread Enrico Schumann
On Tue, 20 Nov 2018, Knut Krueger writes:

> I have an dataframe from with a given time format:
>
> "23:01:19"
>
> to change some given data:
>
> x=data.frame("Y"=c(1:5),"TIME"=c("23:01:18","23:01:18","23:01:18","23:01:18","23:01:18"))
>
> I need to change  the time increasing in seconds
>
> x=data.frame("Y"=c(1:5),"TIME"=c("23:01:18","23:01:19","23:01:20","23:01:21","23:01:22"))
>
>
> Is it possible without any additional package ?
>
>
> Kind Regards Knut
>

Like so?

start <- "23:01:18"
Y <- 1:5
tmp <- as.POSIXct(paste(Sys.Date(), start))
tmp <- tmp + seq(from = 0, length.out = length(Y))
format(tmp, "%H:%M:%S")
## [1] "23:01:18" "23:01:19" "23:01:20" "23:01:21" "23:01:22"

data.frame(Y, TIME = format(tmp, "%H:%M:%S"))



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] sub/grep question: extract year

2018-08-09 Thread Enrico Schumann



Quoting Marc Girondot via R-help :


Hi everybody,

I have some questions about the way that sub is working. I hope that  
someone has the answer:


1/ Why the second example does not return an empty string ? There is  
no match.


subtext <- "-1980-"
sub(".*(1980).*", "\\1", subtext) # return 1980
sub(".*(1981).*", "\\1", subtext) # return -1980-


This is as documented in ?sub:
   "Elements of character vectors x which are not
substituted will be returned unchanged"

2/ Based on sub documentation, it replaces the first occurence of a  
pattern: why it does not return 1980 ?


subtext <- " 1980 1981 "
sub(".*(198[01]).*", "\\1", subtext) # return 1981


Because the pattern matches the whole string,
not just the year:

regexpr(".*(198[01]).*", subtext)
## [1] 1
## attr(,"match.length")
## [1] 11
## attr(,"useBytes")
## [1] TRUE

From this match, the RE engine will give you the last backreference-match,
which is "1981". If you want to _extract_ the first year, use a  
non-greedy RE instead:


sub(".*?(198[01]).*", "\\1", subtext)
## [1] "1980"

I say _extract_ because you may _replace_ the pattern, as expected:

sub("198[01]", "", subtext)
## [1] "  1981 "

That is because the pattern does not match the whole string.
Perhaps this example makes it clearer:

test <- "1 2 3 4 5"
sub("([0-9])", "\\1\\1", test)
## [1] "11 2 3 4 5"
sub(".*([0-9]).*", "\\1\\1", test)
## [1] "55"
sub(".*?([0-9]).*", "\\1\\1", test)
## [1] "11"




3/ I want extract year from text; I use:

subtext <- "bla 1980 bla"
sub(".*[ \\.\\(-]([12][01289][0-9][0-9])[ \\.\\)-].*", "\\1",  
subtext) # return 1980

subtext <- "bla 2010 bla"
sub(".*[ \\.\\(-]([12][01289][0-9][0-9])[ \\.\\)-].*", "\\1",  
subtext) # return 2010


but

subtext <- "bla 1010 bla"
sub(".*[ \\.\\(-]([12][01289][0-9][0-9])[ \\.\\)-].*", "\\1",  
subtext) # return 1010


I would like exclude the case 1010 and other like this.

The solution would be:

18[0-9][0-9] or 19[0-9][0-9] or 200[0-9] or 201[0-9]

Is there a solution to write such a pattern in grep ?


You answered this yourself, I think.
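
Spelled out, the alternatives can go into a single group,
separated by "|". The sketch below is one possible way to
write it, not the only one: it uses word boundaries ("\\b",
with perl = TRUE) in place of the explicit delimiter classes
of the original pattern, which is an assumption about what
should count as a delimiter.

```r
## 18xx, 19xx, 200x or 201x, as a whole "word"
year_re <- "\\b(1[89][0-9]{2}|20[01][0-9])\\b"

x <- c("bla 1980 bla", "bla 2010 bla", "bla 1010 bla")

m <- regexpr(year_re, x, perl = TRUE)
years <- rep(NA_character_, length(x))
years[m > 0] <- regmatches(x, m)   ## only elements with a match
years
## [1] "1980" "2010" NA
```

Using regexpr/regmatches instead of sub also takes care of
question 1: an element without a match gives NA instead of
coming back unchanged.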



Thanks a lot

Marc




--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] loop over matrix: subscript out of bounds

2018-08-06 Thread Enrico Schumann



Quoting Maija Sirkjärvi :


I have a basic for loop with a simple matrix. The code is doing what it is
supposed to do, but I'm still wondering the error "subscript out of
bounds". What would be a smoother way to code such a basic for loop?

myMatrix <- matrix(0,5,12)
for(i in 1:nrow(myMatrix)) {
  for(i in 1:ncol(myMatrix)) {
myMatrix[i,i] = -1
myMatrix[i,i+1] = 1
}}
print(myMatrix)

Thanks in advance!



Perhaps you do not need loops at all?

myMatrix <- matrix(0, 5, 12)
diag(myMatrix) <- -1
diag(myMatrix[, -1]) <- 1
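
As for the error: both loops in the original code use the same
index 'i', so the inner loop runs 'i' up to ncol(myMatrix),
which is 12, and myMatrix[6, 6] does not exist in a matrix with
5 rows. The first five passes succeed, which is why the result
looks right although the loop stops with an error. If a loop is
preferred, a single one over the rows would do; a small sketch,
with a check against the loop-free version:

```r
myMatrix <- matrix(0, 5, 12)
for (i in seq_len(nrow(myMatrix))) {
    myMatrix[i, i] <- -1
    if (i < ncol(myMatrix))        ## stay inside the matrix
        myMatrix[i, i + 1] <- 1
}

## same result as the version without loops
M2 <- matrix(0, 5, 12)
diag(M2) <- -1
diag(M2[, -1]) <- 1
identical(myMatrix, M2)
## [1] TRUE
```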

--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] dbGetQuery() returns wrong value

2018-07-31 Thread Enrico Schumann



Quoting Christofer Bogaso :


The data type is defined as bigint


Your query does not specify a number, but a string
(you single-quote the digits). Databases may do type conversion;
for instance, see the MySQL manual:  
https://dev.mysql.com/doc/refman/8.0/en/type-conversion.html


Since you send the query as a string anyway, there may be no need
for a "'string' within a string".

Please note that I have Cc'ed . Any follow-up
should probably go to that mailing list.
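
The two IDs in the thread are what one would expect if the
value passes through a double somewhere on the way: doubles
cannot represent every integer above 2^53. The thread does not
establish whether this happens in the database or in the
driver; the snippet only shows the effect in base R.

```r
id_asked    <- "72075186224672770"   ## ID in the query
id_returned <- "72075186224672768"   ## ID that came back

## as doubles, the two are the same number
same <- as.numeric(id_asked) == as.numeric(id_returned)
same
## [1] TRUE

## integers beyond this limit are no longer all representable
as.numeric(id_asked) > 2^53
## [1] TRUE
```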




On Mon, Jul 30, 2018 at 4:45 PM Eric Berger  wrote:


The ID matches in the first 16 characters.
How is your table  declared?


On Mon, Jul 30, 2018 at 2:00 PM, Christofer Bogaso <
bogaso.christo...@gmail.com> wrote:


Session Information for above error:

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
States.1252LC_MONETARY=English_United States.1252 LC_NUMERIC=C
 LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] csvread_1.2   bit64_0.9-7   bit_1.1-14RJDBC_0.2-7.1 rJava_0.9-10
DBI_1.0.0

loaded via a namespace (and not attached):
[1] compiler_3.5.0 tools_3.5.0

On Mon, Jul 30, 2018 at 2:27 PM Christofer Bogaso <
bogaso.christo...@gmail.com> wrote:

> Hi,
>
> I used following SQL query to fetch information from DB
>
> > dbGetQuery(Conn, "select ID from  where date = '2018-07-18' and ID =
> '72075186224672770' limit 10")
>  ID
> 1 72075186224672768
>
> As you see, it is returning a different result from what actual query
> string contains.
>
> However when I used the same query in some other SQL client, I get the
> expected result as:
>
> 72075186224672770
>
> Any idea on what went wrong in R supplied query would be highly
> appreciated.
>






--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Zoo changing time-zone when I merge 2 zoo time series

2018-07-19 Thread Enrico Schumann
On Mon, 09 Jul 2018, Christofer Bogaso writes:

> Hi,
>
> Below is my code :
>
> library(zoo)
> Dat1 = structure(c(17890, 17770.01, 17600, 17593, 17630.01), index =
> structure(c(1512664740,
> 1512664800, 1512664860, 1512664920, 1512664980), class = c("POSIXct",
> "POSIXt"), tzone = "America/Los_Angeles"), class = "zoo")
> Dat2 = structure(c(15804.28, 15720.61, 15770, 15750, 15770), index =
> structure(c(1512664740,
> 1512664800, 1512664860, 1512664920, 1512664980), class = c("POSIXct",
> "POSIXt"), tzone = "America/Los_Angeles"), class = "zoo")
>
> merge(Dat1, Dat2)
>
> Dat1 Dat2
> 2017-12-07 22:09:00 17890.00 15804.28
> 2017-12-07 22:10:00 17770.01 15720.61
> 2017-12-07 22:11:00 17600.00 15770.00
> 2017-12-07 22:12:00 17593.00 15750.00
> 2017-12-07 22:13:00 17630.01 15770.00
>
>
> So, after merging the TZ of the original series got changed.
>
> Appreciate if someone points what went wrong
>

Nothing went wrong. Only 'merge.zoo' drops the
time-zone attribute.  But note that it did not change
the actual times:

  unclass(index(Dat1))
  ## [1] 1512664740 1512664800 1512664860 1512664920 1512664980
  ## attr(,"tzone")
  ## [1] "America/Los_Angeles"

  unclass(index(merge(Dat1, Dat2)))
  ## [1] 1512664740 1512664800 1512664860 1512664920 1512664980

  all(unclass(index(Dat1)) == unclass(index(merge(Dat1, Dat2))))
  ## [1] TRUE

  M <- merge(Dat1, Dat2)
  attr(index(M), "tzone") <- attr(index(Dat1), "tzone")
  M
  ## Dat1 Dat2
  ## 2017-12-07 08:39:00 17890.00 15804.28
  ## 2017-12-07 08:40:00 17770.01 15720.61
  ## 2017-12-07 08:41:00 17600.00 15770.00
  ## 2017-12-07 08:42:00 17593.00 15750.00
  ## 2017-12-07 08:43:00 17630.01 15770.00


See Ripley, B. D. and Hornik, K. (2001) Date-time
classes. R News, 1/2,
8–11. https://www.r-project.org/doc/Rnews/Rnews_2001-2.pdf
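
The same point can be seen without zoo: for a POSIXct value,
the "tzone" attribute only affects how the time is printed,
not which instant it stands for. A small base-R illustration,
using the first timestamp of the series:

```r
t1 <- as.POSIXct(1512664740, origin = "1970-01-01",
                 tz = "America/Los_Angeles")
t2 <- t1
attr(t2, "tzone") <- "UTC"     ## change only the display time zone

format(t1, "%Y-%m-%d %H:%M:%S")
## [1] "2017-12-07 08:39:00"
format(t2, "%Y-%m-%d %H:%M:%S")
## [1] "2017-12-07 16:39:00"

as.numeric(t1) == as.numeric(t2)   ## still the same instant
## [1] TRUE
```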



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] bracketing for optimize

2018-05-31 Thread Enrico Schumann
On Thu, 31 May 2018, ivo welch writes:

> dear R wizards:  `optimize()` requires the user to provide the
> brackets.  I can write a bracketing routine, given a function and a
> starting point, but I was wondering whether there was already a
> "standard" user-exposed implementation.  (Presumably, this is used in
> nlm, too; alas, nlm is in C, not native R.)  regards, /iaw
>

There is a 'bracketing' function in package NMOF,
though it is for root-finding (i.e. for optimising you
would need the derivative).
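
For minimisation, a bracket can also be found by expanding
steps in the downhill direction until the function value goes
up again. The sketch below is neither the NMOF function nor
what optimize or nlm do internally; 'bracket_min' and its
arguments are made up for the illustration, and it assumes a
function of one variable that eventually increases in the
downhill direction.

```r
bracket_min <- function(f, x0, step = 1, grow = 2, max_iter = 50) {
    a <- x0
    b <- x0 + step
    fa <- f(a)
    fb <- f(b)
    if (fb > fa) {                 ## uphill: turn around
        tmp <- a;  a <- b;   b <- tmp
        tmp <- fa; fa <- fb; fb <- tmp
        step <- -step
    }
    for (i in seq_len(max_iter)) {
        step <- step * grow        ## take larger and larger steps
        nxt <- b + step
        fnxt <- f(nxt)
        if (fnxt >= fb)            ## value went up: minimum is bracketed
            return(sort(c(a, nxt)))
        a <- b;   fa <- fb
        b <- nxt; fb <- fnxt
    }
    stop("no bracket found")
}

f <- function(x) (x - 10)^2
br <- bracket_min(f, x0 = 0)
br
## [1]  3 15
res <- optimize(f, interval = br)
```

The bracket can be wide, since the step size doubles in each
pass; optimize then narrows it down.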

-- 
Enrico Schumann (the maintainer of NMOF)
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] command line fails

2018-02-02 Thread Enrico Schumann
On Fri, 02 Feb 2018, Michael Ashton writes:

> Where can I get the patch? I've never installed a patch for R...usually just 
> upgrade.
>
> Michael Ashton, CFA
> Managing Principal
>
> Enduring Investments LLC
> W: 973.457.4602
> C: 551.655.8006
>

The patched build (i.e. a complete version, not just
the patch) is available from the same CRAN site at
which you find the official release; just further
below.


> -----Original Message-
> From: Enrico Schumann [mailto:e...@enricoschumann.net] 
> Sent: Friday, February 02, 2018 10:36 AM
> To: Michael Ashton
> Cc: Duncan Murdoch; r-help@r-project.org
> Subject: Re: [R] command line fails
>
>
> Quoting Michael Ashton <m.ash...@enduringinvestments.com>:
>
>> Fascinating. The script runs fine in 3.2.5, but won't run in 3.4.3 
>> even with ALL lines commented out.
>>
>> I have no idea what that means. I can't imagine I found a 3.4.3 bug no 
>> one knows about.
>>
>> Michael Ashton, CFA
>> Managing Principal
>>
>> Enduring Investments LLC
>> W: 973.457.4602
>> C: 551.655.8006
>
> Just a guess: Could you try the patched version?
>
> There was some discussion concerning 3.4.3 and command-line arguments on 
> Windows:
>
> https://stat.ethz.ch/pipermail/r-devel/2017-December/075194.html
>
> Kind regards
>  Enrico
>
>
>
>>
>>
>> -Original Message-
>> From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
>> Sent: Friday, February 02, 2018 9:03 AM
>> To: Michael Ashton; r-help@r-project.org
>> Subject: Re: [R] command line fails
>>
>> On 02/02/2018 8:20 AM, Michael Ashton wrote:
>>> I don't think it's the path or the slashes. I run other files in this 
>>> same manner, with the same path to the script itself, and they go off 
>>> without a hitch. Although this is the first time I am using 3.4.3, 
>>> and the only script I am using that version of R for at the moment.
>>>
>>> Having said that, I did TRY reversing the slashes and got the same 
>>> result. :-)
>>>
>>
>> I'd try to determine if anything works with 3.4.3.  If nothing does, 
>> maybe you need to back out to the older version.  If some scripts work 
>> and some don't, then it shouldn't take long to find the offending line 
>> by bisection:  comment out the last half of the script, if it works, 
>> that's where the problem is, so comment only the last quarter, etc.
>>
>> Duncan Murdoch
>>
>>
>>> Michael Ashton, CFA
>>> Managing Principal
>>>
>>> Enduring Investments LLC
>>> W: 973.457.4602
>>> C: 551.655.8006
>>>
>>>
>>> -Original Message-
>>> From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
>>> Sent: Friday, February 02, 2018 8:16 AM
>>> To: Michael Ashton; r-help@r-project.org
>>> Subject: Re: [R] command line fails
>>>
>>> On 02/02/2018 7:52 AM, Michael Ashton wrote:
>>>> Hi - Think this is quick help. Not sure how to trap what is causing 
>>>> my simple script to run fine in R, but fail immediately when called 
>>>> from rscript. I can put all sorts of traps in the script itself, but 
>>>> when called from the command line the r window simply flashes and 
>>>> closes.
>>>>
>>>> There's probably a way to find out why rscript is failing, but I 
>>>> don't know it and can't seem to find it online. To be clear, I'm not 
>>>> really trying to save the OUTPUT of the file...it never even 
>>>> executes as far as I can tell. I'm calling it with C:\Program 
>>>> Files\R\R-3.4.3\bin\Rscript.exe "P:\Investments\Trading Tools\RV 
>>>> Tools\myfile.r" And again, it executes perfectly if I open the GUI 
>>>> first and then run it within R.
>>>>
>>>
>>> I'd try using forward slashes in the path, i.e.  
>>> "P:/Investments/Trading Tools/RV Tools/myfile.r"  I don't remember if 
>>> R processes the path to the script or whether it's done entirely by 
>>> the shell, but they shouldn't hurt.
>>>
>>> Spaces in file paths sometimes cause trouble.  If you put the script 
>>> in a path with no spaces does that help?  If so, you can probably 
>>> escape that space, but I can't remember what the escape sequence is.  
>>> (Escapes in Windows can be processed by the command
>>> shell or Rscript.exe or both, so it's hard to get them right.)   
>>> Another alternative might be to change directory to that path and 
>>> the

Re: [R] command line fails

2018-02-02 Thread Enrico Schumann


Quoting Michael Ashton <m.ash...@enduringinvestments.com>:

Fascinating. The script runs fine in 3.2.5, but won't run in 3.4.3  
even with ALL lines commented out.


I have no idea what that means. I can't imagine I found a 3.4.3 bug  
no one knows about.


Michael Ashton, CFA
Managing Principal

Enduring Investments LLC
W: 973.457.4602
C: 551.655.8006


Just a guess: Could you try the patched version?

There was some discussion concerning 3.4.3 and command-line
arguments on Windows:

https://stat.ethz.ch/pipermail/r-devel/2017-December/075194.html

Kind regards
Enrico






-Original Message-
From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
Sent: Friday, February 02, 2018 9:03 AM
To: Michael Ashton; r-help@r-project.org
Subject: Re: [R] command line fails

On 02/02/2018 8:20 AM, Michael Ashton wrote:
I don't think it's the path or the slashes. I run other files in  
this same manner, with the same path to the script itself, and they  
go off without a hitch. Although this is the first time I am using  
3.4.3, and the only script I am using that version of R for at the  
moment.


Having said that, I did TRY reversing the slashes and got the same
result. :-)



I'd try to determine if anything works with 3.4.3.  If nothing does,  
maybe you need to back out to the older version.  If some scripts  
work and some don't, then it shouldn't take long to find the  
offending line by bisection:  comment out the last half of the  
script, if it works, that's where the problem is, so comment only  
the last quarter, etc.


Duncan Murdoch



Michael Ashton, CFA
Managing Principal

Enduring Investments LLC
W: 973.457.4602
C: 551.655.8006


-Original Message-
From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
Sent: Friday, February 02, 2018 8:16 AM
To: Michael Ashton; r-help@r-project.org
Subject: Re: [R] command line fails

On 02/02/2018 7:52 AM, Michael Ashton wrote:
Hi - Think this is quick help. Not sure how to trap what is  
causing my simple script to run fine in R, but fail immediately  
when called from rscript. I can put all sorts of traps in the  
script itself, but when called from the command line the r window  
simply flashes and closes.


There's probably a way to find out why rscript is failing, but I  
don't know it and can't seem to find it online. To be clear, I'm  
not really trying to save the OUTPUT of the file...it never even  
executes as far as I can tell. I'm calling it with C:\Program  
Files\R\R-3.4.3\bin\Rscript.exe "P:\Investments\Trading Tools\RV  
Tools\myfile.r" And again, it executes perfectly if I open the GUI  
first and then run it within R.




I'd try using forward slashes in the path, i.e.  
"P:/Investments/Trading Tools/RV Tools/myfile.r"  I don't remember  
if R processes the path to the script or whether it's done entirely  
by the shell, but they shouldn't hurt.


Spaces in file paths sometimes cause trouble.  If you put the  
script in a path with no spaces does that help?  If so, you can  
probably escape that space, but I can't remember what the escape  
sequence is.  (Escapes in Windows can be processed by the command  
shell or Rscript.exe or both, so it's hard to get them right.)   
Another alternative might be to change directory to that path and  
then use a relative path for the R script.


Duncan Murdoch






--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] How to programmatically save a web-page using R (mimicking Command+S)

2018-01-14 Thread Enrico Schumann
On Sat, 06 Jan 2018, Christofer Bogaso writes:

> Hi,
>
> I would appreciate if someone can give me a pointer on how to save a
> webpage programmatically using R.
>
> For example, let say I have this webpage open in my browser:
>
> http://www.bseindia.com/stock-share-price/dabur-india-ltd/dabur/500096/
>
> When manually I save this page, I just press Command+S (using Mac) and
> then this page get saved in hard-disk
>
> Now I want R to mimic this same job that I do using Command-S
>
> So far I have tried with readLines() however the output content is
> different than what I could achieve using Command+S
>
> Any help will be highly appreciated.
>
> Thanks for your time.
>

The command-line utility 'wget' can download websites,
including graphics, etc. Look for 'mirror' in its
documentation if you want to download the complete
site. It is usually available by default on Unix-style
systems; I am sure there is a version for Mac. If you
insist on using R, you could write a simple wrapper,
using ?system or ?system2.
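
Such a wrapper might look like the sketch below. The function
names 'wget_args' and 'save_page' are made up; the wget
options are the long forms of -p, -k and -P. Whether the
result matches what the browser saves depends on the page: a
page that is assembled by JavaScript may still differ.

```r
## build the argument vector for wget
wget_args <- function(url, dir = ".") {
    c("--page-requisites",          ## also fetch images, CSS, ...
      "--convert-links",            ## make the links work offline
      paste0("--directory-prefix=", shQuote(dir)),
      shQuote(url))
}

save_page <- function(url, dir = ".") {
    if (!nzchar(Sys.which("wget")))
        stop("'wget' not found on the search path")
    system2("wget", args = wget_args(url, dir))
}

## not run: needs 'wget' and network access
## save_page("http://www.bseindia.com/stock-share-price/dabur-india-ltd/dabur/500096/",
##           dir = "dabur")
```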


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Generating help files for a function

2017-12-16 Thread Enrico Schumann
On Sat, 16 Dec 2017, Erin Hodgess writes:

> Hello everyone!
>
> I'm in the process of writing a package, and I'm using the lovely "R
> Package" book as a guideline.
>
> However, in the midst of my work,  I discovered that I had omitted a
> function and am now putting in it the package.  Not a problem.  But the
> problem is the help file.  What is the best way to generate a help file
> "after the fact" like that, please?
>
> Thank you in advance.  Hope everyone is enjoying various holidays.
>
> Sincerely,
> Erin

see ?prompt 
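
A short illustration, with a made-up function 'myfun' standing
in for the one that was forgotten. With filename = NA, prompt
returns the skeleton as a list instead of writing a file, which
is handy for a first look:

```r
## hypothetical function that was added to the package late
myfun <- function(x, na.rm = FALSE) mean(x, na.rm = na.rm)

## filename = NA: return the Rd skeleton as a list
rd <- prompt(myfun, filename = NA)
names(rd)

## in the package's source directory, one would rather write
## the file and then edit it by hand:
## prompt(myfun, filename = file.path("man", "myfun.Rd"))
```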

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



[R] [R-pkgs] NMOF 1.2-2 (Numerical Methods and Optimization in Finance)

2017-10-24 Thread Enrico Schumann

Dear all,

version 1.2-2 of package NMOF is on CRAN now.

NMOF stands for 'Numerical Methods and Optimization
in Finance'. The package provides R code and datasets
for the book with the same name, written by Manfred
Gilli, Dietmar Maringer and Enrico Schumann, published
by Elsevier/Academic Press in 2011.

The package has finally crossed the 1.0 line: It is
10 years since the development of NMOF began, and many
of the functions -- notably those for optimization --
have been in continuous use since then. That implies
a certain maturity, and so it was time to upgrade
the version to 1.0 (and beyond already).

Since my last announcement on this list [1], a number
of functions have been added to the package:
'SAopt' (Simulated Annealing), 'CPPIgap' (portfolio
insurance), 'minvar' (computation of minimum-variance
portfolios), and more. See the NEWS file [2] and the
ChangeLog [3] for all details.

Many of the new functions are described, with
examples, in the Manual [4].


Kind regards
Enrico


[1] https://stat.ethz.ch/pipermail/r-packages/2016/001510.html
[2] https://github.com/enricoschumann/NMOF/blob/master/NEWS
[3] https://github.com/enricoschumann/NMOF/blob/master/ChangeLog
[4] http://enricoschumann.net/NMOF.htm#NMOFmanual


--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Select part of character row name in a data frame

2017-10-19 Thread Enrico Schumann


Quoting Francesca PANCOTTO <f.panco...@unimore.it>:


Dear R contributors,

I have a problem in selecting in an efficient way, rows of a data  
frame according to a condition,

which is a part of a row name of the table.

The data frame is made of 64 rows and 2 columns, but the row names  
are very long but I need to select them according to a small part of  
it and perform calculations on the subsets.


This is the example:

                                          X       Y
"Unique to strat  "                  0.0482   28.39
"Unique to crt.dummy"                0.0441   25.92
"Unique to gender   "                0.0159    9.36
"Unique to age   "                   0.0839   49.37
"Unique to gg_right1  "              0.0019    1.10
"Unique to strat:crt.dummy "         0.0689   40.54
"Common to strat, and crt.dummy "   -0.0392  -23.09
"Common to strat, and gender "      -0.0031   -1.84
"Common to crt.dummy, and gender "   0.0038    2.21
"Common to strat, and age "          0.0072    4.21


X and Y are the two columns of variables, while “Unique to strat”,  
are the row names. I am interested to select for example those rows
whose name contains “strat” only. It would be very easy if these  
names were simple, but they are not and involve also spaces.
I tried with select matches from dplyr but works for column names  
but I did not find how to use it on row names, which are of course  
character values.


Thanks for any help you can provide.

--
Francesca Pancotto, PhD



Use ?grep or ?grepl:

df[grep("strat", row.names(df)), ]

(in which 'df' is your data frame)
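
A small self-contained illustration, with a few of the rows
from the example typed in by hand. Whether "contains strat
only" means any row with "strat" in its name or just the row
"Unique to strat" is not clear from the question, so both
readings are shown.

```r
df <- data.frame(X = c(0.0482, 0.0441, 0.0159, -0.0392),
                 Y = c(28.39, 25.92, 9.36, -23.09),
                 row.names = c("Unique to strat  ",
                               "Unique to crt.dummy",
                               "Unique to gender   ",
                               "Common to strat, and crt.dummy "))

## all rows whose name contains "strat"
sel <- df[grepl("strat", row.names(df), fixed = TRUE), ]
nrow(sel)
## [1] 2
colSums(sel)     ## calculations on the subset

## only the row named "Unique to strat", ignoring the padding
only <- df[trimws(row.names(df)) == "Unique to strat", ]
```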


--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net


Re: [ESS] [R] M-x R gives no choice of starting dir

2017-09-11 Thread Enrico Schumann
On Mon, 11 Sep 2017, Christian writes:

> Hi,
>
> I experienced a sudden change in the behavior of M-x R in not giving
> me the choice where to start R. May be that I botched my
> preferences. I am using Aquamacs 3.3 on MacOS 10.12.6
>
> Christian

I suppose you are using ESS? There is a variable called
'ess-ask-for-ess-directory', which controls whether
M-x R prompts for a directory. Perhaps you have set
this to nil?

I also Cc the ESS-help mailing list and suggest that
follow-up be sent there.


Kind regards
     Enrico

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
ESS-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/ess-help


Re: [R] Packages for Learning Algorithm Independent Branch and Bound for Feature Selection

2017-07-03 Thread Enrico Schumann


Quoting Alex Byrley <anbyr...@buffalo.edu>:


See, I have built my own genetic algorithm already and tested it on this
problem. I have a solution, but due to the heuristic nature of GA, I cannot
guarantee that it is the optimal subset.

If I was simply doing this for a company project, you are spot on with the
type of algorithm I would use, but I am doing this for a scientific paper.
I need to be able to find the optimal subset over my dataset, and I know
branch and bound will find it without resorting to exhaustive search. If I
can't claim that my subset is optimal, it is going to open my paper up to
serious enough criticism that it will get rejected, regardless of whether
my new method outperforms the state-of-the-art or not. (And regardless if
my dataset is representative enough to make such performance claims)


Just a comment:

A heuristic such as a Genetic Algorithm does indeed not guarantee the
optimum [*], but will only provide a stochastic approximation. This
approximation will get better if the algorithm gets more computing time;
but still, the solution will remain random.

However, people who read scientific papers can deal with such randomness,
as long as it is properly explained and reported. (Otherwise, no results
that relied on stochastic methods, such as MCMC, would ever have
been published.)

Some years ago, we wrote a short article on this topic (though on a
financial application):
https://link.springer.com/article/10.1007/s10732-010-9138-y

Kind regards (and good luck)
Enrico Schumann


[*] Which does not mean it has not found it.



I will take a look at that page, thanks! Hopefully there is an R
implementation of generic B&B as I described out there somewhere...

Alex Byrley
Graduate Student
Department of Electrical Engineering
235 Davis Hall
(716) 341-1802

2017-07-01 3:53 GMT-04:00 Enrico Schumann <e...@enricoschumann.net>:


On Thu, 29 Jun 2017, Alex Byrley writes:

> I am looking for packages that can run a branch-and-bound algorithm to
> maximize a distance measure (such as Bhattacharyya or Mahalanobis) on a set
> of features.
>
> I would like this to be learning algorithm independent, so that the method
> just looks at the features, and selects the subset of a user-defined size
> that maximizes a distance criteria such as those stated above.
>
> Can anyone give some suggestions?
>
> Alex Byrley
> Graduate Student
> Department of Electrical Engineering
> 235 Davis Hall
> (716) 341-1802
>

It seems you are looking for a generic optimisation
algorithm; so perhaps start at the task view:
https://cran.r-project.org/web/views/Optimization.html

What you describe is a combinatorial problem: select k
from N features, with k (much) smaller than N. So I'd
suggest to also look at heuristic algorithms that can
deal with such problems (e.g. genetic algorithms).


--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net





--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Packages for Learning Algorithm Independent Branch and Bound for Feature Selection

2017-07-01 Thread Enrico Schumann
On Thu, 29 Jun 2017, Alex Byrley writes:

> I am looking for packages that can run a branch-and-bound algorithm to
> maximize a distance measure (such as Bhattacharyya or Mahalanobis) on a set
> of features.
>
> I would like this to be learning algorithm independent, so that the method
> just looks at the features, and selects the subset of a user-defined size
> that maximizes a distance criteria such as those stated above.
>
> Can anyone give some suggestions?
>
> Alex Byrley
> Graduate Student
> Department of Electrical Engineering
> 235 Davis Hall
> (716) 341-1802
>

It seems you are looking for a generic optimisation
algorithm; so perhaps start at the task view:
https://cran.r-project.org/web/views/Optimization.html

What you describe is a combinatorial problem: select k
from N features, with k (much) smaller than N. So I'd
suggest to also look at heuristic algorithms that can
deal with such problems (e.g. genetic algorithms).


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Can I use tabu search for minimization problem ?

2017-06-20 Thread Enrico Schumann


Quoting Mars Xu <xujiao.myc...@gmail.com>:


Hi all,
I want to use tabu search to solve my minimization problem, but
tabu search in R is for maximization, so I turned my function from f
to -f. But eUtilityKeep is always 0 from the second position on. I
have gone through part of the source code and found that it always
gives the default value to compare,


move <- ifelse(maxTaboo > maxNontaboo & maxTaboo > aspiration,
               ifelse(length(which(neighboursEUtility == maxTaboo)) == 1,
                      which(neighboursEUtility == maxTaboo),
                      sample(which(neighboursEUtility == maxTaboo), 1)),
               ifelse(length(which(neighboursEUtility == maxNontaboo & tabuList == 0)) == 1,
                      which(neighboursEUtility == maxNontaboo & tabuList == 0),
                      sample(which(neighboursEUtility == maxNontaboo & tabuList == 0), 1)))


this cause the 0 value.

How can I use it to get my minimization value using tabu search in R ?

Thanks .


If you want people to help you, provide a minimal (or, at least, small)
reproducible code example. In particular, tell people what package(s)
you are using.


--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net


Re: [R] regular expression help

2017-06-08 Thread Enrico Schumann


Quoting Ashim Kapoor <ashimkap...@gmail.com>:


Dear All,

My query is:

Do we always need to use perl = TRUE option when doing ignore.case=TRUE?

A small example :

my_text =
"RECOVERY OFFICER-II\nDEBTS RECOVERY TRIBUNAL-III\n  RC No. 162/2015\nSBI
VS RAMESH GUPTA.\nDated: 01.03.2016   Item no.01\n
Present:   Ms. Sonakshi, the proxy counsel for Ms. Usha Singh, the counsel
for ARCIL.\nNone for the CDs.\n  The counsel for the CHFI
submitted that the matter has been assigned to ARCIL and deed of
assignment, application for substituting the name and vakalatnama has been
filed vide diary no. 1454 dated 08.02.2016\nIn the application it has been
prayed that ARCIL may be substituted in place of SBI for the purpose of
further proceedings in the matter. Request allowed.\nThe proxy counsel for
CHFI further requested to issue demand notice thereby mentioning the name
of ARCIL. Request allowed.\nRegistry is directed to issue fresh demand
notice mentioning the name of ARCIL.\nCHFI is directed to file status of
the mortgaged property as well as other assets of the CDs.\nList the case
on 28.03.2016.\n  (SUJEET KUMAR)\nRECOVERY OFFICER-II."

My regular expression is:

parties_present_start_1=
regexpr("\n.*Present.*\n.*\n",my_text,ignore.case=TRUE,perl=T)

parties_present_start_2=
regexpr("\n.*Present.*\n.*\n",my_text,ignore.case=TRUE)


parties_present_start_1

[1] 138
attr(,"match.length")
[1] 123
attr(,"useBytes")
[1] TRUE

parties_present_start_2

[1] 20
attr(,"match.length")
[1] 949
attr(,"useBytes")
[1] TRUE




Why do I see the correct result only in the first case?

Best Regards,
Ashim



With perl = TRUE, '.' matches any character except a newline.

With R's default regular expressions, '.' matches any character,
including a newline.

  test <- "hello\n1"
  regexpr(".*[0-9]", test)
  ## [1] 1
  ## attr(,"match.length")
  ## [1] 7
  ## attr(,"useBytes")
  ## [1] TRUE

  regexpr(".*[0-9]", test, perl = TRUE)
  ## [1] 7
  ## attr(,"match.length")
  ## [1] 1
  ## attr(,"useBytes")
  ## [1] TRUE
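
If the behaviour should not depend on the engine, the pattern itself
can say what is meant. A small sketch, using the same test string: a
negated character class excludes the newline under the default engine,
and the inline flag (?s) lets '.' match a newline under perl = TRUE.

```r
test <- "hello\n1"

## default engine, but newlines explicitly excluded
m1 <- regexpr("[^\n]*[0-9]", test)
as.integer(m1)             # 7
attr(m1, "match.length")   # 1

## perl = TRUE, but '.' allowed to match a newline
m2 <- regexpr("(?s).*[0-9]", test, perl = TRUE)
as.integer(m2)             # 1
attr(m2, "match.length")   # 7
```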


--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Setting up a .Rprofile file

2017-03-24 Thread Enrico Schumann
On Fri, 24 Mar 2017, Bruce Ratner PhD writes:

> Henrico:
> Thanks for quick reply.
> However, one last question:
> If I want to change working directory, and put setwd() in the Rprofile
> file, logically R will not know where the working directory is,
> correct?
>
> So, should I install R in my preferred working directory?
>
> Thanks again, in advance. 
> Bruce
>

Hm, I think I don't understand what you mean here. Where R
is installed and where it is run are (usually) not the same
places.

Let $ be a shell prompt, and let > be the R
prompt. Then:

  $ cd /tmp/
  $ R -q
  > getwd()
  [1] "/tmp"
  > q("no")
  $ cd ~/Documents/
  $ R -q
  > getwd()
  [1] "~/Documents"

Now I write into my "~/.Rprofile" file:

  setwd("~/Downloads")

  $ cd /tmp/
  $ R -q
  > getwd()
  [1] "~/Downloads"


But maybe I am completely misunderstanding what you
mean.

Kind regards
 Enrico


>
>
>> On Mar 24, 2017, at 3:48 AM, Enrico Schumann <e...@enricoschumann.net> wrote:
>> 
>> On Thu, 23 Mar 2017, Bruce Ratner PhD writes:
>> 
>>> Hi R'ers:
>>> I would like to setting up a .Rprofile file with
>>> setwd("C:/R_WorkDir")
>>> set.seed(12345)
>>> options (prompt "> R ")
>>> 
>>> ---
>>> Can you help providing the code or instructive link,
>>> I've find many links, but I can't figure it out?
>>> 
>>> Thanks.
>>> Bruce 
>>> 
>> 
>> Quoting from ?Startup:
>> 
>> ,
>> | [...] unless ‘--no-init-file’ was given, R searches
>> | for a user profile, a file of R code.  The path of
>> | this file can be specified by the ‘R_PROFILE_USER’
>> | environment variable (and tilde expansion will be
>> | performed).  If this is unset, a file called
>> | ‘.Rprofile’ is searched for in the current directory
>> | or in the user's home directory (in that order).  The
>> | user profile file is sourced into the workspace.
>> `
>> 

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net


Re: [R] Setting up a .Rprofile file

2017-03-24 Thread Enrico Schumann
On Thu, 23 Mar 2017, Bruce Ratner PhD writes:

> Hi R'ers:
> I would like to setting up a .Rprofile file with
> setwd("C:/R_WorkDir")
> set.seed(12345)
> options (prompt "> R ")
>
> ---
> Can you help providing the code or instructive link,
> I've find many links, but I can't figure it out?
>
> Thanks.
> Bruce 
>

Quoting from ?Startup:

,
| [...] unless ‘--no-init-file’ was given, R searches
| for a user profile, a file of R code.  The path of
| this file can be specified by the ‘R_PROFILE_USER’
| environment variable (and tilde expansion will be
| performed).  If this is unset, a file called
| ‘.Rprofile’ is searched for in the current directory
| or in the user's home directory (in that order).  The
| user profile file is sourced into the workspace.
`
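
As a sketch only, a ".Rprofile" with the three settings from the
question might look as follows. The directory and the prompt string
are the ones given in the question; the check with dir.exists() is an
addition, so that a missing directory does not cause an error at
startup.

```r
## contents of the file .Rprofile
wd <- "C:/R_WorkDir"
if (dir.exists(wd))   # only change directory if it actually exists
    setwd(wd)

set.seed(12345)

options(prompt = "> R ")   # note the '=' between name and value
```

The file is plain R code, so it can be tried out with source() before
it is put into the home directory.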

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net


Re: [R] Data and Variables from Tables

2017-03-21 Thread Enrico Schumann
On Tue, 21 Mar 2017, Shawn Way writes:

> I have an org-mode table with the following structure
> that I am pulling into an R data.frame, using the
> sfsmisc package and using xtable to print in org-mode
>
> | Symbol | Value | Units          |
> |--------+-------+----------------|
> | A      |     1 | kg/hr          |
> | \beta  |     2 | \frac{m^3}{hr} |
> | G      |   .25 | in             |
>
> This all works well and looks great.
>
> What I am trying to do is use this to generate
> variables for programming as well.  For example, when
> processed I would have the following variables:
>
> A <- 1
> beta <- 2
> G <- .25
>
> Has anyone done something like this or can someone
> point me in the right direction to do this?
>
> Shawn Way
>

You may be looking for ?assign.

  df <- data.frame(Symbol = c("A", "\\beta",  "G"),
                   Value  = c(  1,        2, 0.25))

  ## remove backslashes
  df[["Symbol"]] <- gsub("\\", "", df[["Symbol"]], fixed = TRUE)

  for (i in seq_len(nrow(df)))
      assign(df[i, "Symbol"], df[i, "Value"])
  
But depending on what you want to do, it may be
cleaner/safer to keep the variables from the table
together in a list.

tbl <- as.list(df[["Value"]])
names(tbl) <- df[["Symbol"]]

## $A
## [1] 1
## 
## $beta
## [1] 2
## 
## $G
## [1] 0.25
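
A short sketch of how such a list can then be used in computations.
The expression is made up for illustration; note that inside with(),
'beta' refers to the list element and not to the base function of the
same name.

```r
df <- data.frame(Symbol = c("A", "beta", "G"),
                 Value  = c(1, 2, 0.25),
                 stringsAsFactors = FALSE)

tbl <- as.list(df[["Value"]])
names(tbl) <- df[["Symbol"]]

tbl$beta                        # access a single value
res <- with(tbl, A * beta + G)  # evaluate an expression "inside" the list
res
```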


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



[R] [R-pkgs] NMOF 0.40-0 (Numerical Methods and Optimization in Finance)

2016-10-26 Thread Enrico Schumann
Dear all,

version 0.40-0 of package NMOF is on CRAN now, 5 years
(exactly) after its first release on CRAN.

'NMOF' stands for 'Numerical Methods and Optimization
in Finance'. The package accompanies the book with the
same name, written by Manfred Gilli, Dietmar Maringer
and Enrico Schumann, published by Elsevier/Academic
Press in 2011.

Since my last announcement on this list [1], many
things have been added to the package:

- all the R code examples from the book (?showExample)

- many new functions, e.g. for pricing financial
  instruments (?vanillaOptionEuropean, ?vanillaBond,
  ?callMerton, ?xtContractValue, ...), and utilities
  for Monte-Carlo simulation, for computing implied
  vol, yields, etc.

Many of these new functions are described, with
examples, in the Manual [2].

If you want to stay up-to-date: the latest version is
always available from my website [3]; there is a public
Git repository on GitHub [4].

In case of comments/corrections/remarks/suggestions --
which are very welcome -- please contact the maintainer
(me) directly.


Kind regards
Enrico


[1] https://stat.ethz.ch/pipermail/r-packages/2011/001257.html
[2] http://enricoschumann.net/NMOF.htm#NMOFmanual
[3] http://enricoschumann.net/R/packages/NMOF/
[4] https://github.com/enricoschumann/NMOF

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Output formatting in PDF

2016-10-11 Thread Enrico Schumann
On Tue, 11 Oct 2016, Preetam Pal <lordpree...@gmail.com> writes:

> Hi,
>
> Can you please help me with the following output formatting:
> I am planning to include 2 plots and some general description in a one-page
> PDF document, such that
>
>- I'll leave some appropriate margin on the PDF- say, 1.5 inches
>top,right, bottom and left (will decide based on overall appearance)
>- the 2 plots are placed side-by-side (looks best for comparison)
>- the margins for each plot can be 4 lines on the top and the bottom &
> 2 lines on the left and the right
>- each of these 2 plots would have time (0 to 260) along x-axis and two
>time-series (daily USD-GBP and USD-EUR FX rates) on the y-axis, i.e. 2
>time-series would be plotted on each of the 2 graphs. I would need a
>different color for each plot to demarcate them
>- I need to add some text (eg: "Independent analysis of Exchange Rate
>dynamics") with reduced font size (not high priority though-just good to
>have a different size)
>- The general discussion (may be a paragraph) would come right below the
>2 plots - I can specify this text as an argument in a function, may be. I
>am not sure how to arrange the entire PDF as per the format I mentioned
>above
>
> I shall really appreciate any help with this - the time series analysis is
> not difficult, I can manage that - however, I don't know how to manage the
> formatting part though, so that the 1-pager output looks decently
> presentable. Thanks.
>
> Regards,
> Preetam

If using LaTeX is an option, I would suggest
?Sweave. There are many tutorials on the web that
should get you started.
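
Whichever way the text is added, the figure itself could be produced
with base graphics. The following is only a sketch: the data are
simulated (not actual FX rates), and the file name, page size, colours
and panel titles are placeholders.

```r
out <- file.path(tempdir(), "fx-onepager.pdf")  # placeholder file name

set.seed(42)
time_idx <- 0:260
gbp <- 0.80 + cumsum(rnorm(length(time_idx), sd = 0.003))  # simulated
eur <- 0.90 + cumsum(rnorm(length(time_idx), sd = 0.003))  # simulated

pdf(out, width = 11, height = 8.5)
par(mfrow = c(1, 2),       # two plots side by side
    omi   = rep(1.5, 4),   # outer margin of 1.5 inches on all sides
    mar   = c(4, 2, 4, 2)) # plot margins in lines: bottom, left, top, right
matplot(time_idx, cbind(gbp, eur), type = "l", lty = 1,
        col = c("blue", "red"), xlab = "time", ylab = "",
        main = "Plot 1")
matplot(time_idx, cbind(gbp, eur), type = "l", lty = 1,
        col = c("darkgreen", "orange"), xlab = "time", ylab = "",
        main = "Plot 2")
## text in smaller font, placed in the outer margin above the plots
mtext("Independent analysis of Exchange Rate dynamics",
      outer = TRUE, cex = 0.8)
dev.off()
```

The resulting file can then be included in a LaTeX document, or the
same plotting code can go into a Sweave chunk.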


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Least Median Square Regression

2016-10-08 Thread Enrico Schumann
On Sat, 08 Oct 2016, Bryan Mac <bryanmac...@gmail.com> writes:

> I am confused reading the document. 
>
> I have installed and added the package (MASS).
>
> What is the function for LMS Regression?
>

In MASS, it is 'lqs'.

But the vignette provides a code example for how to
compute 'manually' an LMS-regression, i.e. how to do
the actual optimisation.
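
A minimal sketch of a call to 'lqs', with simulated data (the data and
the outliers are made up for illustration). As far as I know, 'lm'
does not support method = "lms", which would explain why the code in
the question did not do what was intended.

```r
library("MASS")

set.seed(123)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50, sd = 0.5)
y[1:5] <- y[1:5] + 20   # a few artificial outliers

fit <- lqs(y ~ x, method = "lms")  # least median of squares
coef(fit)

coef(lm(y ~ x))  # least squares, for comparison
```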

>
>> On Oct 8, 2016, at 6:17 AM, Enrico Schumann <e...@enricoschumann.net> wrote:
>> 
>> On Sat, 08 Oct 2016, Bryan Mac <bryanmac...@gmail.com> writes:
>> 
>>> Hi R-help,
>>> 
>>> How do you perform least median square regression in R? Here is what I have 
>>> but received no output. 
>>> 
>>> LMSRegression <- function(df, indices){
>>>  sample <- df[indices, ]
>>>  LMS_NAR_NIC_relation <- lm(sample$NAR~sample$NIC, data = sample, method = 
>>> "lms")
>>>  rsquared_lms_nar_nic <- summary(LMS_NAR_NIC_relation)$r.square
>>> 
>>>  LMS_SQRTNAR_SQRTNIC_relation <- lm(sample$SQRTNAR~sample$SQRTNIC, data = 
>>> sample, method = "lms")
>>>  rsquared_lms_sqrtnar_sqrtnic <- 
>>> summary(LMS_SQRTNAR_SQRTNIC_relation)$r.square
>>> 
>>>  out <- c(rsquared_lms_nar_nic, rsquared_lms_sqrtnar_sqrtnic)
>>>  return(out)
>>> }
>>> 
>>> Also, which value should be looked at decide whether this is best 
>>> regression model to use?
>>> 
>>> Bryan Mac
>>> bryanmac...@gmail.com
>>> 
>> 
>> A tutorial on how to run such regressions is included
>> in the NMOF package.
>> 
>> https://cran.r-project.org/package=NMOF/vignettes/PSlms.pdf
>> 

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Least Median Square Regression

2016-10-08 Thread Enrico Schumann
On Sat, 08 Oct 2016, Bryan Mac <bryanmac...@gmail.com> writes:

> Hi R-help,
>
> How do you perform least median square regression in R? Here is what I have 
> but received no output. 
>
> LMSRegression <- function(df, indices){
>   sample <- df[indices, ]
>   LMS_NAR_NIC_relation <- lm(sample$NAR~sample$NIC, data = sample, method = 
> "lms")
>   rsquared_lms_nar_nic <- summary(LMS_NAR_NIC_relation)$r.square
>   
>   LMS_SQRTNAR_SQRTNIC_relation <- lm(sample$SQRTNAR~sample$SQRTNIC, data = 
> sample, method = "lms")
>   rsquared_lms_sqrtnar_sqrtnic <- 
> summary(LMS_SQRTNAR_SQRTNIC_relation)$r.square
>   
>   out <- c(rsquared_lms_nar_nic, rsquared_lms_sqrtnar_sqrtnic)
>   return(out)
> }
>  
> Also, which value should be looked at decide whether this is best regression 
> model to use?
>
> Bryan Mac
> bryanmac...@gmail.com
>

A tutorial on how to run such regressions is included
in the NMOF package.

https://cran.r-project.org/package=NMOF/vignettes/PSlms.pdf


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Faster Subsetting

2016-09-28 Thread Enrico Schumann
On Wed, 28 Sep 2016, "Doran, Harold" <hdo...@air.org> writes:

> I have an extremely large data frame (~13 million rows) that resembles
> the structure of the object tmp below in the reproducible code. In my
> real data, the variable, 'id' may or may not be ordered, but I think
> that is irrelevant.
>
> I have a process that requires subsetting the data by id and then
> running each smaller data frame through a set of functions. One
> example below uses indexing and the other uses an explicit call to
> subset(), both return the same result, but indexing is faster.
>
> Problem is in my real data, indexing must parse through millions of
> rows to evaluate the condition and this is expensive and a bottleneck
> in my code.  I'm curious if anyone can recommend an improvement that
> would somehow be less expensive and faster?
>
> Thank you
> Harold
>
>
> tmp <- data.frame(id = rep(1:200, each = 10), foo = rnorm(2000))
>
> idList <- unique(tmp$id)
>
> ### Fast, but not fast enough
> system.time(replicate(500, tmp[which(tmp$id == idList[1]),]))
>
> ### Not fast at all, a big bottleneck
> system.time(replicate(500, subset(tmp, id == idList[1])))
>

If you really need only one column, it will be faster
to extract that column and then to take a subset of it:

  system.time(replicate(500, tmp[[2L]][tmp$id == idList[1L]]))

(A data.frame is a list of atomic vectors, and it is
 typically faster to first extract the component of
 interest, i.e. the specific column, and then to subset
 this vector. The result will, of course, be a vector,
 not a data.frame.)
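
If the subsets are needed for many or all ids, another option (not
benchmarked here, and only a sketch) is to evaluate the grouping once
with split() and to look up the row numbers afterwards, so that the
condition 'id == ...' is not evaluated again for every id.

```r
tmp <- data.frame(id = rep(1:200, each = 10), foo = rnorm(2000))

## one pass over 'id': a list of row numbers, one element per id
rows_by_id <- split(seq_len(nrow(tmp)), tmp$id)

## subset for a given id; list names are the ids as character
sub1 <- tmp[rows_by_id[["1"]], ]

## if only one column is needed, split that column directly
foo_by_id <- split(tmp$foo, tmp$id)
foo_by_id[["1"]]
```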


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Help with strftime error "character string is not in a standard unambiguous format"

2016-09-12 Thread Enrico Schumann
On Mon, 12 Sep 2016, Chris Evans <chrish...@psyctc.org> writes:

> I am trying to read activity data created by Garmin. It outputs dates like 
> this:
>
> "Thu, 25 Aug 2016 6:34 PM"
>
> The problem that has stumped me is this:
>
>> strftime("Thu, 25 Aug 2016 6:34 PM",format="%a, %d %b %Y %I:%M %p")
> Error in as.POSIXlt.character(x, tz = tz) : 
>   character string is not in a standard unambiguous format


Didn't you mean strptime?
                   ^
  > strptime("Thu, 25 Aug 2016 6:34 PM",format="%a, %d %b %Y %I:%M %p")

  ## [1] "2016-08-25 18:34:00 CEST"


> I _thought_ I had this running OK but that error is catching me now.
> I think I've read ?strftime and written that format string correctly
> to match the input but I'm stumped now.
>
> Can someone advise me?  Many thanks in advance,
>
> Chris
>
>
>> sessionInfo()
> R version 3.3.1 (2016-06-21)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 10586)
>
> locale:
> [1] LC_COLLATE=English_United Kingdom.1252 
> [2] LC_CTYPE=English_United Kingdom.1252   
> [3] LC_MONETARY=English_United Kingdom.1252
> [4] LC_NUMERIC=C   
> [5] LC_TIME=English_United Kingdom.1252
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base 
>
> loaded via a namespace (and not attached):
> [1] compiler_3.3.1 tools_3.3.1   
>>
>

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] get start and end date of ISO weeks giving a date as input

2016-09-08 Thread Enrico Schumann
Hi Veronica,

please see inline.

On Thu, 08 Sep 2016, Veronica Andreo <veroand...@gmail.com> writes:

> Hello Luisfo and Enrico,
>
> Thanks for your help! I've been testing both
> solutions... results differ for the same date (I
> changed both functions to use ISO8601). And I added
> contiguous dates, to see how they handle the
> start-end of the week.
>
> So, here the results:
>
> ### one example
> d <- c("2010-08-21","2010-08-22","2010-08-23","2010-08-24")
> iso_start_end <- function(d) {
>   d <- as.Date(d)
>   wday <- as.POSIXlt(d)$wday
>   data.frame(date = d,
>              week = format(d, "%V"),
>              starts = d - wday + 1,
>              ends = d + 7 - wday)
> }
> iso_start_end(d)
>
>         date week     starts       ends
> 1 2010-08-21   33 2010-08-16 2010-08-22
> 2 2010-08-22   33 2010-08-23 2010-08-29
> 3 2010-08-23   34 2010-08-23 2010-08-29
> 4 2010-08-24   34 2010-08-23 2010-08-29

Yes, the second date makes no sense, and it happens
because Sunday is 0 (and not 7). My bad. Here is
a fixed version:

  iso_start_end <- function(d) {
      d <- as.Date(d)
      wday <- as.POSIXlt(d)$wday
      wday[wday == 0] <- 7
      data.frame(date = d,
                 week = format(d, "%V"),
                 starts = d - wday + 1,
                 ends = d + 7 - wday)
  }


> ### the other example:
> dd <- as.Date(strptime('2010-08-21', format="%Y-%m-%d", tz="GMT"))
> ref.date <- as.Date(strptime(paste0(year(dd),"-01-01"), format="%Y-%m-%d"))
> bound.dates <- ref.date + 7 * (isoweek(dd)) + c(0,6)
> bound.dates
> [1] "2010-08-20" "2010-08-26"

You can use the function "weekdays" to check the
results.

  > weekdays(bound.dates)
  [1] "Friday"   "Thursday"

> So, researching a bit more and inspired by those
> examples, I eventually came up with this solution
> that seems to work fine... I share in case that any
> other has a similar problem:
>
> # get ISOweek for my vector of dates 
> week_iso<-ISOweek(d)
>
> # vector with the format %Y-W%V-1 for start day of the ISO week
> week_iso_day1 <- paste(week_iso,1, sep="-")
>
> #  vector with the format %Y-W%V-7 for end day of the ISO week
> week_iso_day7 <- paste(week_iso, 7, sep="-")
>
> # use ISOweek2date
> data.frame(date= d, week_iso = week_iso, start = ISOweek2date(week_iso_day1), 
> end = ISOweek2date(week_iso_day7))
>
>         date week_iso      start        end
> 1 2010-08-21 2010-W33 2010-08-16 2010-08-22
> 2 2010-08-22 2010-W33 2010-08-16 2010-08-22
> 3 2010-08-23 2010-W34 2010-08-23 2010-08-29
> 4 2010-08-24 2010-W34 2010-08-23 2010-08-29

The updated 'iso_start_end' gives the same result.
  
          date week     starts       ends
  1 2010-08-21   33 2010-08-16 2010-08-22
  2 2010-08-22   33 2010-08-16 2010-08-22
  3 2010-08-23   34 2010-08-23 2010-08-29
  4 2010-08-24   34 2010-08-23 2010-08-29
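
A variant that does not need the special case for Sunday: the format
"%u" gives the weekday as a number with Monday as 1 and Sunday as 7.
This is only a sketch; the function name is made up to keep it apart
from the function above.

```r
iso_start_end2 <- function(d) {
    d <- as.Date(d)
    wday <- as.integer(format(d, "%u"))  # 1 = Monday, ..., 7 = Sunday
    data.frame(date   = d,
               week   = format(d, "%V"),
               starts = d - wday + 1,
               ends   = d + 7 - wday)
}

res <- iso_start_end2(c("2010-08-21", "2010-08-22", "2010-08-23"))
res
```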


Kind regards
 Enrico

> Thanks again for your time, ideas and help!
>
> Best,
> Vero
>
> 2016-09-08 8:20 GMT-03:00 Luisfo <luisf...@yahoo.es>:
>
> Dear Veronica,
>
> Here there's a way of doing what you requested.
>
> library("lubridate")
> # your date '2010-08-21' as Date object
> dd <- as.Date(strptime("2010-08-21", format="%Y-%m-%d", tz="GMT"))
> # take the first day of the year as Date object, i.e. 2010-01-01 in our example
> ref.date <- as.Date(strptime(paste0(year(dd),"-01-01"),
>                              format="%Y-%m-%d", tz="GMT"))
> # the start and end dates
> bound.dates <- ref.date + 7 * (week(dd)-1) + c(0,6)
>
> I hope you find it useful.
>
> Best,
>
> Luisfo Chiroque
> PhD Student | PhD Candidate
> IMDEA Networks Institute
> http://fourier.networks.imdea.org/people/~luis_nunez/
>
> On 09/08/2016 12:13 PM, Veronica Andreo wrote:
>
> Hello list,
> 
> Is there a quick way to get start and end date (%Y-%m-%d) from ISO
> weeks if I only have dates?
> 
> For example, I have this date in which some event happened:
> "2010-08-21". Not only I want the ISO week, which I can obtain either
> with isoweek (lubridate) or ISOweek (ISOweek), but I want the start
> and end date of that ISO week.
> 
> Do I need to print all ISO weeks from the period of interest and
> sample there for start and end date? Or is there a better way to do
> that?
> 
> Thanks a lot in advance!
> 
> Best,
> Veronica
>

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net


Re: [R] get start and end date of ISO weeks giving a date as input

2016-09-08 Thread Enrico Schumann
On Thu, 08 Sep 2016, Veronica Andreo <veroand...@gmail.com> writes:

> Hello list,
>
> Is there a quick way to get start and end date (%Y-%m-%d) from ISO
> weeks if I only have dates?
>
> For example, I have this date in which some event happened:
> "2010-08-21". Not only I want the ISO week, which I can obtain either
> with isoweek (lubridate) or ISOweek (ISOweek), but I want the start
> and end date of that ISO week.
>
> Do I need to print all ISO weeks from the period of interest and
> sample there for start and end date? Or is there a better way to do
> that?
>
> Thanks a lot in advance!
>
> Best,
> Veronica


You could use a function like the following one (which
assumes the start of the week is Monday and its end is
Sunday):

  d <- c("2010-08-21",
         "2016-08-01")

  iso_start_end <- function(d) {
      d <- as.Date(d)
      wday <- as.POSIXlt(d)$wday
      data.frame(date = d,
                 week = format(d, "%V"),
                 starts = d - wday + 1,
                 ends = d + 7 - wday)
  }
  
  iso_start_end(d)

The function should produce this output:

          date week     starts       ends
  1 2010-08-21   33 2010-08-16 2010-08-22
  2 2016-08-01   31 2016-08-01 2016-08-07



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Help with constrained portfolio optimization

2016-07-09 Thread Enrico Schumann
On Fri, 08 Jul 2016, Paulino Levara <paulino.lev...@gmail.com> writes:

> Dear R community,
>
> I am a beginner in portfolio optimization and I would appreciate your help
> with the next problem:given a set of 10 variables (X), I would like to
> obtain the efficient portfolio that minimize the variance taking the
> expected return as mean(X), subject to the next constraints:
>
> a) Limit the sum of the weights of the first five variables to 30%
> b) Limit the sum of the weights of the last five variables to 70%
>
> What is your suggestion? Can I do this with the portfolio.optim function of
> the tseries package?
>
> How Can I do that?
>
> Thanks in advance.
>
> Regards.
>

Such a problem can be solved via quadratic programming. I would
start with package 'quadprog' and its function 'solve.QP' (which
is actually what tseries::portfolio.optim uses). You can find
many tutorials on the web on how to use this function for
portfolio optimisation.

The tricky part is setting up the constraint matrices. Here is
some code that may help you get started. (It is adapted from one
of the code examples in package "NMOF".) 

require("quadprog")

## create random returns data
na  <- 10  ## number of assets
ns  <- 60  ## number of observations
R   <- array(rnorm(ns * na,
                   mean = 0.005, sd = 0.015),
             dim = c(ns, na)) ## roughly like monthly equity returns
m    <- colMeans(R)  ## asset means
rd   <- mean(m)      ## desired mean
wmax <- 1            ## maximum holding size
wmin <- 0.0          ## minimum holding size

## set up matrices; the rows of A are: budget, sum of the first
## five weights, mean return, upper bounds, lower bounds
A <- rbind(1, c(rep(1,5), rep(0,5)), m,
           -diag(na), diag(na))

bvec <- c(1, 0.3, rd,
          rep(-wmax, na),
          rep(wmin, na))

## meq = 2: the first two constraints (budget, sum of the first
## five weights) are equalities; the others are inequalities (>=)
result <- solve.QP(Dmat = 2*cov(R),
                   dvec = rep(0, na),
                   Amat = t(A),
                   bvec = bvec,
                   meq  = 2)

w <- result$solution

## check results
sum(w)         ## check budget constraint
sum(w[1:5])    ## check sum-of-weights constraint
w %*% m >= rd  ## check return constraint
summary(w)     ## check holding size constraint


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] filter a data.frame in dependence of a column value

2016-06-17 Thread Enrico Schumann
On Fri, 17 Jun 2016, Matthias Weber <matthias.we...@fntsoftware.com> writes:

> Hello togehter,
>
> i have short question, maybe anyone can help me.
>
> I have a data.frame like this one:
>
>NO   ORDER
> 1 1530 for Mr. Muller (10.0 -> 11.2)
> 2 1799 for Mr Giulani
> 3 1888 for Mr. Marius (11.2 -> 12)
>
> I need a solution, which only contains the values in brackets. The result 
> should look like the following:
>
>NO   ORDER
> 1 1530 for Mr. Muller (10.0 -> 11.2)
> 2 1888 for Mr. Marius (11.2 -> 12)
>
> I tried it with the following code, but that doesn't work.
>
> data4.1<-data3[data3$ORDER%in% "[(]*->*[)]",]
>
> maybe anyone can help me.
>
> Thank you.
>
> Best regards
>
> Mat
>

Try ?grepl instead of %in%.

x <- c("for Mr. Muller (10.0 -> 11.2)",
       "for Mr Giulani",
       "for Mr. Marius (11.2 -> 12)")

grepl("[(].*->.*[)]", x)
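
%in% compares whole strings for equality and knows nothing about
patterns, which is why it selects no rows here. Applied to a
data.frame, the grepl approach might look like this (a sketch; the
data.frame is rebuilt from the example in the question):

```r
data3 <- data.frame(NO = c(1530, 1799, 1888),
                    ORDER = c("for Mr. Muller (10.0 -> 11.2)",
                              "for Mr Giulani",
                              "for Mr. Marius (11.2 -> 12)"),
                    stringsAsFactors = FALSE)

keep <- grepl("[(].*->.*[)]", data3$ORDER)
keep
## [1]  TRUE FALSE  TRUE

data4.1 <- data3[keep, ]
data4.1
```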




-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Query about use of format in strptime

2016-04-11 Thread Enrico Schumann
On Mon, 11 Apr 2016, Stefano Sofia <stefano.so...@regione.marche.it> writes:

> Dear R-list users,
> I need to use strptime because I have to deal with date with hours and 
> minutes.
> I read the manual for strptime and I also looked at many examples, but when I 
> try to apply it to my code, I always encounter some problems.
> I try to change the default format, with no success. Why? How can I change 
> the format?
>
> 1.
> init_day <- as.factor("2015-02-24-00-30")
> strptime(init_day, format="%Y-%m-%d-%H-%M")
> [1] "2015-02-24 00:30:00"
> It works, but why also seconds are shown if in format seconds are not 
> specified?
>
> 2.
> init_day <- as.factor("2015-02-24-0-00")
> strptime(init_day, format="%Y-%m-%d-%H-%M")
> [1] "2015-02-24"
> Again, the specified format is not applied. Why?
>
> Thank you for your attention and your help
> Stefano


strptime creates a POSIXlt object, and the specified format
tells it how to interpret the string you pass in. (Your
factor is converted to character by strptime.)

If you want to have the POSIXlt object printed in a
particular way, use ?strftime or ?format.

  > format(strptime(init_day, format="%Y-%m-%d-%H-%M"), "%Y-%m-%d-%H-%M")
  ## [1] "2015-02-24-00-00"
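
To make the two steps explicit, here is a small sketch: strptime
parses (string to date-time), format prints (date-time to string),
and the two formats need not be the same. The time zone "UTC" is an
assumption, added so that the result does not depend on the local
settings.

```r
x <- strptime("2015-02-24-0-00", format = "%Y-%m-%d-%H-%M", tz = "UTC")
class(x)
## [1] "POSIXlt" "POSIXt"

## default printing drops the time if it is midnight
format(x)
## [1] "2015-02-24"

## printing in the format that was used for parsing ...
format(x, "%Y-%m-%d-%H-%M")
## [1] "2015-02-24-00-00"

## ... or in any other format
format(x, "%d.%m.%Y %H:%M")
## [1] "24.02.2015 00:00"
```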


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Optimization Grid Search Slow

2015-09-18 Thread Enrico Schumann
frc * pR
>
> # What's the new belief given observation
> posteriorL <- pflc * pL/denom
> posteriorR <- 1-posteriorL
>
> pL <- posteriorL
> pR <- posteriorR
>
> pL <- (1/(1 + exp(-temperature * (pL-.5))))
> pR <- (1/(1 + exp(-temperature * (pR-.5))))
>
> pLlist[i] = pL
> pRlist[i] = pR
>
> if(i > 1){
>   if(dat$choice[i] == 1){
> trialProb[i] <- pLlist[i-1]
>   } else
>   {
> trialProb[i] <- 1-pLlist[i-1]
>   }
> }
> else {
>   trialProb[1] <- .5
> }
>
>   }
>   trialProb2 <- sum(log(trialProb))
>   subFit <- exp(trialProb2/length(dat$choice))
>   hmmOutput <- list("logLikelihood" = trialProb2, "subjectFit" = subFit,
> "probabilities" = pLlist)
>   # print(hmmOutput$logLikelihood)
>   return(hmmOutput)
> }
>
>
> subjectFits <- 0; subLogLike <- 0; bestTemp <- 0; bestDelta= 0;
>
> min = 0.001; max = .5; inc = 0.001;
> deltaList = seq(min, max, inc)
> mina = 0; maxa = 5; inca = .01
> amList = seq(mina, maxa, inca)
> maxLogValue <- -1000
> for(delta in deltaList){
>   for(temp in amList){
> probabilities <- hmmFunc(delta, temp)
> if(probabilities$logLikelihood > maxLogValue){
>   pList <- probabilities$probabilities
>   maxLogValue <- probabilities$logLikelihood
>   subLogLike <- probabilities$logLikelihood
>   subjectFits <- probabilities$subjectFit
>   bestTemp <- temp
>   bestDelta <- delta
>
> }
>   }
> }


Another option, perhaps: there is a function 'gridSearch' in package
NMOF that allows you to distribute (i.e. run in parallel) the
computations.

(Disclosure: I am the maintainer of NMOF.)
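
The idea of distributing a grid search can be sketched with base R's
'parallel' package alone. This is not the code of 'gridSearch' (see its
help page for the actual interface); the objective 'obj' and the grid
below are hypothetical stand-ins for the likelihood function and the
parameter levels in the question.

```r
library(parallel)

## hypothetical objective of two parameters, to be minimised
obj <- function(par)
    (par[1] - 0.2)^2 + (par[2] - 3)^2

## all combinations of the parameter levels
grid <- expand.grid(delta = seq(0.1, 0.5, by = 0.1),
                    temp  = seq(0, 5, by = 1))

## evaluate every grid point; mc.cores > 1 forks workers on
## Unix-alikes (Windows supports only mc.cores = 1)
values <- unlist(mclapply(seq_len(nrow(grid)),
                          function(i) obj(unlist(grid[i, ])),
                          mc.cores = 1L))

best <- grid[which.min(values), ]
best
```

Because each grid point is evaluated independently, the double loop in
the question maps directly onto such a scheme.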

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Fast multiple match function

2015-04-07 Thread Enrico Schumann
On Mon, 06 Apr 2015, Keshav Dhandhania kshav...@gmail.com writes:

 Hi,

 I know that one can find all occurrences of x in a vector v by doing
 which(x == v).

 However, if I need to do this again and again, where v is remaining the
 same, then this is quite inefficient. In my particular case, I need to do
 this millions of times, and length(v) = 100 million.

 Does anyone have suggestion on how to go about it?
 I know of a package called fmatch that does the above for the match
 function. But they don't handle multiple matches.


Perhaps 'match(x, v)' is what you want? Here 'x' may be a vector of
length > 1.

In any case, have you actually tried package 'fastmatch'? The function
'fmatch', which that package provides, is very fast for repeated
lookups in a table 'v'.
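
Note that 'match' reports only the first position of each element. If
all positions are needed, one possibility (not part of the reply above,
and untested at the size mentioned in the question) is to build an
index of 'v' once and reuse it:

```r
v <- c("b", "a", "c", "a", "b", "a")

## match() returns only the *first* position of each element of x
match(c("a", "c"), v)           ## 2 3

## build the index once: positions of every distinct value in v
idx <- split(seq_along(v), v)

## repeated lookups are now list accesses, not scans of v
idx[["a"]]                      ## 2 4 6
idx[c("b", "z")]                ## 'z' is absent, so its entry is NULL
```

The index costs memory proportional to length(v), which may matter for
a vector of 100 million elements.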


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] split a string a keep the last part

2014-08-28 Thread Enrico Schumann
On Thu, 28 Aug 2014, Jun Shen jun.shen...@gmail.com writes:

 Hi everyone,

 I believe I am not the first one to have this problem but couldn't find a
 relevant thread on the list.

 Say I have a string (actually it is the whole column in a data frame) in a
 format like this:

 test <- 'AF14-485-502-89-00235'

 I would like to split the test string and keep the last part. I think I can
 do the following

 sub('.*-.*-.*-.*-(.*)','\\1', test)

 to keep the fifth part of the string. But this won't work if other strings
 have more or fewer parts separated by '-'. Is there a general way to do it?
 Thanks.

 Jun

This should work for your example:

  gsub(".*-([^-]*)$", "\\1", test)
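
A short sketch of why this works for any number of parts, with two
alternatives. The extra test strings are made up for illustration.

```r
test <- c("AF14-485-502-89-00235", "A-B", "no_separator")

## the greedy '.*-' consumes everything up to the last '-'
r1 <- sub(".*-([^-]*)$", "\\1", test)

## without a capture group: delete everything up to the last '-'
r2 <- sub(".*-", "", test)

## or split at '-' and keep the last piece of each string
r3 <- vapply(strsplit(test, "-", fixed = TRUE),
             function(p) p[length(p)], character(1))

r1   ## "00235" "B" "no_separator"
```

The three variants agree on these inputs; they can differ for strings
that end in '-'.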



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] legendre quadrature

2014-05-01 Thread Enrico Schumann
On Thu, 01 May 2014, pari hesabi statistic...@hotmail.com writes:

 Hello everybody
 I need to approximate the amount of integral by using
  legendre quadrature. I have written a program which doesn't give me a 
 logical answer; Can anybody help me and send the correct program? For 
 example  the approximated amount of integral of ( x ^2)  on (-1,1) based
  on legendre quad rule. 
   

One possibility:

  require(NMOF)
  xw <- xwGauss(10, "legendre")

  fun <- function(x)
      x^2

  sum(fun(xw$nodes) * xw$weights)
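
For readers without NMOF, nodes and weights can also be computed in
base R. The sketch below uses the Golub-Welsch eigenvalue method; the
function name 'gaussLegendre' is made up here and this is not the code
of 'xwGauss'.

```r
## Gauss-Legendre nodes and weights on (-1, 1) via the
## Golub-Welsch eigenvalue method (base R only)
gaussLegendre <- function(n) {
    k <- seq_len(n - 1)
    b <- k / sqrt(4 * k^2 - 1)        # off-diagonal of the Jacobi matrix
    J <- matrix(0, n, n)
    J[cbind(k, k + 1)] <- b
    J[cbind(k + 1, k)] <- b
    e <- eigen(J, symmetric = TRUE)
    list(nodes   = e$values,          # eigenvalues are the nodes
         weights = 2 * e$vectors[1, ]^2)
}

xw <- gaussLegendre(10)
approx <- sum(xw$nodes^2 * xw$weights)
approx                                # integral of x^2 over (-1, 1) is 2/3
```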



 integrand <- function(x) {x^2}
 rules <- legendre.quadrature.rules( 50 )

  Error: object 'legendre.quadrature.rules' not found

PLEASE provide commented, minimal, self-contained, reproducible code.

 order.rule <- rules[[50]]
 chebyshev.c.quadrature(integrand, order.rule, lower = -1, upper = 1) 

 Thank you
 Diba  
   

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Getting a particular weekday for a given month

2014-04-07 Thread Enrico Schumann
On Mon, 07 Apr 2014, Christofer Bogaso bogaso.christo...@gmail.com writes:

 Hi,

 Given a month name, I am looking for some script to figure out, what is the
 date for 3rd Wednesday. For example let say I have following month:

 library(zoo)
 Month <- as.yearmon(as.Date(Sys.time()))

 I need to answer: What is the date for 3rd Wednesday of 'Month'?

 Really appreciate for any pointer.

 Thanks for your time.


There is a function 'lastWeekday' in the PMwR package, which will
compute the last weekday -- Wednesday, say -- in a given month.
(Disclosure: I am the package author.) For example

  lastWeekday(3, Sys.Date()) 

produces 2014-04-30, which is the last Wednesday of the current month.
To get the third Wednesday of a given month, you can do this:

  lastWeekday(3, endOfPreviousMonth(Sys.Date()), shift = 3)

or, for example, for the third Friday of September 2012:

  lastWeekday(5, endOfPreviousMonth(as.Date("2012-09-01")), shift = 3)

  ## "2012-09-21"

That is, you first find the last particular weekday of the previous
month, and then shift forward 3 weeks.
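
The same result can be obtained in base R without PMwR. The helper
'nthWeekday' below is a made-up name and only a sketch; it lists the
days of the month and picks the n-th one that falls on the requested
weekday.

```r
## n-th occurrence of a weekday in the month that contains 'date';
## weekday: 0 = Sunday, 1 = Monday, ..., 6 = Saturday (as in POSIXlt$wday)
nthWeekday <- function(weekday, date, n) {
    first <- as.Date(format(as.Date(date), "%Y-%m-01"))
    days  <- first + 0:30                      # covers any month
    days  <- days[format(days, "%m") == format(first, "%m")]
    hits  <- days[as.POSIXlt(days)$wday == weekday]
    hits[n]                                    # NA if there is no n-th one
}

nthWeekday(5, as.Date("2012-09-01"), 3)   ## third Friday of September 2012
nthWeekday(3, as.Date("2014-04-07"), 3)   ## third Wednesday of April 2014
```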

However, PMwR is not on CRAN; it's available from here

  http://enricoschumann.net/R/packages/PMwR/index.htm

If you are on a Unix-type system (or have Rtools installed on Windows),
you can directly install the package from source:

  install.packages('PMwR',
   repos = 'http://enricoschumann.net/R', 
   type = 'source')

Regards,
Enrico

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Environmental problems.

2014-02-25 Thread Enrico Schumann
On Tue, 25 Feb 2014, Rolf Turner r.tur...@auckland.ac.nz writes:

 I have a function that makes use of the ode() function from the
 deSolve package.  I am trying to find a way of getting it to put out
 a progress report every t.int time units (by progress report I
 just mean reporting what time it's got up to).

 I thought to put code something like the following in my func
 function that gets called by (is an argument to) ode():

 cat("Before: time =",tt,"tdone =",tdone,"diff =",tt-tdone,"\n")
 if(tt - tdone >= 0.1-sqrt(.Machine$double.eps)) {
 cat("Prog. Rep.: time =",tt,"tdone =",tdone,"diff =",tt-tdone,"\n")
 assign("tdone",tt,envir=parent.env(environment()))
 }
 cat("After: time =",tt,"tdone =",tdone,"diff =",tt-tdone,"\n")

 The object tdone gets initialized (to 0) outside of func(), so there
 is not a problem with tdone not being found the first time that
 func() gets called by ode().  (I'm hardwiring t.int=0.1 in the
 forgoing just for test/illustration purposes.)  The Before and
 After cat()-s are there to demonstrate what goes wrong.

 What goes wrong is that I get no progress report and tdone remains
 equal to 0 until tt reaches 0.1. As desired. I then get a progress
 report and tdone gets set equal to the first value of tt which is
 greater than 0.1. As desired.

 Then I get no further progress reports and tdone gets set equal to tt
 at every call to func() --- even though tt - tdone = 0 which is less
 than 0.1 so the assignment of tdone cannot occur.  And yet it does,
 keeping the difference equal to 0.  (*Not* as desired!)

 So the function is recognizing that the difference is less than 0.1 in
 that it does not execute the cat() statement.  Yet it executes the
 assign() statement.  This is clearly impossible! :-)  But it happens.

 The output from the cat()-ing, around time = 0.1, looks like:

 Before: time = 0.09364548 tdone = 0 diff = 0.09364548
 After: time = 0.09364548 tdone = 0 diff = 0.09364548
 Before: time = 0.0975779 tdone = 0 diff = 0.0975779
 After: time = 0.0975779 tdone = 0 diff = 0.0975779
 Before: time = 0.0975779 tdone = 0 diff = 0.0975779
 After: time = 0.0975779 tdone = 0 diff = 0.0975779
 Before: time = 0.09698997 tdone = 0 diff = 0.09698997
 After: time = 0.09698997 tdone = 0 diff = 0.09698997
 Before: time = 0.1009224 tdone = 0 diff = 0.1009224
 Prog. Rep.: time = 0.1009224 tdone = 0 diff = 0.1009224
 After: time = 0.1009224 tdone = 0.1009224 diff = 0
 Before: time = 0.1009224 tdone = 0.1009224 diff = 0
 After: time = 0.1009224 tdone = 0.1009224 diff = 0
 Before: time = 0.1003344 tdone = 0.1003344 diff = 0  <--|
 After: time = 0.1003344 tdone = 0.1003344 diff = 0
 Before: time = 0.1042669 tdone = 0.1042669 diff = 0
 After: time = 0.1042669 tdone = 0.1042669 diff = 0

 It's at that line indicated by "<--|", 4 lines from the bottom of
 the forgoing display, where things go to hell in a handcart.  Why (how
 on earth can) tdone change from 0.1009224 to 0.1003344, given that the
 difference is 0 whence no assignment of tdone should take place?

 What am I not seeing?  Can anyone help me out?  I'm going mad!
 ***MAD*** I tell you! :-)

 Suggestions as to a better way of accomplishing my desired goal of
 producing progress reports would also be welcome.

 I am not at all sure that assigning tdone in
 parent.env(environment()) is the right thing to do.  I need to assign
 it in such a way and in such a location that its value will persist
 from call to call of func.  Words of wisdom about this would be
 gratefully received.  (I don't really grok environments.  I just try
 things until *something* works!)

 cheers,

 Rolf Turner


I did not follow your example, neither do I use the deSolve package; but
why not pass an environment as an argument?

  ## some iterative function that takes another fun as argument
  outer <- function(fun, ...) {
      for (i in 1:20)
          fun(...)
  }
  
  ## create an environment ...
  info <- new.env()
  info$tt <- 0

  ## ... and pass it as an argument
  myfun <- function(e) {
      e$tt <- e$tt + 1
      cat("Iteration ", e$tt, "\n")
  }

  outer(myfun, info)
  info$tt
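
A related way to keep state between calls is a closure. The sketch
below is an assumption about what a progress reporter could look like
(the names 'makeReporter' and 'report' are made up); it does not use
deSolve and does not explain the behaviour described in the question.

```r
## a function factory: 'tdone' lives in the enclosing environment
makeReporter <- function(t.int) {
    tdone <- 0
    function(tt) {
        if (tt - tdone >= t.int) {
            cat("Prog. Rep.: time =", tt, "\n")
            tdone <<- tt        # updates 'tdone' in the enclosure
            TRUE
        } else
            FALSE
    }
}

report <- makeReporter(0.1)
times <- c(0.05, 0.12, 0.15, 0.25)
fired <- vapply(times, report, logical(1))
fired                           ## FALSE TRUE FALSE TRUE
```

The value of 'tdone' persists because every call of 'report' sees the
same enclosing environment, much like 'info' above.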


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Selecting individuals to maximize the correlation of two variables

2014-01-11 Thread Enrico Schumann
On Fri, 10 Jan 2014, 汤靖 mr_tangj...@hotmail.com writes:

 Hi,
 Maybe it is not directly related to R but since many are statistical experts 
 so I post it here for help:

 I have two variables (say x and y) of length n. Now the cor(x,y) is close to 
 0. I need to find the subset in {1,.. n} so that the correlation between x 
 and y using the subset data is maximized. A trivial choice would be selecting 
 2 individuals only so that cor(x,y) =1. As the size of the subset increases, 
 cor(x,y) will go down to 0, but I am assuming the best correlation for each 
 size of the subsets would not be monotonically decreasing.

 Any idea of how to find the solution?

 Thanks,
 Jing

 

Hi Jing,

in chapter 1 of the NMOF manual I discuss a very similar problem; perhaps
it helps you (but it's a draft only...)

http://enricoschumann.net/NMOF.htm#NMOFmanual


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] How can I find nonstandard or control characters in a large file?

2013-12-09 Thread Enrico Schumann
On Mon, 09 Dec 2013, andrewH ahoer...@rprogress.org writes:

 I have a humongous csv file containing census data, far too big to read into
 RAM. I have been trying to extract individual columns from this file using
 the colbycol package. This works for certain subsets of the columns, but not
 for others. I have not yet been able to precisely identify the problem
 columns, as there are 731 columns and running colbycol on the file on my old
 slow machine takes about 6 hours. 

 However, my suspicion is that there are some funky characters, either
 control characters or characters with some non-standard encoding, somewhere
 in this 14 gig file. Moreover, I am concerned that these characters may
 cause me trouble down the road even if I use a different approach to getting
 columns out of the file.

 Is there an r utility will search through my file without trying to read it
 all into memory at one time and find non-standard characters or misplaced
 (non-end-of-line) control characters? Or some R code to the same end?  Even
 if the real problem ultimately proves to be different, it would be helpful
 to eliminate this possibility. And this is also something I would routinely
 run on files from external sources if I had it. 

  I am working in a windows XP environment, in case that makes a difference.

 Any help anyone could offer would be greatly appreciated.

 Sincerely, andrewH

You could process your file in chunks:

  f <- file("myfile.csv", open = "r")
  lines <- readLines(f, n = 1)
  ## do something with lines
  lines <- readLines(f, n = 1)
  ## do something with lines
  ## 

To find 'non-standard characters' you will need to define what
'non-standard characters' are.  But perhaps ?tools:::showNonASCII, which
uses ?iconv, can help you.  (Please note the warnings and caveats on the
functions' help pages.)
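
The chunked reading can be put into a loop. The sketch below assumes
one particular definition of 'non-standard': any byte outside printable
ASCII, with tabs allowed. The demo file, the helper 'hasOddBytes' and
the chunk size are all made up for illustration.

```r
## small demo file with one line that is not plain ASCII
## ('fn' is hypothetical; use the path of the real file instead)
fn <- tempfile(fileext = ".csv")
writeLines(enc2utf8(c("id,name", "1,abc", "2,caf\u00e9", "3,xyz")),
           fn, useBytes = TRUE)

## TRUE if a string contains a byte outside printable ASCII (tab allowed)
hasOddBytes <- function(s) {
    b <- as.integer(charToRaw(s))
    any(b > 126L | (b < 32L & b != 9L))
}

con <- file(fn, open = "r")
chunk.size <- 2L         # something like 1e5 for a large file
offset <- 0L             # number of lines read so far
bad <- integer(0)        # line numbers that fail the check
repeat {
    lines <- readLines(con, n = chunk.size, warn = FALSE)
    if (!length(lines))
        break
    isbad <- vapply(lines, hasOddBytes, logical(1), USE.NAMES = FALSE)
    bad <- c(bad, offset + which(isbad))
    offset <- offset + length(lines)
}
close(con)
bad
```

Only one chunk is held in memory at a time, so the size of the file
does not matter; the flagged lines can then be inspected separately.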


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] save table to txt file in a printable form

2013-11-19 Thread Enrico Schumann
On Wed, 20 Nov 2013, Alaios ala...@yahoo.com writes:

 Hi there,
 I would like to save tabular (in R just matrices) in a txt file but in a way 
 that they would be somehow readable.
 That means keeping columns aligned and rows so one can read easily for 
 example the 2,3 element.

 IS there a way to do that in R?
 capture.output for example does not produce tables in the way I want to have 
 those.

Please provide an example that shows what you want to achieve, and why
capture.output does not work.

  A <- rnorm(12)
  dim(A) <- c(4,3)
  colnames(A) <- LETTERS[1:3]
  A

  #              A          B          C
  # [1,] 1.3883785  1.7264519 -0.1340838
  # [2,] 1.0601385 -0.2378076  0.2078081
  # [3,] 0.7278065 -0.1279776 -1.1674676
  # [4,] 1.3321629 -1.4082142  0.9145042

  capture.output(A)
  # [1] "             A          B          C"
  # [2] "[1,] 1.3883785  1.7264519 -0.1340838"
  # [3] "[2,] 1.0601385 -0.2378076  0.2078081"
  # [4] "[3,] 0.7278065 -0.1279776 -1.1674676"
  # [5] "[4,] 1.3321629 -1.4082142  0.9145042"

  # writeLines(capture.output(A), "matrix.txt") ## or similar



 Regards
 Alex
   [[alternative HTML version deleted]]

Please send plain-text emails.


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Yield to maturity in R

2013-10-31 Thread Enrico Schumann
On Wed, 30 Oct 2013, Katherine Gobin katherine_go...@yahoo.com writes:

 Dear R forum,

 Just want to know if there is any function / package in R which will 
 calculate Yield to Maturity in R for a given bond?

 Regards

 Katherine


require(sos)
findFn("yield to maturity")
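
For a plain fixed-coupon bond, the computation itself is a
one-dimensional root search and can be sketched in base R. The
functions below are made-up illustrations under simplifying
assumptions (annual coupons, whole years to maturity, no accrued
interest); they are not taken from any package.

```r
## price of a bond with annual coupons, given a yield y
bondPrice <- function(y, coupon, face, years) {
    t <- seq_len(years)
    sum(coupon / (1 + y)^t) + face / (1 + y)^years
}

## yield to maturity: the y at which the model price equals the market price
ytm <- function(price, coupon, face, years)
    uniroot(function(y) bondPrice(y, coupon, face, years) - price,
            interval = c(-0.5, 1), tol = 1e-10)$root

ytm(price = 100, coupon = 5, face = 100, years = 10)   ## par bond: 5%
ytm(price =  95, coupon = 5, face = 100, years = 10)   ## above 5%
```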


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Problem adding lines to a plot, when y is defined

2013-09-09 Thread Enrico Schumann
On Mon, 09 Sep 2013, Cech, Christian christian.c...@fh-vie.ac.at writes:

 Dear all,

 I want to create a line-plot with two lines and some additional scatter-plots.
 However, adding a line to the plot does not work when I specify the 
 y-argument in the plot command, while it does work when y is not specified.

 I first send you an example that works:
 plot(var[, 3],
  type="l",
  ylim=c(-0.04, 0),
  ylab   = 'portfolio returns',
  xlab   = 'time')
 lines(var[, 5], type="l", lty=3)
 for (i in 1:nrow(var)) {
   if(var[i, 4] | var[i,6])
 points(i, var[i, 2], pch=4)
 }

 -- the result is displayed in attachment Plot1.pdf

 What does not work is the following code, where as the first argument of plot 
 (ie the y-argument) is defined:
 plot(as.Date(var[, 1], origin="1899-12-30"),
  var[, 3],
  type="l",
  ylim=c(-0.04, 0),
  ylab='portfolio returns',
  xlab='time')
 lines(var[, 5], type="l", lty=3)
 for (i in 1:nrow(var)) {
   if(var[i, 4] | var[i,6])
 points(i, var[i, 2], pch=4)
 }

 -- the result is displayed in attachment Plot2.pdf

 The same problem appears if instead of

 as.Date(var[, 1], origin="1899-12-30")

 I use

 var[, 1]

 to define the y-argument.

Hard to say without a reproducible example, but try 

  lines(as.Date(var[, 1], origin="1899-12-30"),  var[, 5])

instead.
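
The reasoning: once the plot is set up with dates on the x-axis, every
later call to 'lines' or 'points' must supply x-values on that same
scale. A self-contained sketch with made-up data (the matrix 'var' from
the question is not available):

```r
## hypothetical data: Excel-style day counts and two series
serial <- 41000:41009
y1 <- -seq(0.010, 0.028, length.out = 10)
y2 <- -seq(0.020, 0.038, length.out = 10)
dates <- as.Date(serial, origin = "1899-12-30")

pdf(NULL)   # null device so that the sketch also runs non-interactively;
            # omit this line and dev.off() to draw on screen

plot(dates, y1, type = "l", ylim = c(-0.04, 0),
     xlab = "time", ylab = "portfolio returns")

## x must be on the same scale as in plot(): dates, not 1, 2, 3, ...
lines(dates, y2, lty = 3)
points(dates[c(3, 7)], y1[c(3, 7)], pch = 4)

dev.off()
```

With 'lines(y2)' alone, the x-values would be 1, 2, ..., far to the
left of the plotted date range, so nothing would be visible.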


 I do very much appreciate your help!
 Kind regards,
 Christian Cech


[...]


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Plotting zoo objects with chron axis

2013-08-09 Thread Enrico Schumann
On Fri, 09 Aug 2013, Manta mantin...@libero.it writes:

 Dear all,

 I have a problem which I'm not able to fix. I have the following two series:

 a=structure(c(33242.5196150509, 34905.8434338503, 38490.6957848689, 
 38747.0287172129, 38919.1028597142, 39026.3956586941, 38705.5344288997, 
 38545.6274379387, 38651.2079354205, 38748.2769580121), index =
 structure(c(14029, 
 14032, 14033, 14034, 14035, 14036, 14039, 14040, 14041, 14042
 ), class = "Date"), class = "zoo")

 b=structure(c(53337.7643740991, 52210.2079727035, 50235.4480363949, 
 50667.1147389469, 50796.5403152116, 51113.5420947436, 51003.3603311344, 
 50654.0539778796, 49927.5267060329, 49320.1813921822), index =
 structure(c(14029, 
 14032, 14033, 14034, 14035, 14036, 14039, 14040, 14041, 14042
 ), class = "Date"), class = "zoo")

 I want to plot them on the same chart, with the x-axis the Date, and the
 y-axis the time in format %H:%M. At the moment, the y series is expressed in
 seconds from midnight. I am confused with the as.POSIXct conversion.

 hours(min(rollrep))
 [1] 9
 minutes(min(rollrep))
 [1] 34
 seconds(min(rollrep))
 [1] 27

 however, by doing the following it seems the minimum time is 11:09, which is
 not true.

 as.POSIXct(round(min(rollrep)),origin=Sys.Date())
 [1] "2013-08-09 11:09:14 CEST"

 Do you have any advice? Is there a way to plot only the hours and minutes,
 without having to go to the Sys.Date?

 Thank you,
 Marco

Please provide a *reproducible* example.  (What is 'rollrep'?  'Seconds
from midnight' may be ambiguous when there was a change to/from
summertime or when time zones are relevant.)

Perhaps something like this:

  require(zoo)
  plot(merge(a, b), plot.type = "single", yaxt = "n")
  midnight <- ISOdatetime(2013, 1, 1, 0, 0, 0, "GMT")
  sq <- seq(0, 86400, by = 300)
  axis(2, at = sq, labels = format(midnight + sq, "%H:%M"))


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] cannot base64decode string which is base64encode in R

2013-08-05 Thread Enrico Schumann
On Mon, 05 Aug 2013, Qiang Wang uns...@gmail.com writes:

 On Sat, Aug 3, 2013 at 3:49 PM, Enrico Schumann 
 e...@enricoschumann.net wrote:

 On Fri, 02 Aug 2013, Qiang Wang uns...@gmail.com writes:

  Hi,
 
  I'm struggling with encode/decode strings in R. Don't know why the second
  example below would fail. Thanks in advance for your help.
  succeed: s <- "saf"; x <- base64encode(s); y <- base64decode(x, "character")
  fail: s <- "safs"; x <- base64encode(s); y <- base64decode(x, "character")
 

 And the first example works for you?

   require(base64enc)
   s <- "saf"
   x <- base64encode(s)

 ## Error in file(what, "rb") : cannot open the connection
 ## In addition: Warning message:
 ## In file(what, "rb") : cannot open file 'saf': No such file or directory

 ?base64encode says that its first argument is

 data to be encoded/decoded. For ‘base64encode’ it can be a raw
  vector, text connection or file name. For ‘base64decode’ it can be
  a string or a binary connection.

 Try this:

   rawToChar(base64decode(base64encode(charToRaw("saf"))))

 ## [1] "saf"

 --
 Enrico Schumann
 Lucerne, Switzerland
 http://enricoschumann.net


 Thanks for your reply!

 Sorry I did not clarify that I was using base64encode and base64decode
 functions provide from caTools package. It seems that if I convert the
 string to the raw type first, it still solves my problem.

 My original problem actually is that I have a string:
 secret <-
 '5Kwug+Byrq+ilULMz3IBD5tquNt5CcdYi3XPc8jnKwtXvIgHw/vcSGU1VCIo4b/OfcRDm7uH359syfhWzXFrNg=='

 It was claimed to be encoded in Base64. So I tried to decode it:

 require(base64enc)
 rawToChar(base64decode(secret))

 Then, I got
 "\xe4\xac.\x83\xe0r\xae\xaf\xa2\x95B\xcc\xcfr\001\017\x9bj\xb8\xdby\t\xc7X\x8bu\xcfs\xc8\xe7+\vW\xbc\x88\a\xc3\xfb\xdcHe5T\"(\xe1\xbf\xce}\xc4C\x9b\xbb\x87ߟl\xc9\xf8V\xcdqk6"

 But what I suppose to get is:
 '\xe4\xac.\x83\xe0r\xae\xaf\xa2\x95B\xcc\xcfr\x01\x0f\x9bj\xb8\xdby\t\xc7X\x8bu\xcfs\xc8\xe7+\x0bW\xbc\x88\x07\xc3\xfb\xdcHe5T"(\xe1\xbf\xce}\xc4C\x9b\xbb\x87\xdf\x9fl\xc9\xf8V\xcdqk6'

 Most part of the result is correct except several characters near the end.
 I don't know where the problem is.


See the help page of 'rawToChar': the function transforms raw bytes into
characters.  But, depending on your locale, one character may be more
than one byte.  On my computer, with a UTF-8 locale (see my
'?sessionInfo' below),

  rawToChar(base64decode(secret), TRUE)

gives me 

  ##  [1] "\xe4" "\xac" "."    "\x83" "\xe0" "r"    "\xae"
  ##  [8] "\xaf" "\xa2" "\x95" "B"    "\xcc" "\xcf" "r"
  ## [15] "\001" "\017" "\x9b" "j"    "\xb8" "\xdb" "y"
  ## [22] "\t"   "\xc7" "X"    "\x8b" "u"    "\xcf" "s"
  ## [29] "\xc8" "\xe7" "+"    "\v"   "W"    "\xbc" "\x88"
  ## [36] "\a"   "\xc3" "\xfb" "\xdc" "H"    "e"    "5"
  ## [43] "T"    "\""   "("    "\xe1" "\xbf" "\xce" "}"
  ## [50] "\xc4" "C"    "\x9b" "\xbb" "\x87" "\xdf" "\x9f"
  ## [57] "l"    "\xc9" "\xf8" "V"    "\xcd" "q"    "k"
  ## [64] "6"

That is, every *single* byte is converted into character.  For example:

  rawToChar(base64decode(secret), TRUE)[55:56]

gives 

  ## [1] "\xdf" "\x9f"

which probably is what you expected.  But if I paste those two
characters together,

  paste(rawToChar(base64decode(secret), TRUE)[55:56], collapse = "")

they will be shown like so:

  ## [1] "ߟ"

because this is how this byte pattern will be interpreted in UTF-8.
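
The effect can be reproduced in base R with just the two bytes in
question, without any Base64 package. A minimal sketch:

```r
b <- as.raw(c(0xdf, 0x9f))

## one string per byte
single <- rawToChar(b, multiple = TRUE)
length(single)                    ## 2

## one string made of both bytes
s <- rawToChar(b)
nchar(s, type = "bytes")          ## 2

## read as UTF-8, the two bytes encode a single code point, U+07DF
Encoding(s) <- "UTF-8"
utf8ToInt(s) == 0x07DF            ## TRUE

## the bytes themselves are unchanged
identical(charToRaw(s), b)        ## TRUE
```

So the decoded data are the same in both displays; only the
interpretation of the byte pair as one character differs.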




Abbreviated 'sessionInfo':

R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] cannot base64decode string which is base64encode in R

2013-08-03 Thread Enrico Schumann
On Fri, 02 Aug 2013, Qiang Wang uns...@gmail.com writes:

 Hi,

 I'm struggling with encode/decode strings in R. Don't know why the second
 example below would fail. Thanks in advance for your help.
 succeed: s <- "saf"; x <- base64encode(s); y <- base64decode(x, "character")
 fail: s <- "safs"; x <- base64encode(s); y <- base64decode(x, "character")


And the first example works for you?

  require(base64enc)
  s <- "saf"
  x <- base64encode(s)

## Error in file(what, "rb") : cannot open the connection
## In addition: Warning message:
## In file(what, "rb") : cannot open file 'saf': No such file or directory

?base64encode says that its first argument is

data to be encoded/decoded. For ‘base64encode’ it can be a raw
 vector, text connection or file name. For ‘base64decode’ it can be
 a string or a binary connection.

Try this:

  rawToChar(base64decode(base64encode(charToRaw("saf"))))

## [1] "saf"

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] using rollapply to calculate a moving sum or running sum?

2013-08-03 Thread Enrico Schumann
On Fri, 02 Aug 2013, Anika Masters anika.mast...@gmail.com writes:

 This is not critical, but I am curious to learn. Are there any
 suggestions for speeding up the process to calculate a moving row sum?
 (Ideally from within R, as opposed to using C, etc.)
 Using rollapply on a matrix of 45,000 rows and 400 columns takes 83 minutes.

 date()
 mymatrix <- matrix(data=1:45000, nrow=45000, ncol=400)
 temp <- t(rollapply(t(mymatrix), width=12, FUN=sum, by.column=T,
 fill=NA, partial=FALSE, align="left"))
 date()


Write a function that *quickly* computes the moving sum of a single row;
then loop over the rows.

Here is such a function, which is a slightly modified copy of the
function MA (moving average) in the NMOF package.

  rsum <- function (y, order, pad = NULL) {
      n <- length(y)
      ma <- cumsum(y)
      ma[order:n] <- ma[order:n] - c(0, ma[1L:(n - order)])
      if (!is.null(pad) && order > 1L) 
          ma[1L:(order - 1L)] <- pad
      ma
  }

The main 'trick' is the use of 'cumsum'.  Some speed comparisons are
given here

  require(NMOF)
  showExample("fastMA")

The original function computes a moving average but is right-aligned
(which I typically want for time series).  Since in your example you use
'align = "left"', I flip the columns of your matrix left to right.

Your example (increase the size by increasing rows/cols):

  require(zoo)

  ## data
  rows <- 450
  cols <- 40
  mymatrix <- matrix(data = rnorm(rows*cols), nrow = rows, ncol = cols)
  
  ## with rollapply
  temp <- t(rollapply(t(mymatrix), width=12, FUN=sum, by.column=T,
                      fill=NA, partial=FALSE, align="left"))
  
  ## with rsum
  answer <- array(NA, dim = dim(mymatrix))
  flipped <- mymatrix[ ,cols:1]
  for (i in seq_len(rows))
      answer[i, ] <- rsum(flipped[i, ], 12, NA)
  all.equal(unname(temp), answer[ ,cols:1])
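
The flipping can be avoided by writing the cumsum difference directly
in left-aligned form. The function 'rollsumLeft' below is a made-up
variant for illustration, not part of NMOF or zoo.

```r
## left-aligned moving sum of width w: x[i] + ... + x[i + w - 1]
rollsumLeft <- function(x, w) {
    n <- length(x)
    cs <- c(0, cumsum(x))                 # cs[i + 1] = x[1] + ... + x[i]
    out <- rep(NA_real_, n)
    if (n >= w)
        out[1:(n - w + 1)] <- cs[(w + 1):(n + 1)] - cs[1:(n - w + 1)]
    out
}

x <- c(1, 2, 3, 4, 5, 6)
rollsumLeft(x, 3)      ## 6 9 12 15 NA NA
```

For a matrix, 't(apply(mymatrix, 1, rollsumLeft, w = 12))' applies it
to every row. Since the sums are differences of cumulative sums, small
floating-point discrepancies against a direct summation are possible
for long series.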




Regards,
Enrico



 On Thu, Jun 27, 2013 at 2:41 PM, arun smartpink...@yahoo.com wrote:
 Hi,
 Try:
 library(zoo)

 rollapply(t(mymatrix),width=12,FUN=sum,by.column=T,fill=NA,partial=FALSE,align="left")
  # [,1] [,2] [,3] [,4] [,5]
  #[1,]  342  354  366  378  390
  #[2,]  402  414  426  438  450
  #[3,]  462  474  486  498  510
  #[4,]  522  534  546  558  570
  #[5,]  582  594  606  618  630
  #[6,]  642  654  666  678  690
  #[7,]  702  714  726  738  750
  #[8,]  762  774  786  798  810
  #[9,]  822  834  846  858  870
 #[10,]   NA   NA   NA   NA   NA
 #[11,]   NA   NA   NA   NA   NA
 #[12,]   NA   NA   NA   NA   NA
 #[13,]   NA   NA   NA   NA   NA
 #[14,]   NA   NA   NA   NA   NA
 #[15,]   NA   NA   NA   NA   NA
 #[16,]   NA   NA   NA   NA   NA
 #[17,]   NA   NA   NA   NA   NA
 #[18,]   NA   NA   NA   NA   NA
 #[19,]   NA   NA   NA   NA   NA
 #[20,]   NA   NA   NA   NA   NA
 A.K.



 - Original Message -
 From: Anika Masters anika.mast...@gmail.com
 To: R help r-help@r-project.org
 Cc:
 Sent: Thursday, June 27, 2013 3:00 PM
 Subject: [R] using rollapply to calculate a moving sum or running sum?

 #using rollapply to calculate a moving sum or running sum?

 #I am trying to use rollapply to calculate a moving sum? #I tried
 rollapply and get the error message
 #Error in seq.default(start.at, NROW(data), by = by) :
 #  wrong sign in 'by' argument

 #example:

 mymatrix <- ( matrix(data=1:100, nrow=5, ncol=20) )
 mymatrix_cumsum  <- ( matrix(data=NA, nrow=5, ncol=20) )
 w=12
 for(i in 1: (ncol(mymatrix)-w+1) ) {
 mymatrix_cumsum[ , i]  <- apply(X=mymatrix[, i:(i+w-1)] , MARGIN=1,
 FUN=sum, na.rm=T)
 }

 #How might I use the rollapply function instead?

 rollapply(mymatrix, 12, sum)

 rollapply(data = mymatrix, width = 12, FUN=sum, by.column =T, fill =
 NA, partial = FALSE, align = "left" )


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] optimize integer function parameters

2013-07-23 Thread Enrico Schumann
On Tue, 23 Jul 2013, Christof Kluß ckl...@email.uni-kiel.de writes:


 I have observations obs <- c(11455, 11536, 11582, 11825, 11900,  ...)

 and a simulation function f(A,B,C,D,E,F), so sim <- f(A,B,C,D,E,F)

 e.g. sim = c(11464, 11554, 11603, 11831, 11907, ...)

 now I would like to fit A,B,C,D,E,F such that obs and f(A,B,C,D,E,F)
 match as well as possible. A,..,F should be integers and have bounds.

 How would you solve this problem without bruteforce in an acceptable time?



That depends on what your simulation function looks like.  Could you
post a (small) self-contained example?


-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] optimize integer function parameters

2013-07-23 Thread Enrico Schumann
On Tue, 23 Jul 2013, Christof Kluß ckl...@email.uni-kiel.de writes:

 Am 23-07-2013 13:20, schrieb Enrico Schumann:

 On Tue, 23 Jul 2013, Christof Kluß ckl...@email.uni-kiel.de writes:


 I have observations obs <- c(11455, 11536, 11582, 11825, 11900,  ...)

 and a simulation function f(A,B,C,D,E,F), so sim <- f(A,B,C,D,E,F)

 e.g. sim = c(11464, 11554, 11603, 11831, 11907, ...)

 now I would like to fit A,B,C,D,E,F such that obs and f(A,B,C,D,E,F)
 match as well as possible. A,..,F should be integers and have bounds.

 How would you solve this problem without bruteforce in an acceptable time?



 That depends on what your simulation function looks like.  Could you
 post a (small) self-contained example?



 the integer values in the vectors sim and obs are dates

 when I set sim <- f(TS0,TS1,TS2,TB0,TB1,TB2) my A,..,F below
 then TS0 and TB0 are depend (and so on)


 the main thing in f(...) is something like

  for (i in c(1:length(temperature))) {
    Temp <- temperature[i]
    if (DS < 0) {
      DS <- DS + max(Temp-TB0,0) / TS0
    } else if (DS < 1) {
      ... date0 <- i
      DS <- DS + max(Temp-TB1,0) / TS1
    } else if (DS < 2) {
      ... date1 <- i
      DS <- DS + max(Temp-TB2,0) / TS2
    } else {
      ... date2 <- i
      break
    }
  }


 this produced a vector sim = c(date0,date1,date2,...)


 now I would like to minimize RMSE(sim,obs) or something like that

 thx
 Christof



 for brute force I would do something like

 obs <- ...
 act <- 1000

   for (TS0 in seq(50,100,10))
 for (TS1 in seq(750,850,10))
   for (TS2 in seq(400,600,10))
 for (TB0 in c(5:7))
   for (TB1 in c(5:7))
 for (TB2 in c(4:9)) {
   sim - foosim(dat,TS0,TS1,TS2,TB0,TB1,TB2)
   rmse - sqrt(mean((sim - obs)^2, na.rm = TRUE))

   if (rmse  act) {
 print(paste(rmse,TS0,TS1,TS2,TB0,TB1,TB2))
 act - rmse
   }
 }


Sorry, but that is not what I meant by a (small) self-contained
example. 

In any case, if brute force is feasible -- ie, your function can be
evaluated quickly enough and there are no further parameters -- then why
not do brute force?  (In the NMOF package there is a function
'gridSearch' that allows you to distribute the function evaluations.
Disclosure: I am the package author.)

If that does not work, you might try a Local Search or one of its
variants (see for instance 'LSopt' in the NMOF package).  But it is
difficult to say anything specific without knowing your actual function.
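
A minimal sketch of the brute-force idea, using only base R's expand.grid
rather than NMOF's gridSearch. The simulation function was never posted, so
toy_sim below is a hypothetical stand-in with just two integer parameters
(TS and TB), and the temperature series and parameter ranges are made up for
illustration.

```r
## Hypothetical stand-in for the real simulation: returns the first index
## at which the accumulated development sum reaches 1.
toy_sim <- function(temperature, TS, TB) {
    DS <- cumsum(pmax(temperature - TB, 0) / TS)
    which(DS >= 1)[1]
}

## made-up temperature series and "observed" date
temperature <- rep(c(4, 8, 12, 16, 20), times = 40)
obs <- toy_sim(temperature, TS = 80, TB = 6)

## all combinations of the integer parameters within their bounds
grid <- expand.grid(TS = seq(50, 100, 10), TB = 5:7)

## evaluate the objective (RMSE) once per grid point
grid$rmse <- mapply(function(TS, TB) {
    sim <- toy_sim(temperature, TS, TB)
    sqrt(mean((sim - obs)^2, na.rm = TRUE))
}, grid$TS, grid$TB)

## first grid point with the smallest error
best <- grid[which.min(grid$rmse), ]
best
```

Several parameter combinations may reproduce the observations equally well,
in which case which.min simply reports the first one; with six parameters the
grid grows multiplicatively, which is where distributing the evaluations or
switching to a local search becomes attractive.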

Regards,
Enrico

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] modify timestemp

2013-07-04 Thread Enrico Schumann
On Wed, 03 Jul 2013, Ye Lin <ye...@lbl.gov> writes:

> Hey All,
>
> I want to standardize my timestamp which is formatted as hh:mm:ss
>
> My data looks like this:
>
>       Date     Time
> 01/01/2013 00:09:01
> 01/02/2013 00:10:14
> 01/03/2013 00:11:27
> 01/04/2013 00:12:40
> 01/05/2013 00:13:53
> 01/06/2013 00:15:06
> 01/07/2013 00:16:19
> 01/08/2013 00:17:32
> 01/09/2013 00:18:45
> 01/10/2013 00:19:58
>
> Dataset <- structure(list(Date = c("01/01/2013", "01/02/2013",
> "01/03/2013",
> "01/04/2013", "01/05/2013", "01/06/2013", "01/07/2013", "01/08/2013",
> "01/09/2013", "01/10/2013"), Time = c("00:09:01", "00:10:14",
> "00:11:27", "00:12:40", "00:13:53", "00:15:06", "00:16:19", "00:17:32",
> "00:18:45", "00:19:58")), .Names = c("Date", "Time"), class = "data.frame",
> row.names = c(NA,
> -10L))
>
> I would like to change all the records in Time column uniformed as
> hh:mm:00, then the output would be this:
>
>       Date     Time
> 01/01/2013 00:09:00
> 01/02/2013 00:10:00
> 01/03/2013 00:11:00
> 01/04/2013 00:12:00
> 01/05/2013 00:13:00
> 01/06/2013 00:15:00
> 01/07/2013 00:16:00
> 01/08/2013 00:17:00
> 01/09/2013 00:18:00
> 01/10/2013 00:19:00
>
> Thanks for your help!

Since your dates and times are character vectors, you can do this:

  substr(Dataset$Time, 7, 8) <- "00"

If you want to treat them as actual dates and times, you need to convert
them into one of R's date/time classes (see ?DateTimeClasses):

  timestamp <- as.POSIXlt(paste(Dataset$Date, Dataset$Time),
                          format = "%m/%d/%Y %H:%M:%S")

  timestamp$sec

  ## [1]  1 14 27 40 53  6 19 32 45 58

  timestamp$sec <- 0
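
As a self-contained variant of the date/time route, the seconds can also be
dropped with trunc() and the result written back as text. This is only a
sketch: it uses the first three rows of the posted data, and the fixed time
zone "UTC" is an assumption made so that the result does not depend on the
local settings.

```r
## first three rows of the posted data
Dataset <- data.frame(
    Date = c("01/01/2013", "01/02/2013", "01/03/2013"),
    Time = c("00:09:01", "00:10:14", "00:11:27"),
    stringsAsFactors = FALSE)

## parse date and time together (time zone fixed for reproducibility)
timestamp <- as.POSIXct(paste(Dataset$Date, Dataset$Time),
                        format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

## truncate to whole minutes, then format back to hh:mm:ss
Dataset$Time <- format(trunc(timestamp, units = "mins"),
                       format = "%H:%M:%S")
Dataset
```

Unlike the substr() replacement, this checks along the way that every entry
really parses as a date and time; entries that do not parse become NA.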



>   [[alternative HTML version deleted]]


Please send only plain-text mails to this list.



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] return output to console for copying as input

2013-06-25 Thread Enrico Schumann
On Tue, 25 Jun 2013, nevil amos <nevil.a...@gmail.com> writes:

> I want to print a vector of strings to the console formatted as if it were
> input
>
> X <- c("a","b","c")
> X
> [1] "a" "b" "c"
>
> what I would like to get is
>
> the.function(X)
> "a","b","c"
>
> what is the function?


the.function <- function(x)
    cat(paste0("\"", x, "\"", collapse = ", "), "\n")
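
For comparison, base R's dput() prints an object in a form that can be pasted
back as input, including the surrounding c(...). The sketch below shows it
next to a small helper; the name as_input is hypothetical, and the helper
only handles strings that do not themselves contain double quotes.

```r
x <- c("a", "b", "c")

## hypothetical helper: quoted, comma-separated elements on one line
as_input <- function(x)
    cat(paste0("\"", x, "\"", collapse = ", "), "\n", sep = "")

as_input(x)

## base R: deparsed representation, ready to be copied back into code
dput(x)
```

dput() also works for other object types (numeric vectors, lists, data
frames), which makes it the usual choice for posting reproducible examples.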



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net



Re: [R] Time of day

2013-06-25 Thread Enrico Schumann
On Tue, 25 Jun 2013, Rguy <r...@123mail.org> writes:

> Is there a simple way to obtain the time of day in R? I want the time of
> day for computational purposes, not for display. I want to be able to
> create code like the following:
>
> if (time_of_day >= 22:00 && time_of_day <= 06:00) then X otherwise Y
                     ^                       ^
this will not work: the times need to be something that R understands,
eg, numeric values such as 2200 or character values such as "22:00"


> I realize I could parse a date/time object and extract the time, but
> hopefully other people have already done this, or there is a
> straightforward representation of time of day in R that I  have not been
> able to find in the documentation.

strftime(Sys.time(), "%H:%M")
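
For computation rather than display, one option is to turn the time of day
into a number, for instance minutes since midnight. The following is a
sketch only; the function names minutes_of_day and is_night are made up for
illustration, and the 22:00 to 06:00 window is taken from the question.

```r
## minutes since midnight for a date-time (defaults to the current time);
## tz = "" means the current time zone
minutes_of_day <- function(t = Sys.time(), tz = "") {
    lt <- as.POSIXlt(t, tz = tz)
    lt$hour * 60 + lt$min
}

## TRUE from 22:00 to 06:00; the window wraps past midnight,
## so the two conditions are joined with '||', not '&&'
is_night <- function(t = Sys.time(), tz = "") {
    m <- minutes_of_day(t, tz)
    m >= 22 * 60 || m <= 6 * 60
}

if (is_night()) "X" else "Y"
```

Character values such as "22:00" from strftime() also compare correctly with
<= and >= as long as hours and minutes are zero-padded, but a numeric
representation makes arithmetic (differences, offsets) straightforward.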

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net


