Re: [R] Integer Sample with Mean Dependent on Size

2019-08-01 Thread Lorenzo Isella
Hello, On Thu, Aug 01, 2019 at 11:17:30PM +1200, Richard O'Keefe wrote: 2(N-1)/N = 2 - 2/N. So one way to get exactly that mean is to make all the numbers 2 except for two of them which are 1. N < 2 : can't be done. N = 2 : only [1,1] does the job. N = 3 : the sum of the three numbers must be
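
The construction described in the reply above (all values equal to 2, except two values equal to 1) can be sketched in R as follows. The function names `fixed_sample` and `random_sample` are hypothetical, and the randomised variant is an assumption that goes beyond what the thread states: it starts from N ones and scatters the remaining N - 2 units at random, which keeps the sum at 2(N-1).

```r
# Deterministic solution from the reply: two 1s and N - 2 twos.
fixed_sample <- function(N) {
  stopifnot(N >= 2)
  c(1L, 1L, rep(2L, N - 2))
}

# Hypothetical randomised variant: start from N ones and drop the
# remaining N - 2 units into randomly chosen positions.
random_sample <- function(N) {
  stopifnot(N >= 2)
  extra <- sample.int(N, size = N - 2, replace = TRUE)
  1L + tabulate(extra, nbins = N)
}

N <- 10
x <- random_sample(N)
sum(x) == 2 * (N - 1)  # if TRUE, the mean is exactly 2 * (N - 1) / N
```

Checking the sum rather than the mean avoids comparing floating-point numbers for equality.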

Re: [R] Integer Sample with Mean Dependent on Size

2019-08-01 Thread Lorenzo Isella
212 gerrit.eich...@math.uni-giessen.de Justus-Liebig-University Giessen Tel: +49-(0)641-99-32104 Arndtstr. 2, 35392 Giessen, Germany http://www.uni-giessen.de/eichner - On 01.08.2019 at 12:27, Lorenzo Isella wrote: Dea

[R] Integer Sample with Mean Dependent on Size

2019-08-01 Thread Lorenzo Isella
Dear All, I cannot unfortunately provide any R code, otherwise I would not need to post this in the first place. I need to generate a sample of N positive non-zero integers such that their mean is *exactly* 2(N-1)/N, i.e. the mean depends on the length of the sample. For a start, we can assume

Re: [R] [R-sig-Debian] Curl4, Quantmod, tseries and forecast

2019-07-09 Thread Lorenzo Isella
Thanks, that fixed the issue! L. On Tue, Jul 09, 2019 at 01:41:39PM +0200, Ralf Stubner wrote: Hi Lorenzo I reordered the quote slightly: On Tue, Jul 9, 2019 at 1:30 PM Lorenzo Isella wrote: On Sun, Jul 07, 2019 at 03:16:20PM +0200, Ralf Stubner wrote: >Did you reinstall the curl pack

Re: [R] [R-sig-Debian] Curl4, Quantmod, tseries and forecast

2019-07-09 Thread Lorenzo Isella
b on Sun, 7 July 2019 at 14:16: Hi Lorenzo, On Sun, Jul 7, 2019 at 6:42 AM Lorenzo Isella wrote: > ** byte-compile and prepare package for lazy loading > Error in dyn.load(file, DLLpath = DLLpath, ...) : > unable to load shared object '/usr/local/lib/R/site-library/curl/libs/curl.so':

[R] Curl4, Quantmod, tseries and forecast

2019-07-07 Thread Lorenzo Isella
Dear All, I have just upgraded to Debian stable 10 and rebuilt most of the R packages. I use the R backported packages from here https://cran.r-project.org/bin/linux/debian/#debian-buster-testing for the core system. I encounter some issues when updating quantmod, tseries and forecast. For

Re: [R] Sequential Filtering of a Data Set

2019-04-27 Thread Lorenzo Isella
rradas At 15:51 on 26/04/19, Lorenzo Isella wrote: Dear All, I must be drowning in a glass of water. Consider the following data set tt2<-structure(list(year = c(2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018), country = c(

[R] Sequential Filtering of a Data Set

2019-04-26 Thread Lorenzo Isella
Dear All, I must be drowning in a glass of water. Consider the following data set tt2<-structure(list(year = c(2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018), country = c("DE", "DE", "DE", "DE", "DE", "DE", "DE", "DE", "DE",

[R] Multiple Lags with Dplyr

2019-04-23 Thread Lorenzo Isella
Dear All, I refer to the excellent post at https://purrple.cat/blog/2018/03/02/multiple-lags-with-tidy-evaluation/ What I want to do is to create a function capable, à la dplyr, of generating new columns which are lagged versions of existing columns in a data frame. For instance, you can do this

Re: [R] Purr and Basic Functional Programming Tasks

2019-01-27 Thread Lorenzo Isella
:45 AM Lorenzo Isella wrote: Dear All, I am making my baby steps with the tidyverse purrr package and I am stuck with some probably trivial tasks. Consider the following data set zz<-list(structure(list(year = c(2000, 2001, 2002, 2003, 2000, 2001, 2002, 2003, 2000, 2001, 2002, 2003), tot

[R] Purr and Basic Functional Programming Tasks

2019-01-25 Thread Lorenzo Isella
Dear All, I am making my baby steps with the tidyverse purrr package and I am stuck with some probably trivial tasks. Consider the following data set zz<-list(structure(list(year = c(2000, 2001, 2002, 2003, 2000, 2001, 2002, 2003, 2000, 2001, 2002, 2003), tot_i = c(22393349.081, 23000574.372,

[R] Optimisation with Normalisation Constraint

2018-06-20 Thread Lorenzo Isella
Dear All, I have a problem I have been struggling with for a while: I need to carry out a non-linear fit (and this is the easy part). I have a set of discrete values {x1,x2...xN} and the corresponding {y1, y2...yN}. The difficulty is that I would like the linear fit to preserve the sum of the

[R] Reasons to Use R in a Public Administrations and Ideas for a Short Training

2018-04-18 Thread Lorenzo Isella
Dear All, Ages ago I posted to this mailing list asking for advice about how to evangelize the use of R in an international public administration where the fact that R is free is not a decisive factor (actually its being "freeware" may even be seen negatively). After a long time, I think it is

[R] Fitting Beta Distribution

2017-12-21 Thread Lorenzo Isella
esults are consistent. -- Forwarded message -- From: Lorenzo Isella <lorenzo.ise...@yopmail.com> Date: 21 December 2017 at 11:29 Subject: Fitting Beta Distribution To: "r-help@r-project.org" <r-help@r-project.org> Dear All, I need to fit a custo

[R] Fitting Beta Distribution

2017-12-21 Thread Lorenzo Isella
Dear All, I need to fit a custom probability density (based on the symmetric beta distribution B(shape, shape), where the two parameters shape1 and shape2 are identical) to my data. The trouble is that I experience some problems also when dealing with the plain vanilla symmetric beta distribution.

[R] Fitdistrplus and Custom Probability Density

2017-11-07 Thread Lorenzo Isella
Dear All, Apologies for not providing a reproducible example, but if I could, then I would be able to answer my question myself. Essentially, I am trying to fit a very complicated custom probability distribution to some data. Fitdistrplus does in principle everything which I need, but if require

[R] Access Data Base Reading on Linux Platform

2017-07-31 Thread Lorenzo Isella
Dear All, I am really far from a database expert (I do prefer flat files as long as that is reasonable), but I have to deal with an accdb database (Microsoft Access new format). It all stems from the fact that I run R almost exclusively on Debian platforms. I did a bit of googling

[R] rJava Broken on Linux + R 3.4

2017-06-25 Thread Lorenzo Isella
Dear All, I think there is something wrong with rJava on any Debian based distribution. I may be wrong, but I am experiencing exactly the problems mentioned at https://github.com/amattioc/SDMX/issues/130 and at https://github.com/s-u/rJava/issues/110 A couple of packages (RJSDMX and xlsx) are

[R] Segmentation Fault when Installing Some Packages

2017-06-21 Thread Lorenzo Isella
Dear All, I have a fresh Debian Stretch (now the official Debian stable) installation on my machine. Based on what is written here https://cran.r-project.org/bin/linux/debian/#debian-jessie-stable I added the line deb https://stat.ethz.ch/CRAN/bin/linux/debian stretch-cran34/ to my sources in

[R] Changing Color of Selected Column Names in Corrplot

2017-06-16 Thread Lorenzo Isella
Dear All, Please consider the following example library(corrplot) M <- cor(mtcars) corrplot(M, method="circle", type="lower", diag=F) Suppose that I want to have the label "mpg" at the top in black and leave everything else in red. How can I achieve that? Cheers Lorenzo

Re: [R] Spacing Between Elements in Lattice Legend

2017-05-29 Thread Lorenzo Isella
.text=3) If this is *NOT* what you want, do cc the list in any reply. Oh, and incidentally, your code mistakenly had the "main" argument replicated. Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it."

[R] Spacing Between Elements in Lattice Legend

2017-05-28 Thread Lorenzo Isella
Dear All, Please consider the short code at the end of the email. It generates a barchart where everything is as I want, apart from some minor tuning of the legend. I can control the spacing between the text in the two rows of the legend, but how do I force some separation between the red and the

[R] Problem with Example with Lattice

2017-05-27 Thread Lorenzo Isella
Dear All, I am making my baby steps with the lattice graphic system. I am going through the great book by Sarkar which provides plenty of examples. However, I notice that some of them appear to give a different result on my system. For instance, consider the following library(lattice)

[R] Fitdistrplus and Parameter Constraints

2017-05-23 Thread Lorenzo Isella
Dear All, In principle it is a simple question, but I have no idea how to tackle it. Suppose you have a distribution depending on two parameters, e.g. beta(a,b). For some reason, you want to impose that the two parameters of the beta distribution are identical, i.e. you want to fit your data to
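
One way to picture the constraint described above is to write the likelihood of beta(a, a) with a single free parameter and maximise it directly. The sketch below is an assumption, not the approach taken in the thread: it uses base R's `optimize` rather than fitdistrplus, and the data are simulated with a made-up true shape of 3.

```r
set.seed(42)
x <- rbeta(2000, shape1 = 3, shape2 = 3)  # simulated data, true shape = 3

# Negative log-likelihood of beta(a, a): the same value a is passed as
# both shape parameters, so only one parameter is estimated.
nll <- function(a) -sum(dbeta(x, shape1 = a, shape2 = a, log = TRUE))

# One-dimensional minimisation over an assumed search interval.
fit <- optimize(nll, interval = c(0.01, 100))
fit$minimum  # maximum-likelihood estimate of the common shape
```

The same idea should carry over to fitdistrplus by defining a one-parameter density wrapper around `dbeta`, although that route is not shown here.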

[R] Windowing Time Series in The Past

2017-04-01 Thread Lorenzo Isella
Dear All, I am sure the solution is a one liner, but I am struggling a bit. Given a time series which starts at a given time t_ini, I would like to set an initial start time farther back in the past and have NA before t_ini (I need this to align different time series). For instance
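
A minimal sketch of the padding described above, assuming a quarterly series with made-up values: `window()` from the stats package accepts `extend = TRUE`, which allows the requested start to lie before the first observation and fills the gap with NA.

```r
# Quarterly series starting in 2010 Q1 (values are illustrative only).
x <- ts(c(5, 7, 6, 8), start = c(2010, 1), frequency = 4)

# Move the start back to 2009 Q1; the four new quarters become NA.
x_padded <- window(x, start = c(2009, 1), extend = TRUE)
x_padded
```

Without `extend = TRUE`, `window()` only warns and leaves the start unchanged.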

Re: [R] GADM -- Download World Map for R

2017-03-15 Thread Lorenzo Isella
, 2017 at 1:57 AM, Lorenzo Isella <lorenzo.ise...@gmail.com> wrote: Dear All, Please have a look at the snippet here http://bit.ly/2mVS8me This short code addresses precisely one of my needs: to superimpose a network (created with the igraph library) to a geographical map. Unlike th

[R] GADM -- Download World Map for R

2017-03-14 Thread Lorenzo Isella
Dear All, Please have a look at the snippet here http://bit.ly/2mVS8me This short code addresses precisely one of my needs: to superimpose a network (created with the igraph library) to a geographical map. Unlike the case of the example, where a single country is enough, I need to have a world

Re: [R] Transport and Earth Mover's Distance

2017-03-07 Thread Lorenzo Isella
Dear Dominic, Thanks a lot for the quick reply. Just a few questions to make sure I got it all right (I now understand that transport and spatstat in particular can do much more than I need right now). Essentially I am after the Wasserstein distance between univariate distributions (and it would

[R] Earth Mover's Distance

2017-03-07 Thread Lorenzo Isella
Dear All, From time to time I need to resort to the calculation of the earth mover's distance (see https://en.wikipedia.org/wiki/Earth_mover's_distance and https://en.wikipedia.org/wiki/Wasserstein_metric ). In the past I used the package https://r-forge.r-project.org/projects/earthmovdist/

[R] Question about Cubist Model

2017-01-12 Thread Lorenzo Isella
Dear All, I am fine tuning a Cubist model (see https://cran.r-project.org/web/packages/Cubist/index.html). I am a bit puzzled by its output. On a dataset which contains 275 cases, I get non mutually exclusive rules. E.g., in the output below, rules 2 and 3 cover all the 275 cases of the data set

[R] Discarding Models in Caret During Model Training

2016-11-14 Thread Lorenzo Isella
Dear All, Maybe some of you have come across this problem. Let's say that you use caret for hyperparameter tuning. You train several models and you then select the best performing one according to some performance metric. My problem is that, sometimes, I would like to tune really many models (in

[R] Minimum Binding Box in 3D

2016-09-13 Thread Lorenzo Isella
Dear All, I would like to know if anybody is aware of an R implementation of a minimal bounding box ( http://bit.ly/2cKaSgT ) algorithm for points in 3 dimensions. It looks like there is no shortage of implementations in 2 dimensions, e.g. http://bit.ly/2cKaSh0 http://bit.ly/2cKboLS but I

Re: [R] DEA -- Extract the Frontier and ggplot2

2016-08-02 Thread Lorenzo Isella
ssional-resources-home/knowledge-hub-evidence-statistics/ From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Lorenzo Isella Sent: 02 August 2016 09:05 To: r-help@r-project.org Cc: ggpl...@googlegroups.com Subject: [R] DEA -- Extract the Frontier and ggplot2 Dear All, Please consider

[R] DEA -- Extract the Frontier and ggplot2

2016-08-02 Thread Lorenzo Isella
Dear All, Please consider the code at the end of the email. Everything is fine in this little example, except that I do not know how to extract the DEA frontier (solid line in the plot). The reason is that I want to reproduce a more complicated DEA frontier plot using ggplot2 and I need to understand

Re: [R] Visualization of a Convex Hull in R (Possibly with RGL)

2016-07-26 Thread Lorenzo Isella
On Tue, Jul 26, 2016 at 10:48:22AM +, Michael Sumner wrote: On Tue, 26 Jul 2016 at 20:29 Lorenzo Isella <lorenzo.ise...@gmail.com> wrote: Dear All, I am not an expert about the calculation and visualization of convex hulls, but I am trying to do something relatively simple. Please co

[R] Visualization of a Convex Hull in R (Possibly with RGL)

2016-07-26 Thread Lorenzo Isella
Dear All, I am not an expert about the calculation and visualization of convex hulls, but I am trying to do something relatively simple. Please consider the snippet at the end of the email. The array pts represents the position of (the centres of) a set of spheres in 3D (whose radius is 0.5). I

[R] Trouble with (Very) Simple Clustering

2016-06-06 Thread Lorenzo Isella
Dear All, I am doing something extremely basic (and I do not claim at all there is no other way to achieve the same): I have a list of numbers and I would like to split them up into clusters. This is what I do: I see each number as a 1D vector and I calculate the euclidean distance between them. I
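
The procedure described above (treat each number as a 1D point, compute Euclidean distances, then group) can be sketched with base R's hierarchical clustering. The numbers, the linkage method and the cut height `h = 1` are assumptions chosen for illustration; the snippet does not say which clustering step the poster actually used.

```r
# Illustrative 1D data with three visually obvious groups.
x <- c(1.0, 1.1, 1.2, 5.0, 5.1, 9.0)

d  <- dist(x)                          # pairwise Euclidean distances
hc <- hclust(d, method = "complete")   # agglomerative clustering

# Cut the tree at height 1: points closer than 1 end up together.
groups <- cutree(hc, h = 1)
split(x, groups)
```

Cutting by height (`h`) rather than by a fixed number of clusters (`k`) lets the distance threshold decide how many groups appear.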

Re: [R] Getting Rid of NaN in ts Object

2016-05-27 Thread Lorenzo Isella
Perfect! Exactly what I was looking for. Thanks Lorenzo On Fri, May 27, 2016 at 01:50:03PM +0200, Christian Brandstätter wrote: Hi Lorenzo, Try: tt[is.nan(tt)] <- NA tt <- na.omit(tt) Best, Christian On 27.05.2016 at 13:38, Lorenzo Isella wrote: On Fri, May 27, 2016 at 08:46:20PM
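
The two-line answer quoted above can be tried on a small stand-in series. The quarterly frequency, the start date and the last value are assumptions made for illustration; only the first two non-missing values come from the thread.

```r
# Short stand-in for the series in the thread: leading NaN, then data.
tt <- ts(c(NaN, NaN, NaN, 1133.09, 1155.77, 1170.20),
         start = c(2000, 1), frequency = 4)

tt[is.nan(tt)] <- NA   # turn NaN into NA, as suggested in the reply
tt <- na.omit(tt)      # drops leading/trailing NA and shifts the start

start(tt)
length(tt)
```

Note that `na.omit()` on a `ts` object only trims missing values at the two ends; as far as I know it stops with an error if NA values remain in the interior of the series.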

Re: [R] Getting Rid of NaN in ts Object

2016-05-27 Thread Lorenzo Isella
On Fri, May 27, 2016 at 8:14 PM, Lorenzo Isella <lorenzo.ise...@gmail.com> wrote: Dear All, I am sure the answer is a one liner, but I am banging my head against the wall and googling here and there has not helped much. Consider the following time series tt<-structure(c(NaN, NaN,

[R] Getting Rid of NaN in ts Object

2016-05-27 Thread Lorenzo Isella
Dear All, I am sure the answer is a one liner, but I am banging my head against the wall and googling here and there has not helped much. Consider the following time series tt<-structure(c(NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 1133.09, 1155.77,

[R] Bootstrap Methods for Confidence Intervals -- glmnet

2016-05-12 Thread Lorenzo Isella
Dear All, Please have a look at the code at the end of the email. It is just an example of regression based on glmnet with some artificial data. My question is how I can evaluate the uncertainty of the prediction yhat. It looks like there are some reasons for not providing a standard error

Re: [R] Problem with X11

2016-04-20 Thread Lorenzo Isella
, 2016 at 11:23 AM, Lorenzo Isella <lorenzo.ise...@gmail.com> wrote: Dear All, I have never had this problem before. I run debian testing on my box and I have recently updated my R environment. Now, see what happens when I try the most trivial of all plots plot(seq(22)) Error in (fu

[R] Problem with X11

2016-04-19 Thread Lorenzo Isella
Dear All, I have never had this problem before. I run debian testing on my box and I have recently updated my R environment. Now, see what happens when I try the most trivial of all plots plot(seq(22)) Error in (function (display = "", width, height, pointsize, gamma, bg, : X11 module cannot

[R] Total Least Squares Regression

2016-03-29 Thread Lorenzo Isella
Dear All, I am looking for an R package to handle total least squares regression (TLS). See http://bit.ly/1pSf4Bg I am in a situation in which I have errors in both the independent variables X (plural because I have several predictors) and the dependent variable y. I found several discussion

[R] Orthogonal Nonlinear Least-Squares Regression in R

2016-03-19 Thread Lorenzo Isella
Dear All, I am trying my hand at orthogonal least squares regression. Have a look for instance at http://bit.ly/1pB2aHX https://cran.r-project.org/web/packages/onls/index.html http://bit.ly/1XDkkTL docs.scipy.org/doc/external/odrpack_guide.pdf However, I am experiencing some problems with a

[R] Linear Model and Missing Data in Predictors

2016-03-15 Thread Lorenzo Isella
Dear All, A situation that for sure happens very often: suppose you are in the following situation set.seed(1235) x1 <- seq(30) x2 <- c(rep(NA, 9), rnorm(19)+9, c(NA, NA)) x3 <- c(rnorm(17)-2, rep(NA, 13)) y <- exp(seq(1,5, length=30)) mm<-lm(y~x1+x2+x3) i.e. you try a simple linear
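
To make the effect of the missing predictors concrete, the snippet above can be run and the number of rows actually used by `lm()` inspected. This is only a sketch of what happens under R's default `na.action` (`na.omit`), which silently drops every row with a missing value in any model variable; it does not address how the thread went on to handle the problem.

```r
set.seed(1235)
x1 <- seq(30)
x2 <- c(rep(NA, 9), rnorm(19) + 9, c(NA, NA))   # observed in rows 10..28
x3 <- c(rnorm(17) - 2, rep(NA, 13))             # observed in rows 1..17
y  <- exp(seq(1, 5, length.out = 30))

mm <- lm(y ~ x1 + x2 + x3)

sum(complete.cases(x1, x2, x3))  # rows where all predictors are observed
nobs(mm)                         # rows actually used in the fit
```

Only rows 10 to 17 have both `x2` and `x3` observed, so the model is fitted on 8 of the 30 observations.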

[R] Modelling non-Negative Time Series

2016-01-31 Thread Lorenzo Isella
Dear All, I am struggling to develop a model to forecast the daily expenses from a bank account. The daily time series consists (obviously) of non-negative numbers which can be zero in the days when no money is taken from the bank account. To give you an idea of the kind of series I am dealing

Re: [R] Time Series and Auto.arima

2016-01-31 Thread Lorenzo Isella
Fri, Jan 29, 2016 at 02:16:27PM -0800, David Winsemius wrote: On Jan 29, 2016, at 12:59 PM, Lorenzo Isella <lorenzo.ise...@gmail.com> wrote: Dear All, I am puzzled and probably I am misunderstanding something. Please consider the snippet at the end of the email. We see a time series that

Re: [R] Time Series and Auto.arima

2016-01-29 Thread Lorenzo Isella
29, 2016 at 02:16:27PM -0800, David Winsemius wrote: On Jan 29, 2016, at 12:59 PM, Lorenzo Isella <lorenzo.ise...@gmail.com> wrote: Dear All, I am puzzled and probably I am misunderstanding something. Please consider the snippet at the end of the email. We see a time series that has c

[R] Time Series and Auto.arima

2016-01-29 Thread Lorenzo Isella
Dear All, I am puzzled and probably I am misunderstanding something. Please consider the snippet at the end of the email. We see a time series that clearly has some pattern (essentially, it is an account where a salary is regularly paid followed by some expenses). However the output of the

[R] Gsarima and Method

2016-01-20 Thread Lorenzo Isella
Dear All, While tuning some time series model with gsarima (which is primarily a wrapper for arima) from the astsa package, I encounter the following error message Error in stats::arima(xdata, order = c(p, d, q), seasonal = list(order = c(P, : non-stationary seasonal AR part from CSS

[R] Segmentation Fault on Debian

2016-01-15 Thread Lorenzo Isella
Dear All, I am running R on a debian testing machine and lately I have experienced several segmentation faults (often when running Amelia on some large data set). However, please have a look at the script pasted at the end of the email. If I uncomment the line about the RJSDMX library (which

[R] Updating a Time Series After Forecast()

2016-01-14 Thread Lorenzo Isella
Dear All, Perhaps I am drowning in a cup of water, since I am positive that the answer will be a one-liner. Consider the following short script library(forecast) ts2<-structure(c(339130, 356462, 363234, 378179, 367864, 378337, 392157,

[R] Forecast and xreg

2016-01-13 Thread Lorenzo Isella
Dear All, Please consider the small self-contained code at the end of the email. It is an artificial arimax model with a matrix of regressors xreg. In the script, xreg has as many rows as the number of data points in the time series "visits" I want to model. Now, my problem is the following: I am

[R] Forecasting with Timeseries with Different Frequency

2016-01-12 Thread Lorenzo Isella
Dear All, Suppose you have some time series, e.g. monthly data about profits in a company. I am trying to develop a model to predict what the profit will be the next month or two. Other useful data is for sure the number of employees in the company, the volume of product purchases etc...but they

[R] Amelia: Inputing Integers

2016-01-06 Thread Lorenzo Isella
Dear All, I can provide a numerical example if needed. My problem is the following: I am using amelia to impute some missing data in a dataset. The problem is that some columns consist of integer numbers only and amelia imputes some real numbers with decimals. Is there a way to tell amelia that

[R] Checkpoints in Caret

2015-12-21 Thread Lorenzo Isella
Dear All, I am training a model using caret. Everything is fine, but the training will take several days. I wonder if there is the possibility of creating some checkpoints in caret so that the training of a complex model can be split up into several jobs to be executed on different days. I

[R] Optim() and Instability

2015-11-14 Thread Lorenzo Isella
Dear All, I am using optim() for a relatively simple task: a linear model where instead of minimizing the sum of the squared errors, I minimize the sum of the squared relative errors. However, I notice that the default algorithm is very sensitive to the choice of the initial fit parameters,
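
For a model that is linear in its parameters, the sum of squared relative errors described above is a weighted least-squares objective with weights 1/y^2, so it has a closed-form minimiser that can serve as a check on (or a starting point for) `optim()`. The sketch below uses simulated data and a straight-line model, both of which are assumptions; the poster's actual model and data are not shown in the snippet.

```r
set.seed(1)
x <- 1:20
y <- 3 + 2 * x + rnorm(20, sd = 0.5)   # simulated, strictly positive response

# Sum of squared relative errors for the line par[1] + par[2] * x.
rel_sse <- function(par) sum(((y - par[1] - par[2] * x) / y)^2)

# Closed-form minimiser: weighted least squares with weights 1 / y^2.
wls <- lm(y ~ x, weights = 1 / y^2)

# Numerical minimiser, started from the ordinary least-squares fit.
num <- optim(coef(lm(y ~ x)), rel_sse, method = "BFGS")

rbind(wls = coef(wls), optim = num$par)
```

If the two rows disagree noticeably on real data, that points at the optimiser's starting values or method rather than at the objective itself.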

[R] Caret Internal Data Representation

2015-11-05 Thread Lorenzo Isella
Dear All, I have a data set which contains both categorical and numerical variables which I analyze using Cubist+the caret framework. Now, from the generated rules, it is clear that cubist does something to the categorical variables and probably uses some dummy coding for them. However, I cannot

[R] Caret and Summary

2015-10-29 Thread Lorenzo Isella
Dear All, I trained a model, let's call it mm, using caret+Cubist. When I type summary(mm), the output is rather long. This is because a Cubist model is a long set of rules, partially reminiscent of a classification tree. How can I save summary(mm) in a printer/article friendly way when this is

Re: [R] Parsing XML File

2015-10-13 Thread Lorenzo Isella
ytrue DU108063
3 AccountType CORPORATION      DU108063
4 AccruedCash 0           AUD  DU108063
5 AccruedCash 0           BASE DU108063
6 AccruedCash 0           CAD  DU108063
Jim Holtman Data Munger Guru What is the problem that you are try

[R] Parsing XML File

2015-10-11 Thread Lorenzo Isella
Dear All, I am struggling with the parsing of the xml file you can find at https://www.dropbox.com/s/i4ld5qa26hwrhj7/account.xml?dl=0 Essentially, I would like to be able to convert it to a data.frame to manipulate it in R and detect all the attributes of an account for which unrealizedPNL

[R] Creating World Map with Points

2015-09-28 Thread Lorenzo Isella
Dear All, Please have a look at the snippet at the end of the email. Essentially, I am trying to combine google maps with ggplot2. The idea is to simply plot some points, whose size depends on a scalar, on a google map. My question is how I can extend the map in the snippet below in order to plot

Re: [R] Sampling the Distance Matrix

2015-09-25 Thread Lorenzo Isella
<-(dist(mm)) - David L Carlson Department of Anthropology Texas A&M University College Station, TX 77840-4352 -Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Thursday, September 24, 2015 6:30 PM To: Lorenzo Isella Cc: David L Carlson; r-help@r-project.org Sub

Re: [R] Sampling the Distance Matrix

2015-09-25 Thread Lorenzo Isella
Apologies for not letting this thread rest in peace. The small script # set.seed(1234) x <- rnorm(20) y <- rnorm(20) goodcls <- apply(mtxcomb , 2, function(idx) all( dist( cbind( x[idx], y[idx]) ) > 0.9)) mycomb <- mtxcomb [ , goodcls]

Re: [R] Sampling the Distance Matrix

2015-09-24 Thread Lorenzo Isella
-boun...@r-project.org] On Behalf Of William Dunlap Sent: Wednesday, September 23, 2015 3:23 PM To: Lorenzo Isella Cc: r-help@r-project.org Subject: Re: [R] Sampling the Distance Matrix mm <- cbind(1/(1:5), sqrt(1:5)) d <- dist(mm) d 1 2 3 4 2 0.6492864 3 0.

Re: [R] Sampling the Distance Matrix

2015-09-24 Thread Lorenzo Isella
On Thu, Sep 24, 2015 at 01:30:02PM -0700, David Winsemius wrote: On Sep 24, 2015, at 12:36 PM, Lorenzo Isella wrote: Hi, And thanks for your reply. Essentially, your script gets the job done. For instance, if I run mm <- cbind(5/(1:5), -2*sqrt(1:5)) dst <- dist(mm) dst2 <- as.m

[R] Sampling the Distance Matrix

2015-09-23 Thread Lorenzo Isella
Dear All, Suppose you have a distance matrix stored like a dist object, for instance x<-rnorm(20) y<-rnorm(20) mm<-as.matrix(cbind(x,y)) dst<-(dist(mm)) Now, my problem is the following: I would like to get the rows of mm corresponding to points whose distance is always larger than, let's say,
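
One simple way to obtain a set of rows whose mutual distances all exceed a threshold is a greedy pass over the full distance matrix. This is a hedged sketch, not the solution reached in the thread: the threshold `thr = 0.9` is an assumption (the value is cut off in the snippet), and a greedy pass returns a valid subset but not necessarily the largest one.

```r
set.seed(1234)
mm <- cbind(x = rnorm(20), y = rnorm(20))
d  <- as.matrix(dist(mm))   # full symmetric matrix of pairwise distances

thr  <- 0.9                 # assumed minimum separation
keep <- integer(0)

# Keep a point only if it is farther than thr from every point kept so far.
for (i in seq_len(nrow(mm))) {
  if (all(d[i, keep] > thr)) keep <- c(keep, i)
}

sel <- mm[keep, , drop = FALSE]
sel
```

Because `all()` of an empty vector is TRUE, the first row is always kept, and the result depends on the order in which the rows are visited.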

[R] Trouble with Caret and C5.0

2015-08-31 Thread Lorenzo Isella
Dear All, I am trying to mine a small dataset. Admittedly, it is a bit odd since it is an example of a multi-class classification task where I have more than 300 different classes for about 600 observations. Having said that, the problem is not the output of my script, but the fact that it gets stuck,

[R] Problem with gridExtra

2015-08-27 Thread Lorenzo Isella
Dear All, Please consider the snippet at the end of the email, largely based on what you find here http://bit.ly/1ND6MGa When I run it, I get this error Error in arrangeGrob(p, sub = textGrob(Footnote, x = 0, hjust = -0.1, : could not find function textGrob However, the code runs on another

[R] Caret and custom summary function

2015-05-11 Thread Lorenzo Isella
Dear All, I am trying to implement my own metric (a log loss metric) for a binary classification problem in Caret. I must be making some mistake, because I cannot get anything sensible out of it. I paste below a numerical example which should run in more or less one minute on any laptop. When I

[R] Confusing 2 classes in a multiclass problem

2015-05-08 Thread Lorenzo Isella
Dear All, I hope this is not too off topic. Apologies for not sending any code now, but the point is really for me to understand how to proceed. Let's say that you have a multiclass classification problem and the outcome you want to predict is given by 9 different classes {A, B...}. By training

[R] Random Forest in Caret

2015-04-22 Thread Lorenzo Isella
Dear All, I am a bit concerned about the memory consumption of randomForest in caret. This seems to be due to the fact that the option keep.forest=FALSE does not work in caret. Does anybody know a workaround for that? Many thanks Lorenzo

[R] Caret and Model Prediction

2014-10-05 Thread Lorenzo Isella
Dear All, I am learning the ropes of CARET for automatic model training, more or less following the steps of the tutorial at http://bit.ly/ZJQINa However, there are a few things about which I would like a piece of advice. Consider for instance the following model

Re: [R] Caret and Model Prediction

2014-10-05 Thread Lorenzo Isella
the performance metric you want to find the optimal result. Please see the details of the caret tutorial to see how to. On Sun, Oct 5, 2014 at 8:54 AM, Lorenzo Isella lorenzo.ise...@gmail.com wrote: Dear All, I am learning the ropes of CARET for automatic model training, more or less following the steps

[R] GLM with Numeric and Factor as an Input

2014-02-25 Thread Lorenzo Isella
Dear All, Please consider the snippet at the end of the email. It is representative of the problems I am experiencing. I am trying to use glm (without using the formula interface because the original data is quite large) to model the response in a case where the predictors are a mix of

[R] Random Forest, Variable Mismatch

2014-02-15 Thread Lorenzo Isella
Dear All, I am a bit puzzled. I am developing a random forest model. The data is large and it involves hundreds of predictors, but the code I have written is relatively simple. After training my random forest model, I apply it on some new data set to carry out some prediction, as you can see

[R] Reading SDMX Files in R

2014-01-20 Thread Lorenzo Isella
Dear All, I often need to access some data files which are available (also) as files in the SDMX format. For instance, this is the case of the OECD (http://www.oecd.org/) data. In order to automate some processes, I need to retrieve the SDMX data from a url and read the content into an R

[R] 3D Strip Packing and R

2013-12-04 Thread Lorenzo Isella
Dear All, I am struggling with a 3D Strip Packing problem. Briefly: you have a set of boxes (cuboids of variable sizes) that you want to put inside a (large) container of finite width and length (which are larger than those of any of the boxes) and infinite depth. The goal is to minimize the

[R] GADM Data Download

2013-11-27 Thread Lorenzo Isella
Dear All, Please consider the snippet at the end of the email. I often download some maps (in the R format) from http://www.gadm.org/ However, when I run (typically more than once) a variation of the script below (based on http://bit.ly/1b3W0Aa ), I often get Error in

[R] R and Interactive Visualizations

2013-11-22 Thread Lorenzo Isella
Dear All, I use several R libraries (ggplot2, igraph etc...) for producing static visualizations. However, I'd like to be able to go beyond this. Things I may like to be able to achieve (relying on R as much as possible): 1) network visualizations such that when you click on a node, you see its

Re: [R] Download CSV Files from EUROSTAT Website

2013-11-04 Thread Lorenzo Isella
, Oct 31, 2013 at 10:38 AM, Lorenzo Isella lorenzo.ise...@gmail.com wrote: Dear All, I often need to do some work on some data which is publicly available on the EUROSTAT website. I saw several ways to download automatically mainly the bulk data from EUROSTAT to later on postprocess it with R

Re: [R] Download CSV Files from EUROSTAT Website

2013-11-04 Thread Lorenzo Isella
(mytable) mytable[] <- lapply(mytable, function(x) gsub("\\(.*\\)", "", x)) mytable[] <- lapply(mytable, function(x) gsub(",", "", x)) mytable[] <- lapply(mytable, as.numeric) colnames(mytable) <- 2000:2013 Hope this helps, Rui Barradas On 04-11-2013 at 09:53, Lorenzo Isella wrote: Hello, And thanks a lot

Re: [R] Download CSV Files from EUROSTAT Website

2013-11-04 Thread Lorenzo Isella
On Mon, 04 Nov 2013 20:26:46 +0100, David Winsemius dwinsem...@comcast.net wrote: On Nov 4, 2013, at 11:03 AM, Lorenzo Isella wrote: Thanks. I had already introduced these minor adjustments in the code, but the real problem (to me) is the information that gets lost: the informative name

[R] Download CSV Files from EUROSTAT Website

2013-10-31 Thread Lorenzo Isella
Dear All, I often need to do some work on some data which is publicly available on the EUROSTAT website. I saw several ways to download automatically mainly the bulk data from EUROSTAT to later on postprocess it with R, for instance http://bit.ly/HrDICj http://bit.ly/HrDL10

[R] Image Classification in R

2013-10-14 Thread Lorenzo Isella
Dear All, For a project I am given a set of images. They represent either healthy or tumoral tissue, but the specific nature of the images does not matter. I need to train a classifier which is expected to tell me in which category (let's call it 0 vs 1) each image falls. I am thinking about

[R] Generation of a Markov Chain

2013-09-17 Thread Lorenzo Isella
Dear All, While looking for a way to generate a Markov chain given a transition matrix, I found this http://bit.ly/1a1CFl8 but the example provided does not work on my machine y<-numeric(100) x=matrix(runif(16),4,4) for(i in 2:100) { + y[i]=which(rmultinom(1, size = 1, prob = x[y[i-1],

[R] Markov Decision Process

2013-09-14 Thread Lorenzo Isella
Dear All, I am struggling with the conceptual aspects of a problem. I am sure that someone on this list must be familiar with this. Let's say that you have some cancer data for your patients. In particular, every patient may undergo up to [i.e. the cycles may stop earlier for various reasons] 6

[R] Question About Markov Models

2013-09-01 Thread Lorenzo Isella
Dear All, I am struggling a bit with the many packages for Markov models available in R. Apologies for not posting a code snippet, but I am looking for some guidance here. Please consider a set like the one below (which you can get with

[R] Binomial Regression and nnet

2013-07-04 Thread Lorenzo Isella
Dear All, I am playing with different models/packages (random forest, logistic regression, gbm etc...) for a problem of binomial regression (i.e. the outcome is 0/1, dead or alive etc...). I have used in the past the multinom function from the nnet library which uses the neural networks for

[R] Sparse Matrices and glmnet

2013-06-22 Thread Lorenzo Isella
Dear All, I am not going to discuss in detail the implementation of a model, but I am rather puzzled because, even if I manage to train a model, then I do not succeed in applying it to some test data. I am sure I am making some trivial mistake, but so far I have been banging my head against

[R] Linear Model with Discrete Data

2013-06-13 Thread Lorenzo Isella
Dear All, I am struggling with a linear model and an allegedly trivial data set. The data set does not consist of categorical variables, but rather of numerical discrete variables (essentially, they count the number of times that something happened). Can I still use a standard linear

[R] Classification of Multivariate Time Series

2013-05-27 Thread Lorenzo Isella
Dear All, Apologies for not posting a code snippet, but I really need a pointer about a methodology to look at my data and possibly some R package which can ease my task. I am given a set consisting of several multivariate noisy time series, let's call it {A}. Each A_i in {A}, in turn, consists of

[R] R 3 and Debian Testing

2013-05-05 Thread Lorenzo Isella
Dear All, I am using Debian testing on multiple machines at home. This is my source list deb http://ftp.ch.debian.org/debian/ testing main contrib non-free deb-src http://ftp.ch.debian.org/debian/ testing main non-free contrib deb http://security.debian.org/ testing/updates main contrib

Re: [R] Factors and Multinomial Logistic Regression

2013-05-03 Thread Lorenzo Isella
On Thu, 02 May 2013 22:04:26 +0200, peter dalgaard pda...@gmail.com wrote: On May 2, 2013, at 20:33 , Lorenzo Isella wrote: On Wed, 01 May 2013 23:49:07 +0200, peter dalgaard pda...@gmail.com wrote: It still doesn't work! Apologies; since I had already imported nnet in my
