On Fri, 31 Jan 2020 at 20:31, Berry, Charles wrote:
Hi,
I'd like to use the Netflix challenge data and just can't figure out how to
efficiently "scan" the files.
https://www.kaggle.com/netflix-inc/netflix-prize-data
The files have two types of row: either an *ID*, e.g., "1:", "2:", etc., or
3 values associated with each ID:
The format is as
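A minimal sketch of one way to do it (my suggestion, not from the thread): read the raw lines, flag the ID rows by their trailing ":", and carry each ID down to its rating rows with cumsum(). The example rows and column names below are made up for illustration.

```r
## Toy stand-in for the Netflix files: ID rows end with ":",
## the other rows hold 3 comma-separated values per ID.
lines <- c("1:", "30878,4,2005-12-19", "124105,4,2004-08-05",
           "2:", "1952305,3,2004-07-05")
is.id <- grepl(":$", lines)
id    <- cumsum(is.id)                 # propagate each ID downwards
vals  <- read.csv(text = lines[!is.id], header = FALSE,
                  col.names = c("customer", "rating", "date"))
vals$movie <- id[!is.id]
```

For the real files one would readLines() each part and proceed the same way; the cumsum() trick avoids any per-line loop.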
> 'empty' data.frames, which could be either.)
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com

On Wed, Nov 2, 2016 at 6:48 AM, Emmanuel Levy <emmanuel.l...@gmail.com> wrote:
Dear All,
This sounds simple but can't figure out a good way to do it.
Let's say that I have an empty data frame "df":
## creates the df
df = data.frame( id=1, data=2)
## empties the df; perhaps there is a more elegant way to create an empty df?
df = df[-c(1),]
> df
[1] id   data
<0 rows> (or 0-length row.names)
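A slightly more direct idiom (a suggestion, not from the original thread): index with zero rows, or declare zero-length columns up front.

```r
df <- data.frame(id = 1, data = 2)
empty1 <- df[0, ]                                  # keep columns, drop all rows
empty2 <- data.frame(id = numeric(0), data = numeric(0))
```

Both keep the column names and types, so later rbind() calls behave as expected.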
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
2015-07-21 15:43 GMT+02:00 Emmanuel Levy emmanuel.l...@gmail.com:
Thanks! -- this is indeed much faster (plus I made a mistake, one has to
use paste with the option collapse
Hi,
The answer to this is probably straightforward, I have a dataframe and I'd
like to build an index of column combinations, e.g.
col1 col2 -- col3 (the index I need)
A 1 1
A 1 1
A 2 2
B 1 3
B 2 4
B 2 4
At
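One possible approach (mine, not from the thread): build a key per row and match() it against the unique keys, which numbers the combinations in order of first appearance.

```r
df <- data.frame(col1 = c("A", "A", "A", "B", "B", "B"),
                 col2 = c(1, 1, 2, 1, 2, 2))
## A separator unlikely to occur in the data avoids accidental collisions
## like ("A", "11") vs ("A1", "1").
key <- paste(df$col1, df$col2, sep = "\r")
df$col3 <- match(key, unique(key))
```

This reproduces the desired index column 1, 1, 2, 3, 4, 4.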
Hi,
That sounds simple but I cannot think of a really fast way of getting
the following:
c(1,1,2,2,3,3,4,4) would give c(1,3,5,7)
i.e., a function that returns the indexes of the first occurrences of numbers.
Note that numbers may have any order e.g., c(3,4,1,2,1,1,2,3,5), can
be very large,
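One fast vectorized answer (a suggestion, not from the thread): duplicated() marks repeats, so its negation flags first occurrences, and which() turns the flags into indexes. It works whatever the order of the values.

```r
## Index of the first occurrence of each distinct value,
## in order of appearance.
first.occ <- function(x) which(!duplicated(x))
```

For example, first.occ(c(1,1,2,2,3,3,4,4)) gives 1, 3, 5, 7.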
Hi,
I've had this problem for a while and tackled it in a quite dirty way,
so I'm wondering if a better solution exists:
If we have two vectors:
v1 = c(0,1,2,3,4)
v2 = c(5,3,2,1,0)
How to remove one instance of the 3,1 / 1,3 double?
At the moment I'm using the following solution, which is
I did not know that unique worked on entire rows!
That is great, thank you very much!
Emmanuel
On 27 December 2012 22:39, Marc Schwartz marc_schwa...@me.com wrote:
unique(t(apply(cbind(v1, v2), 1, sort)))
__
R-help@r-project.org mailing list
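Marc Schwartz's one-liner, expanded into a runnable example: sorting within each row maps (3,1) and (1,3) to the same pair, so unique() can drop the duplicate row.

```r
v1 <- c(0, 1, 2, 3, 4)
v2 <- c(5, 3, 2, 1, 0)
## Sort each row, transpose back to rows-as-pairs, drop duplicate rows.
pairs <- unique(t(apply(cbind(v1, v2), 1, sort)))
```

Here the (3,1)/(1,3) pair collapses, leaving 4 of the original 5 rows.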
Hello,
The heatmap function conveniently has a reorder.dendrogram function
so that clusters follow a certain logic.
It seems that the hclust function doesn't have such feature. I can use
the reorder function on the dendrogram obtained from hclust, but
this does not modify the hclust object
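A possible workaround (my sketch, not confirmed in the thread): reorder the dendrogram, then convert it back with as.hclust(), which does return an hclust object with the new leaf order.

```r
hc  <- hclust(dist(USArrests[1:10, ]))
## reorder() rearranges leaves subject to the tree topology,
## guided by the weights; as.hclust() converts the result back.
dnd <- reorder(as.dendrogram(hc), wts = 10:1)
hc2 <- as.hclust(dnd)
```

hc2 can then be passed to anything that expects an hclust object.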
wrote:
I don't have a general answer to your question, but 1L and 2L are just the
integers 1 and 2 (the L makes them integers instead of doubles, which is
useful for some things).
Michael
On May 11, 2012, at 2:15 PM, Emmanuel Levy emmanuel.l...@gmail.com wrote:
Hi,
I have a three dimensional array, e.g.,
my.array = array(0, dim=c(2,3,4), dimnames=list( d1=c("A1","A2"),
d2=c("B1","B2","B3"), d3=c("C1","C2","C3","C4")) )
what I would like to get is then a dataframe:
d1 d2 d3 value
A1 B1 C1 0
A2 B1 C1 0
.
.
.
A2 B3 C4 0
I'm sure there is one function to do this
OK, it seems that the array2df function from arrayhelpers package does
the job :)
On 19 April 2012 16:46, Emmanuel Levy emmanuel.l...@gmail.com wrote:
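Besides arrayhelpers::array2df, base R can do the same with as.data.frame.table():

```r
my.array <- array(0, dim = c(2, 3, 4),
                  dimnames = list(d1 = c("A1", "A2"),
                                  d2 = c("B1", "B2", "B3"),
                                  d3 = c("C1", "C2", "C3", "C4")))
## One row per cell, with the dimnames as columns.
df <- as.data.frame.table(my.array, responseName = "value")
```

The result has 2 * 3 * 4 = 24 rows with columns d1, d2, d3, value.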
return(c(Xtrans,Ytrans))
}
On 12 March 2012 20:58, David Winsemius dwinsem...@comcast.net wrote:
On Mar 12, 2012, at 3:07 PM, Emmanuel Levy wrote:
Hi Jeff,
Thanks for your reply and the example.
I'm not sure if it could be applied to the problem I'm facing though,
for two reasons:
(i
Emmanuel Levy emmanuel.l...@gmail.com wrote:
Dear Jeff,
I'm sorry but I
. That is not what you say you want,
so these approaches are unlikely to work.
-- Bert
On Sat, Mar 10, 2012 at 6:20 PM, Emmanuel Levy emmanuel.l...@gmail.com
wrote:
Hi,
I am trying to normalize some data. First I fitted a principal curve
(using the LCPM package), but now I would like to apply a
transformation so that the curve becomes a straight diagonal line on
the plot. The data used to fit the curve would then be normalized by
applying the same
Hi,
I'm trying to normalize data by fitting a line through the highest density
of points (in a 2D plot).
In other words, if you visualize the data as a density plot, the fit I'm
trying to achieve is the line that goes through the crest of the mountain.
This is similar yet different to what LOESS
Hi,
I posted a message earlier entitled "How to fit a line through the
Mountain crest" ...
I figured loess is probably the best way, but it seems that the
problem is the robustness of the fit. Below I paste an example to
illustrate the problem:
tmp=rnorm(2000)
X.background = 5+tmp;
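One knob worth trying here (a suggestion, not a fix confirmed in the thread): family = "symmetric", which makes loess iterate Tukey's biweight instead of plain least squares, so off-crest points pull the fit around much less.

```r
set.seed(1)
tmp <- rnorm(2000)
X.background <- 5 + tmp
Y.background <- 5 + (5 * tmp + rnorm(2000))
## family = "symmetric" uses a redescending M-estimator rather than
## the default least-squares ("gaussian") criterion.
fit <- loess(Y.background ~ X.background, family = "symmetric")
```

Raising the number of robustness iterations via control = loess.control(iterations = ...) can tighten this further.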
as a reply to the second post.
All the best,
Emmanuel
On 10 March 2012 19:46, David Winsemius dwinsem...@comcast.net wrote:
On Mar 10, 2012, at 3:55 PM, Emmanuel Levy wrote:
On 10 March 2012 18:30, Emmanuel Levy emmanuel.l...@gmail.com wrote:
Hi,
I'm wondering which function would allow fitting this type of data:
tmp=rnorm(2000)
X.1 = 5+tmp
Y.1 = 5+ (5*tmp+rnorm(2000))
tmp=rnorm(100)
X.2 = 9+tmp
Y.2 = 40+ (1.5*tmp+rnorm(100))
X.3 = 7+ 0.5*runif(500)
Y.3 = 15+20*runif(500)
X = c(X.1,X.2,X.3)
Y = c(Y.1,Y.2,Y.3)
Dear All,
I would like to generate random protein sequences using an HMM.
Has anybody done that before, or would you have any idea which package
is likely to be best for that?
The important facts are that the HMM will be fitted on ~3 million
sequential observations, with 20 different states
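For illustration only (the state names, toy alphabet, and matrices below are mine; a fitted model would supply the real ones), sampling from an HMM needs nothing beyond base R:

```r
set.seed(1)
states <- c("H", "L")                  # hypothetical hidden states
alph   <- c("A", "G", "V")             # toy alphabet; the real one has 20 aa
trans  <- matrix(c(0.9, 0.1,
                   0.2, 0.8), 2, byrow = TRUE,
                 dimnames = list(states, states))
emit   <- matrix(c(0.6, 0.3, 0.1,
                   0.1, 0.3, 0.6), 2, byrow = TRUE,
                 dimnames = list(states, alph))
n <- 50
s <- character(n); o <- character(n)
s[1] <- sample(states, 1)
for (i in seq_len(n)) {
  if (i > 1) s[i] <- sample(states, 1, prob = trans[s[i - 1], ])  # transition
  o[i] <- sample(alph, 1, prob = emit[s[i], ])                    # emission
}
seq.out <- paste(o, collapse = "")
```

For fitting the model itself, CRAN packages such as HMM or depmixS4 may be worth a look.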
Hello,
This sounds like a problem to which many solutions should exist, but I
did not manage to find one.
Basically, given a list of datapoints, I'd like to keep those within
the X% percentile highest density.
That would be equivalent to retain only points within a given line of
a contour plot.
help,
Emmanuel
On 19 November 2010 21:25, David Winsemius dwinsem...@comcast.net wrote:
On Nov 19, 2010, at 8:44 PM, Emmanuel Levy wrote:
Hello Roger,
Thanks for the suggestions.
I finally managed to do it using the output of kde2d - The code is
pasted below. Actually this made me realize that the outcome of kde2d
can be quite influenced by outliers if a boundary box is not given
(try running the code without the boundary box,
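A compact version of that idea (my reconstruction, not Emmanuel's pasted code), using MASS::kde2d with an explicit bounding box as the post recommends:

```r
library(MASS)   # for kde2d
set.seed(1)
x <- rnorm(1000); y <- rnorm(1000)
## Fixed bounding box, since outliers can distort the default one.
dens <- kde2d(x, y, n = 100, lims = c(-5, 5, -5, 5))
## Density at each point via its grid cell, then keep the top 50%.
ix <- findInterval(x, dens$x)
iy <- findInterval(y, dens$y)
d  <- dens$z[cbind(ix, iy)]
keep <- d >= quantile(d, 0.5)
```

Changing the quantile cutoff selects the points inside other density contours.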
Hi,
The pdf function would not let me change the paper size and gives me
the following warning:
pdf("figure.pdf", width=6, height=10)
Warning message:
‘mode(width)’ and ‘mode(height)’ differ between new and previous
==> NOT changing ‘width’ ‘height’
If I use the option paper = "a4r",
Update - sorry for the stupid question, let's say it's pretty late.
For those who may be as tired as I am and get the same warning, the
paper size should be given as an integer!
On 16 November 2010 04:17, Emmanuel Levy emmanuel.l...@gmail.com wrote:
Dear All,
I cannot find a solution to the following problem although I imagine
that it is a classic, hence my email.
I have a vector V of X values comprised between 1 and N.
I would like to get random samples of X values also comprised between
1 and N, but the important point is:
* I would like
Dear All,(my apologies if it got posted twice, it seems it didn't
get through)
I cannot find a solution to the following problem although I suppose
this is a classic.
I have a vector V of X=length(V) values comprised between 1 and N.
I would like to get random samples of X values also
solve it.
Many thanks!
Emmanuel
PS: I apologize that I sent a second post. This one did not appear in
my R-help label so I assumed it wasn't sent for some reason.
2009/8/12 Ted Harding ted.hard...@manchester.ac.uk:
On 12-Aug-09 22:05:24, Emmanuel Levy wrote:
with this problem? Or even better of a package?
Thanks for your help,
Emmanuel
2009/8/12 Nordlund, Dan (DSHS/RDA) nord...@dshs.wa.gov:
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Emmanuel Levy
Sent: Wednesday, August 12, 2009
But if the 1st order differences are the same, then doesn't it follow that
the 2nd, 3rd, ... order differences must be the same between the original and
the new random vector? What am I missing?
You are missing nothing sorry, I wrote something wrong. What I would
like to be preserved is the
Dear all,
I have been using normalize.loess and I get the following error
message when my matrix contains NA values:
my.mat = matrix(nrow=100, ncol=4, runif(400) )
my.mat[1,1]=NA
my.mat.n = normalize.loess(my.mat, verbose=TRUE)
Done with 1 vs 2 in iteration 1
Done with 1 vs 3 in iteration 1
Dear Brian, Mose, Peter and Stefan,
Thanks a lot for your replies - the issues are now clearer to me. (and
I apologize for not using the appropriate list).
Best wishes,
Emmanuel
2008/11/19 Peter Dalgaard [EMAIL PROTECTED]:
Stefan Evert wrote:
On 19 Nov 2008, at 07:56, Prof Brian Ripley
Dear All,
I just read an announcement saying that Mathematica is launching a
version working with Nvidia GPUs. It is claimed that it'd make it
~10-100x faster!
http://www.physorg.com/news146247669.html
I was wondering if you are aware of any development going in this
direction with R?
Thanks
Dear All,
I have a long string and need to search for regular expressions in
there. However it becomes horribly slow as the string length
increases.
Below is an example: when i increases by 5, the time spent increases
by more! (my string is 11,000,000 letters long!)
I also noticed that
- the
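Two switches that often help with this (suggestions, not from the thread): perl = TRUE selects the PCRE engine, and useBytes = TRUE skips per-character encoding work, which is safe for pure-ASCII strings; both can matter a lot on multi-megabyte subjects.

```r
set.seed(1)
s <- paste(sample(c("A", "C", "G", "T"), 1e5, replace = TRUE),
           collapse = "")
## PCRE plus byte-wise matching: same matches on ASCII input,
## usually much faster on very long strings.
hits <- gregexpr("ACGT", s, perl = TRUE, useBytes = TRUE)[[1]]
```

Whether these options win depends on the pattern; timing both variants with system.time() on a prefix of the real string is the quickest check.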
Hi Chuck,
Thanks a lot for your suggestion.
You can find all such matches (not just the disjoint ones that gregexpr
finds) using something like this:
twomatch <- function(x,y) intersect(x+1,y)
match.list <-
list(
which( vec %in% c(3,6,7) ),
Dear All,
I hope the title speaks by itself.
I believe that there should be a solution when I see what Mclust is
able to do. However, this problem is quite
particular in that d3 is not known and does not necessarily correspond
to a common distribution (e.g. normal, exponential ...).
However it
this would be great; is it possible to
somehow force the parameters (e.g variance) to be
greater than a particular threshold?
Thanks,
Emmanuel
2008/10/20 Emmanuel Levy [EMAIL PROTECTED]:
Dear list members,
I am using Mclust in order to deconvolute a distribution that I
believe is a sum of two
=c(0,0.4) )
axis(side=1)
for (i in 1:2) {
ni <- v$parameters$pro[i]*dnorm(x0,
mean=as.numeric(v$parameters$mean[i]),sd=1)
lines(x0,ni,col=1)
nt <- nt+ni
}
lines(x0,nt,lwd=3)
segments(my.data,0,my.data,0.02)
Best,
Emmanuel
2008/10/21 Emmanuel Levy [EMAIL PROTECTED]:
After playing
Dear All,
I have a distribution of values and I would like to assess the
uni/bimodality of the distribution.
I managed to decompose it into two normal distributions using Mclust, and
the BIC criterion is best for two parameters.
However, the problem is that the BIC criterion is not a P-value, which
I
Hi Duncan,
I'm really stupid --- yes of course!!
Thanks for pointing me out the (now) obvious.
All the best,
E
2008/10/21 Duncan Murdoch [EMAIL PROTECTED]:
On 10/21/2008 2:56 PM, Emmanuel Levy wrote:
Dear list members,
I am using Mclust in order to deconvolute a distribution that I
believe is a sum of two gaussians.
First I can make a model:
my.data.model = Mclust(my.data, modelNames=c("E"), warn=T, G=1:3)
But then, when I try to plot the result, I get the following error:
to encoding of strings.
D.
Emmanuel Levy wrote:
Dear list members,
I encountered this problem and the solution pointed out in a previous
thread did not work for me.
(e.g. install.packages("RCurl", repos = "http://www.omegahat.org/R"))
I work with Ubuntu Hardy, and installed R 2.6.2 via apt-get.
I really need RCurl in order to use biomaRt ...
PM, Peter Cowan [EMAIL PROTECTED] wrote:
Emmanuel,
On Tue, Aug 12, 2008 at 4:35 PM, Emmanuel Levy [EMAIL PROTECTED] wrote:
that the split and hash.mat functions.
Thanks for your help,
Emmanuel
2008/8/13 Erik Iverson [EMAIL PROTECTED]:
I still don't understand what you are doing. Can you make a small example
that shows what you have and what you want?
Is ?split what you are after?
Emmanuel Levy wrote:
Dear Peter and Henrik,
Thanks for your replies - this helps speed up a bit, but I thought
there would be something much faster.
What I mean is that I thought that a particular
Dear All,
I have a large data frame ( 270 lines and 14 columns), and I would like to
extract the information in a particular way illustrated below:
Given a data frame df:
col1=sample(c(0,1),10, rep=T)
names = factor(c(rep("A",5),rep("B",5)))
df = data.frame(names,col1)
df
names col1
1
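Expanding Erik's ?split pointer into a runnable sketch (the grouping shown is my guess at the intended use):

```r
set.seed(1)
col1  <- sample(c(0, 1), 10, replace = TRUE)
names <- factor(c(rep("A", 5), rep("B", 5)))
df    <- data.frame(names, col1)
## One list component per factor level, each holding that
## group's col1 values.
by.name <- split(df$col1, df$names)
```

From there, sapply(by.name, sum) or similar gives per-group summaries without any explicit loop.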
Dear All,
I'm sure this is not the first time this question comes up but I
couldn't find the keywords that would point me out to it - so
apologies if this is a re-post.
Basically I've got thousands of points, each depending on three variables:
x, y, and z.
if I do a plot(x,y, col=z), I get
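Before reaching for a package, a crude grid-binning pass (my sketch, not from the thread) already smooths z by x/y position:

```r
set.seed(1)
x <- runif(5000); y <- runif(5000)
z <- sin(4 * x) + cos(4 * y) + rnorm(5000, sd = 0.2)
## Average z within each cell of a 20 x 20 grid; plotting the cell
## means (e.g. with image()) gives a denoised view of z over x/y.
gx <- cut(x, 20); gy <- cut(y, 20)
z.smooth <- tapply(z, list(gx, gy), mean)
```

Cells with no points come back as NA, so a coarser grid (or more data) may be needed for sparse regions.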
PROTECTED] On
Behalf Of Emmanuel Levy
Sent: Wednesday, March 19, 2008 12:42 PM
To: r-help@r-project.org
Subject: [R] Smoothing z-values according to their x, y positions
looked yet at the locfit package as it is not installed, but
I will check it out!
Thanks for helping!
Emmanuel
On 20/03/2008, David Winsemius [EMAIL PROTECTED] wrote:
Emmanuel Levy [EMAIL PROTECTED] wrote in
news:[EMAIL PROTECTED]:
Dear Bert,
Thanks for your reply - I indeed saw