I have many pairs of data frames, each with about 15 million records
and about 10 million records in common. They are sorted by two of their
fields and will be merged by those same fields.
The fact that the data are sorted could be used to greatly speed up a
merge, but I have the
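
To illustrate how sortedness can help, here is a minimal sketch of a merge join in R. It assumes a single sorted key with unique values (the post mentions two key fields), and the function name sorted_match is hypothetical. An interpreted loop like this only shows the idea; real data of this size would likely need compiled code.

```r
# Merge join sketch: walk two sorted key vectors in one pass.
# Assumptions (not from the original post): one key, no duplicate keys.
sorted_match <- function(a, b) {
  n <- min(length(a), length(b))
  ia <- integer(n)  # preallocated row indices into a
  ib <- integer(n)  # preallocated row indices into b
  i <- 1L; j <- 1L; k <- 0L
  while (i <= length(a) && j <= length(b)) {
    if (a[i] < b[j]) {
      i <- i + 1L            # key only in a so far: advance a
    } else if (a[i] > b[j]) {
      j <- j + 1L            # key only in b so far: advance b
    } else {
      k <- k + 1L            # keys match: record both positions
      ia[k] <- i; ib[k] <- j
      i <- i + 1L; j <- j + 1L
    }
  }
  list(a = ia[seq_len(k)], b = ib[seq_len(k)])
}

idx <- sorted_match(c(1, 3, 5, 7), c(3, 4, 5, 8))
```

The returned indices can then be used to bind the matching rows of the two data frames column-wise.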
On Tue, 13 Jan 2015, Mike Miller wrote:
I have many pairs of data frames each with about 15 million records each and
about 10 million records in common. They are sorted by two of their fields
and will be merged by those same fields.
The fact that the data are sorted could be used to greatly
Hi, the error message indicates that you have a non-numeric value in your
table/matrix. Replace missing values with NA and add na.rm = TRUE to your
prcomp command.
Karim
On 14 Jan 2015 00:27, R Help! emanek...@gmail.com wrote:
Hello!
I am a beginner to R. I have read several guides, but still am
Thanks for your reply. But I cannot control the data.
I am dealing with real-world stream data. It is very normal for the test
data (used when you apply the model to make predictions) to have new values
that are not seen in the training data.
If I code myself, I would give a random guess or just an intercept for
Sorry, I notice the email subject is not accurate.
To be specific, when I call predict, there are error messages like
factor x has new levels 1, 2
Here x is an attribute (independent variable), not the outcome.
I wonder if the incremental packages (if any) solve this problem? Maybe it
is time to write my
I think it would be nice if predict methods returned NA in appropriate
spots instead of aborting when a categorical predictor contains levels not
found in the training set. It should not be that hard to implement, as the
'xlevels' component of the model is already being used to put factor levels
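
As a rough sketch of that idea, the 'xlevels' component of a fitted lm can be used to recode unseen levels to NA before calling predict. The data and variable names below are made up for illustration, and this is a workaround under those assumptions, not how predict itself is implemented.

```r
# Toy training data (hypothetical): factor x has levels "a" and "b".
train <- data.frame(y = c(1, 2, 3, 4, 5, 6),
                    x = factor(c("a", "b", "a", "b", "a", "b")))
fit <- lm(y ~ x, data = train)

# New data contains the unseen level "c".
new <- data.frame(x = factor(c("a", "c")))

# Recode any level not seen in training to NA, using fit$xlevels.
safe <- new
for (v in names(fit$xlevels)) {
  vals <- as.character(safe[[v]])
  vals[!vals %in% fit$xlevels[[v]]] <- NA
  safe[[v]] <- factor(vals, levels = fit$xlevels[[v]])
}

# predict.lm uses na.action = na.pass by default, so rows with NA yield NA.
pred <- predict(fit, newdata = safe)
```

Rows with a known level get an ordinary prediction, and rows with an unseen level come back as NA instead of stopping with an error.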
Folks:
I believe this discussion would be better moved to a statistical
discussion forum, like stats.stackexchange.com, as it appears to be
all about statistical issues, not R. I do not understand how you can
possibly expect to predict behavior in new categories for which you
have no prior
Although it doesn’t prevent me from using RStudio, there is a bug in the pdf
rendering from the graphics window that mangles the display of graph legends
using base graphics (i.e., when save as pdf is chosen). The same does not
happen when the same graphing code is used with the standard R
I know this must be the wrong method, but I cannot help asking: can I use
only the p-value from the KS test, saying that if the p-value is greater than
\beta, then the two samples are from the same distribution? If the definition
of the p-value is the probability that the null hypothesis is true, then why there's little
Does anyone know of a CRAN package to access Yodlee.com's Aggregation
API[1]? Many thanks -- H
--
OpenPGP: https://hasan.d8u.us/gpg.key
Sent from my mobile device
1. http://developer.yodlee.com/Aggregation_API
I am having issues with performance in the R.app console on Mac Yosemite / OS X
10.10.1.
1) While editing commands on the command line, left or right arrow gradually
slows within a session from usable rates to rates like 2/second or slower.
2) Autorepeat works differently for different
I'm having some trouble with Anova() in package car. When the model
formula is explicitly expressed:
library('nlme')
library('car')
fm <- lme(distance ~ age + Sex, data = Orthodont, random = ~ 1)
Anova() works fine:
Anova(fm)
However, if the model formula is scanned from an external source:
On Fri, 9 Jan 2015, Duncan Murdoch wrote:
On 09/01/2015 5:32 PM, Erin Hodgess wrote:
Hello again.
Here is another question that I am puzzled about: I had the
(incorrect) impression that if I had Rtools on a Windows machine that I
could use any tar.gz package. However, that is not true.
This sounds more like quality control than hypothesis testing. Rather than
statistical significance, you want to determine what is an acceptable
difference (an 'equivalence margin', if you will). And that is a question
about the application, not a statistical one.
This clearly should go to the r-sig-mac list, not r-help.
Cheers,
Bert
Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374
Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
Clifford Stoll
On Tue, Jan 13, 2015 at 10:44 AM, David R
Hello R Psych package users,
Why am I receiving NA for many of the factor scores for individual
observations? I'm assuming it is because there is quite a bit of missing
data (denoted by NA). Are there any tricks in the psych package for getting
a complete set of factor scores?
My input is:
On 12.01.2015 09:01, peter dalgaard wrote:
On 11 Jan 2015, at 11:30 , Duncan Murdoch murdoch.dun...@gmail.com wrote:
- I don't like the tiled display. I find it doesn't give me enough space.
This is a mixed blessing. For teaching purposes, it helps avoid shuffling
windows to uncover
Sorry it was my mistake.
I tried to do it like this:
rm.outliers <- function(model, xsys)
{
  rst <- rstudent(model)                # studentized residuals
  outliers <- vector("numeric", 731)
  xsys <- xsys
  for (i in 1:length(rst))
  {
    if (rst[i] <= 3 && rst[i] >= -3)    # within +/-3: not an outlier
    {
      print("this is not outlier")
Hi
I do not understand what you want to achieve with this.
df2$v3 <- ifelse(df2$v1 %in% df1$v1 & df2$v2 == df2$v1, 1, 0)
You compare v1 and v2 from data frame df2 to column v1 in data frame df1?
It is true only in case where df2$v1 equals df2$v2.
In case you mean that you want check equality of
Hi Mark,
Mark Leeds marklee...@gmail.com writes:
Hi All: I have a regular expression problem. If a character string ends
with rhofixed or norhofixed, I want that part of the string to be
removed. If it doesn't end with either of those two endings, then the
result should be the same as the
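
One possible way to do this is a single sub() call with an optional "no" prefix anchored at the end of the string. The sample strings are invented for illustration.

```r
# Sample inputs (hypothetical): two with the endings, one without.
x <- c("model1rhofixed", "model2norhofixed", "model3")

# "(no)?" makes the "no" prefix optional; "$" anchors at the end of the string.
cleaned <- sub("(no)?rhofixed$", "", x)
```

Strings that end in neither suffix are returned unchanged, because sub() leaves non-matching input as it is.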
Apologies for cross-posting
There are 5 remaining seats available on each of the following two courses:
Data exploration, regression, GLM & GAM with introduction to R.
2 - 6 February 2015. Coimbra, Portugal
Introduction to Linear mixed effects models, GLMM and MCMC with R
9-13 February 2015.
Thanks for your reply Sarah.
I am using R on Windows 7 professional, 64-bit-OS (on my local machine).
setOutputColors doesn’t work in Windows. However, I came across this post
mentioning the package colorout, which does not seem to be available from
CRAN:
On 13/01/2015 2:52 AM, Ingrid Charvet wrote:
Thanks for your reply Sarah.
I am using R on Windows 7 professional, 64-bit-OS (on my local machine).
setOutputColors doesn’t work in Windows however I came across this post
mentioning the package colorout, which seems however not to be
Greetings! I am analysing my data and checking for gaps over 1 hour and
come across the problem that R tells me that there are many 2 hour gaps.
When I go into details, I don't see any of these lapses. Is it something
with the diff() function?
You can see what I am saying, below:
On 13/01/2015 3:37 AM, Jue Lin-Ye wrote:
Greetings! I am analysing my data and checking for gaps over 1 hour and
come across the problem that R tells me that there are many 2 hour gaps.
When I go into details, I don't see any of these lapses. Is it something
with the diff() function?
You
Thank you very much it works - I just needed to save my preferences to the R
console when using the GUI and it automatically loads the right colors each
time I open R!
-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
Sent: 13 January 2015 11:23
To: Ingrid
Hi
You are bitten by DST - daylight saving time. You need to either use time zones
that do not have DST or adapt your code to the fact that twice a year the clock
shifts: one hour lasts two hours and one hour is missing.
See the difference between CEST and CET:
time
[1] 1999-10-31 01:00:00 CEST 1999-10-31 02:00:00 CEST
[3]
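
A small sketch of the effect, using made-up time stamps around the 1999 autumn clock change. The zone name "Europe/Berlin" is an assumption; the original data's time zone is not stated beyond CEST/CET.

```r
# The same clock readings, interpreted in two time zones.
stamps <- c("1999-10-31 01:00:00", "1999-10-31 03:00:00")

local <- as.POSIXct(stamps, tz = "Europe/Berlin")  # zone with DST (assumed)
utc   <- as.POSIXct(stamps, tz = "UTC")            # zone without DST

# Elapsed time between the two readings, in hours.
gap_local <- as.numeric(difftime(local[2], local[1], units = "hours"))
gap_utc   <- as.numeric(difftime(utc[2], utc[1], units = "hours"))
```

In the DST zone the gap is one hour longer than the clock readings suggest, because 02:00 occurs twice that night. Parsing the data with a fixed-offset zone such as UTC avoids the extra hour.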
Hi,
intersection could also be a good measure.
My sets, when plotted, look like the attached picture:
https://www.dropbox.com/sh/j68oa80ihn95s04/AADTOt_GYF1JHrmZU2129y__a?dl=0
thanks a lot
max
On 12/01/15 19:44, Bert Gunter wrote:
?intersect
or, more generally,
?match
Cheers,
Bert
Bert
I think the OP does not want to list duplicate records. Perhaps
merge(unique(df1), df2, all.y=TRUE)
v1 v2 ind
1 1 83 1
2 1 84 1
3 2 83 NA
4 2 84 NA
5 3 83 NA
6 3 84 NA
7 4 83 NA
8 4 84 NA
-
David L Carlson
Department of Anthropology
Texas
Dear R-List,
I receive a strange error message when starting R. It says:
Warning message:
package 'methods' in options("defaultPackages") was not found
Error in file(filename, "r") : cannot open the connection
In addition: Warning message:
In file(filename, "r") :
cannot open file
Dear John,
Thanks a lot for the quick response and fix! I'm looking forward to
trying out the development version. I assume that the fix will be
released in the official version at some point.
Thanks again,
Gang
On Tue, Jan 13, 2015 at 5:06 PM, John Fox j...@mcmaster.ca wrote:
Dear Gang,
The
I don't know why the R developers made that comment, and R-devel is probably a
better place to follow up, but the usual problem is that Windows treats text
files differently than binary files, so seeking in text files is a headache.
Binary files ought to be okay, but that is a theoretical
I/we've been utilizing both read and write seek():s on *binary*
connections across platforms and file systems, including Windows (at
least NTFS, but probably also FAT/FAT32 back in the days) in the Aroma
Framework (e.g. affxparser, R.huge) for ~8 years and counting. There
should be thousands and
Thanks, everyone. This is very good news from Henrik because I am
interested only in binary connections. It sounds like a function that
uses seek() is very likely to work well in Windows, so I won't bother to
warn people. I should do a little testing just to see that it's working,
though.
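
A quick test of that kind might look like the sketch below. It writes a small binary file to a temporary path, seeks to a known byte offset, and reads one value back. The file contents are invented for illustration.

```r
# Write ten 4-byte integers to a temporary binary file.
path <- tempfile(fileext = ".bin")
writeBin(1:10, path)

# Open in binary read mode and jump past the first five integers.
con <- file(path, open = "rb")
invisible(seek(con, where = 4 * 5, origin = "start"))

# The next integer read should be the sixth value written.
val <- readBin(con, what = integer(), n = 1)
close(con)
```

If val is not the sixth value on a given platform, that would be a sign that seek() is not behaving as expected there.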
Dear Gang,
The problem was in the model.matrix.lme() method provided by the car
package, and is now fixed in the development version of the car package on
R-Forge. You should be able to install it from there via
install.packages("car", repos="http://R-Forge.R-project.org") after the
package is next
On Tue, Jan 13, 2015 at 2:05 PM, Mike Miller mbmille...@gmail.com wrote:
Thanks, everyone. This is very good news from Henrik because I am
interested only in binary connections. It sounds like a function that uses
seek() is very likely to work well in Windows, so I won't bother to warn
Hello!
I am a beginner to R. I have read several guides, but still am stuck on
this:
I have data in an excel csv file, on which I want to run PCA.
I'm not sure how the prcomp formula works. The help page states:
prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE,
tol = NULL, ...)
what is x
If you keep reading the help file, the answer to your question is right there.
As for step by step... only you know what your data looks like. There are
various pitfalls one can encounter in getting data from a file into an object
in memory, but the basic idea is to use the read.csv function,
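
As a minimal sketch of that workflow, the example below reads a CSV file with read.csv and passes the numeric columns to prcomp as x. The built-in iris data, written to a temporary file, stands in for the poster's spreadsheet, which is not shown.

```r
# Stand-in for the poster's file: write a built-in data set to a temporary CSV.
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, csv_path, row.names = FALSE)

# Read the file into a data frame.
dat <- read.csv(csv_path)

# prcomp needs numeric input without missing values:
# keep numeric columns only, then drop incomplete rows.
num <- dat[sapply(dat, is.numeric)]
num <- na.omit(num)

# x is the numeric data frame (or matrix); scaling is optional.
pca <- prcomp(num, center = TRUE, scale. = TRUE)
summary(pca)
```

Here x is simply the table of numeric variables, with one row per observation and one column per variable.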