From the documentation I have found, it seems that one of the functions from
package plyr, or a combination of functions like split and lapply would
allow me to have a really short R script to analyze all my data (I have
reduced it to a couple hundred thousand records with about half a dozen
Erik Iverson <er...@ccbr.umn.edu> wrote:
Your code is not reproducible. Can you come up with a small example
showing the crux of your data structures/problem, that we can all run in our
R sessions? You're likely to get much higher quality responses this way.
Ted Byers wrote:
From the documentation I
On Mon, Jul 12, 2010 at 4:02 PM, jim holtman jholt...@gmail.com wrote:
try 'drop=TRUE' on the split function call. This will prevent the
NULL set from being sent to the function.
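A minimal sketch of what that suggestion does, using a made-up stand-in for the poster's sales data:

```r
# Hypothetical stand-in for the poster's data frame
df <- data.frame(year = c(2009, 2009, 2010),
                 week = c(1, 1, 2),
                 elapsed = c(3.2, 5.1, 4.7))
# Without drop = TRUE, split() emits an empty data frame for every
# unused year/week combination; drop = TRUE keeps only the groups
# that actually occur in the data.
groups <- split(df, list(df$year, df$week), drop = TRUE)
length(groups)  # 2 groups, not the 4 possible year/week combinations
```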
On Mon, Jul 12, 2010 at 3:10 PM, Ted Byers r.ted.by...@gmail.com wrote:
From the documentation I have found
The data.frame is constructed by one of the following functions:
funweek <- function(df)
if (length(df$elapsed_time) > 5) {
rv = fitdist(df$elapsed_time, "exp")
rv$year = df$sale_year[1]
rv$sample = df$sale_week[1]
rv$granularity = "week"
rv
}
funmonth <- function(df)
if
rbind after the loop on the list of such data.frames?
Thanks again,
Ted
On Thu, Jul 15, 2010 at 3:27 PM, Marc Schwartz marc_schwa...@me.com wrote:
On Jul 15, 2010, at 2:18 PM, Ted Byers wrote:
The data.frame is constructed by one of the following functions:
funweek <- function(df
]]
NULL
will preallocate a list of 5 elements, each of which can then be indexed to
contain a data frame that is a result of your looping operation.
HTH,
Marc
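A minimal sketch of Marc's preallocation suggestion (the loop body here is a placeholder for the poster's real per-group work):

```r
# Preallocate the list, then fill each slot with the data frame
# produced by one iteration of the loop.
results <- vector("list", 5)
for (i in 1:5) {
  results[[i]] <- data.frame(i = i, value = i^2)  # placeholder work
}
combined <- do.call(rbind, results)  # one data frame, 5 rows
```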
On Jul 15, 2010, at 2:58 PM, Ted Byers wrote:
Thanks Marc
The next part of the question, though, involves the fact
I must have missed something simple, but still, I don't know what.
I obtained my basic data as follows:
x <- sprintf("SELECT m_id,sale_date,YEAR(sale_date) AS
sale_year,WEEK(sale_date) AS sale_week,return_type,0.0001 +
DATEDIFF(return_date,sale_date) AS elapsed_time FROM
`merchants2`.`risk_input`
Hi Steve,
Thanks
Here is a tiny subset of the data:
dput(head(moreinfo, 40))
structure(list(m_id = c(171, 206, 206, 206, 206, 206, 206, 218,
224, 224, 227, 229, 229, 229, 229, 229, 229, 229, 229, 233, 233,
238, 238, 251, 251, 251, 251, 251, 251, 251, 251, 251, 251, 251,
251, 251, 251, 251, 251,
I would have thought this to be relatively elementary, but I can't find it
mentioned in any of my stats texts.
Please consider the following:
library(fitdistrplus)
fp = fitdist(y, "exp");
rate = fp$estimate;
sd = fp$sd
fOneWeek = exp(-rate*7); # fraction that happens within a week - y is
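A note on that last line: for an exponential distribution, exp(-rate*7) is the fraction still outstanding after 7 days (the tail probability), while the fraction occurring within the week is its complement, which pexp() gives directly. A sketch with made-up data:

```r
library(fitdistrplus)
y <- rexp(500, rate = 0.1)     # made-up elapsed times, in days
fp <- fitdist(y, "exp")
rate <- fp$estimate["rate"]
fWithinWeek <- pexp(7, rate)   # P(T <= 7) = 1 - exp(-rate * 7)
fAfterWeek  <- exp(-rate * 7)  # P(T > 7), the surviving fraction
```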
I am feeling rather dumb right now.
I created what I thought was a data.frame as follows:
aaa <- lapply(split(moreinfo, list(moreinfo$m_id), drop = TRUE), fun_m_id)
m_id_default_res <- do.call(rbind, aaa)
print("==")
m_id_default_res
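Put together with toy data, the split/lapply/rbind pattern looks like this (the implementation of fun_m_id here is a hypothetical stand-in; the poster's real one fits a distribution per group):

```r
# Toy stand-in for the poster's data
moreinfo <- data.frame(m_id = c(1, 1, 2),
                       elapsed_time = c(2.5, 3.5, 1.0))
# Stand-in per-group summary returning a one-row data frame
fun_m_id <- function(df) {
  data.frame(m_id = df$m_id[1],
             n = nrow(df),
             mean_elapsed = mean(df$elapsed_time))
}
aaa <- lapply(split(moreinfo, list(moreinfo$m_id), drop = TRUE), fun_m_id)
m_id_default_res <- do.call(rbind, aaa)  # one row per m_id
```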
My last query related to this referred to a problem with not being able to
store data. A suggestion was made to try to convert the data returned by
fitdist into a data.frame before using rbind. That failed, but provided the
key to solving the problem (which was to create a data.frame using the
Here is the function that makes the data.frames in the list:
funweek - function(df)
if (length(df$elapsed_time) 5) {
res = fitdist(df$elapsed_time,exp)
year = df$sale_year[1]
sample = df$sale_week[1]
mid = df$m_id[1]
estimate = res$estimate
sd = res$sd
samplesize =
I have R 2.10.1 and 2.9.1 installed, and both have RMySQL packages
installed.
A script I'd developed using an older version (2.8.?, I think) used RMySQL
too and an older version of MySQL (5.0.?), and worked fine at that time
(about a year and a half ago +/- a month or two).
But now, when I run
I tend to have a lot of packages installed, in part because of a wide
diversity of interests and a disposition of examining different ways to
accomplish a given task.
I am looking for a better way to upgrade all my packages when I upgrade the
version of R that I am running.
On looking at support
.
Thanks
Ted
On Thu, Apr 29, 2010 at 4:59 PM, Erik Iverson er...@ccbr.umn.edu wrote:
Ted Byers wrote:
I tend to have a lot of packages installed, in part because of a wide
diversity of interests and a disposition of examining different ways to
accomplish a given task.
I am looking for a better
I started a brand new session in R 2.10.1 (on Windows).
If it matters, I am running the community edition of MySQL 5.0.67, and it is
all running fine.
I am just beginning to examine the process of getting time series data from
one table in MySQL, computing moving averages and computing a
I am looking at a new project involving time series analysis. I know I can
complete the tasks involving VARMA using either dse or mAr (and I think
there are a couple others that might serve).
However, there is one task that I am not sure of the best way to proceed.
A simple example illustrates
I have not found anything about this except the following from the DBI
documentation :
Bind variables: the interface is heavily biased towards queries, as opposed
to general
purpose database development. In particular we made no attempt to define
“bind
variables”; this is a mechanism by which
To: Ted Byers
Cc: R-help Forum
Subject: Re: [R] Can RMySQL be used for a paramterized query?
I think you can do this:
ids <- dbGetQuery(conn, "SELECT id FROM my_table") other_table <-
dbGetQuery(conn, sprintf("SELECT * FROM my_other_table WHERE t1_id in
(%s)", paste(ids, collapse
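Since dbGetQuery returns a one-column data frame, the collapse should reference that column; a sketch along the lines of the suggestion (the connection details, table, and column names are illustrative):

```r
library(RMySQL)
con <- dbConnect(MySQL(), dbname = "mydb")  # credentials assumed
ids <- dbGetQuery(con, "SELECT id FROM my_table")
# Build an IN (...) list from the first query's result column
sql <- sprintf("SELECT * FROM my_other_table WHERE t1_id IN (%s)",
               paste(ids$id, collapse = ", "))
other_table <- dbGetQuery(con, sql)
dbDisconnect(con)
```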
OK, I have managed to use some of the basic processes of getting data from
my DB, passing it as a whole to something like fitdistr, &c. I know I can
implement most of what I need using a brute force algorithm based on a
series of nested loops. I also know I can handle some of this logic in a
I suspect I'm looking in the wrong places, so guidance to the relevant
documentation would be as welcome as a little code snippet.
I have time series data stored in a MySQL database. There is the usual DATE
field, along with a double precision number: there are daily values
(including only
I have hundreds of megabytes of price data time series, and perl
scripts that extract it to tab delimited files (I have C++ programs
that must analyse this data too, so I get Perl to extract it rather
than have multiple connections to the DB).
I can read the data into an R object without any
Hi Mark
Thanks for replying.
Here is a short snippet that reproduces the problem:
library(PerformanceAnalytics)
thedata = read.csv("K:\\Work\\SignalTest\\BP.csv", sep = "\t", header
= FALSE, na.strings="")
thedata
x = as.timeseries(thedata)
x
table.Drawdowns(thedata, top = 10)
Hi David,
Thanks for replying.
On Fri, Jul 3, 2009 at 8:08 PM, David Winsemiusdwinsem...@comcast.net wrote:
On Jul 3, 2009, at 7:34 PM, Ted Byers wrote:
Hi Mark
Thanks for replying.
Here is a short snippet that reproduces the problem:
library(PerformanceAnalytics)
thedata = read.csv(K
Hi Gabor, Thanks.
On Fri, Jul 3, 2009 at 8:25 PM, Gabor
Grothendieckggrothendi...@gmail.com wrote:
# 1. You can directly read your data into a zoo series like this:
Lines <- "8190 2009-06-16 49.30
8191 2009-06-17 48.40
8192 2009-06-18 47.72
8193 2009-06-19 48.83
8194 2009-06-22 46.85
8195
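For completeness, the inline-data idiom Gabor is using typically finishes along these lines (the column positions are assumed from the snippet above):

```r
library(zoo)
# Row number, date, price - as in the excerpt above
Lines <- "8190 2009-06-16 49.30
8191 2009-06-17 48.40
8192 2009-06-18 47.72"
# Read the text directly into a zoo series, indexed by the date column
z <- read.zoo(textConnection(Lines),
              index.column = 2,
              format = "%Y-%m-%d")
```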
Hi Mark,
Thanks.
Your example works fine. But I see you're struggling with the same
issue that I am. I also see the format of the dates in the dataset
you use in your example is the same format that my dates are in.
I just read it, so I haven't had a chance to investigate, but you
might take
Sorry, I should have read the read.zoo documentation before replying
to thank Gabor for his response.
Here is how it starts:
read.zoo {zoo} R Documentation
Reading and Writing zoo Series
Description
read.zoo and write.zoo are convenience functions for reading and
writing zoo series from/to text
On Fri, Jul 3, 2009 at 9:05 PM, Mark Knechtmarkkne...@gmail.com wrote:
On Fri, Jul 3, 2009 at 5:54 PM, Ted Byersr.ted.by...@gmail.com wrote:
Sorry, I should have read the read.zoo documentation before replying
to thank Gabor for his response.
Here is how it starts:
read.zoo {zoo} R
Erin,
I trust you know what you risk when you assume. ;-)
There IS a license, but it basically lets you copy or distribute it, or, in
your case, install on as many machines as you wish. It is the GNU GENERAL
PUBLIC LICENSE.
Like most open source software I use, the GNU license is in place
I found this a few months ago, but for the life of me I can't remember what
the function or package was, and I have had no luck finding it this week.
I have found, again, the functions for working with distributions like
Cauchy, F, normal, &c., and ks.test, but I have not found the functions for
it to the output from these two functions)? I just read
the help provided for each and neither mentions AIC.
Thanks again Ben
Ted
Ben Bolker wrote:
Ted Byers r.ted.byers at gmail.com writes:
I found this a few months ago, but for the life of me I can't remember
what
the function
I have a situation where there are two kinds of events: A and B. B does not
occur without A occurring first, and a percentage of A events lead to an
event B some time later and the remaining ones do not. I have n independent
samples, with a frequency of events B by week, until event B for a given
I found it easy to use R when typing data manually into it. Now I need to
read data from a file, and I get the following errors:
refdata =
read.table("K:\\MerchantData\\RiskModel\\refund_distribution.csv", header
= TRUE)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines,
Thanks one and all.
Actually, I used OpenOffice's spreadsheet to create the csv file, but I have
been using it long enough to know to specify how I wanted it, and sometimes,
when that proves annoying, I'll use Perl to finesse it the way I want it.
It seems my principal error was to assume that it
items
[1] 1 2 3
On Sun, Sep 21, 2008 at 9:01 PM, Ted Byers [EMAIL PROTECTED] wrote:
I have a number of files containing anywhere from a few dozen to a few
thousand integers, one per record.
The statement refdata18 =
read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header =
TRUE
/21/2008 08:01 PM Ted Byers wrote:
I have a number of files containing anywhere from a few dozen to a few
thousand integers, one per record.
The statement refdata18 =
read.csv("K:\\MerchantData\\RiskModel\\Capture.Week.18.csv", header =
TRUE, na.strings="") works fine, and if I type refdata18, I get
OK, I am now at the point where I can use fitdistr to obtain a fit of one of
the standard distributions to mydata.
It is quite remarkable how different the parameters are for different
samples drawn from the same system. Clearly the system itself is not
stationary.
Anyway, question 1: I
I am in a situation where I have to fit a distribution, such as Cauchy or
normal, to an empirical dataset. Well and good, that is easy.
But I wanted to assess just how good the fit is, using ks.test.
I am concerned about the following note in the docs (about the example
provided): Note that the
Cummings Center, Suite 2450
Beverly, MA 01915
www.agencourt.com
On Mon, Sep 22, 2008 at 12:26 PM, Ted Byers [EMAIL PROTECTED] wrote:
I am in a situation where I have to fit a distribution, such as Cauchy or
normal, to an empirical dataset. Well and good, that is easy.
But I wanted to assess
Can anyone explain such different output:
stableFit(s, alpha = 1.75, beta = 0, gamma = 1, delta = 0,
+ type = c("q", "mle"), doplot = TRUE, trace = FALSE, title = NULL,
+ description = NULL)
Title:
Stable Parameter Estimation
Call:
.qStableFit(x = x, doplot = doplot, title = title,
I have been reading, in various sources, that a poisson distribution is
related to binomial, extending the idea to include numbers of events in a
given period of time.
In my case, the hypergeometric distribution seems more appropriate, but I
need a temporal dimension to the distribution.
I have
I have weekly samples of two kinds of events: call them A and B. I have a
count of A events. These
change dramatically from one week to the next. I also have weekly counts
of B events that I can relate
to A events. Some fraction 'lambda' (between 0 and 1) of A events will
result in B
I just thought of a useful metaphor for the problem I face. I am dealing
with a problem in business finance, with two kinds of related events.
However, imagine you have a known amount of carbon (so many kilograms), but
you do not know what fraction is C14 (and thus radioactive). Only the C14
I am trying something I haven't attempted before and the available
documentation doesn't quite answer my questions (at least in a way I can
understand). My usual course of action would be to extract my data from my
DB, do whatever manipulation is necessary, either manually or using a C++
Getting the basic stuff to work is trivially simple. I can connect, and, for
example, get everything in any given table.
What I have yet to find is how to deal with parameterized queries or how to
do a simple insert (but not of a value known at the time the script is
written - I ultimately want
Thanks Jeffrey and Barry,
I like the humour. I didn't know about xkcd.com, but the humour on it is
familiar. I saw little Bobby Tables what seems like eons ago, when I first
started cgi programming.
Anyway, I recognized the risk of an injection attack with this use of
sprintf, but in this
://gsubfn.googlecode.com
On Tue, Oct 14, 2008 at 5:32 PM, Jeffrey Horner
[EMAIL PROTECTED] wrote:
Ted Byers wrote on 10/14/2008 02:33 PM:
Getting the basic stuff to work is trivially simple. I can connect,
and,
for
example, get everything in any given table.
What I have yet to find is how
In the example in the documentation, I see:
rs <- dbSendQuery(con,
"select Agent, ip_addr, DATA from pseudo_data order by Agent")
out <- dbApply(rs, INDEX = "Agent",
FUN = function(x, grp) quantile(x$DATA, names=FALSE))
Maybe I am a bit thick, but it took me a while, and a kind
I have examined the documentation for batch mode use of R:
R CMD BATCH [options] infile [outfile]
The documentation for this seems rather spartan.
Running R CMD BATCH --help gives me info on only two options: one for
getting help and the other to get the version. I see, further on, that
Here is what I tried:
optdata =
read.csv("K:\\MerchantData\\RiskModel\\AutomatedRiskModel\\soptions.dat",
header = FALSE, na.strings="")
optdata
attach(optdata)
for (i in 1:length(V4) ) { x = read.csv(V4[[i]], header = FALSE,
na.strings="");x }
And here is the outcome (just a few of the 60 records
. Not a list! Use single brackets --- V4[i] ---
and all will be well.
cheers,
Rolf
On Wed, Oct 15, 2008 at 4:46 PM, Ted Byers [EMAIL PROTECTED] wrote:
Here is what I tried:
optdata =
read.csv("K:\\MerchantData\\RiskModel\\AutomatedRiskModel\\soptions.dat",
header
it there also.
On Wed, Oct 15, 2008 at 5:36 PM, Ted Byers [EMAIL PROTECTED] wrote:
Actually, I'd tried single brackets first. Here is what I got:
for (i in 1:length(V4) ) { x = read.csv(V4[i], header = FALSE,
na.strings="");x }
Error in read.table(file = file, header = header, sep = sep, quote
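The underlying issue, as the later working script shows, is that read.csv turned the file-name column into a factor under the defaults of the R versions in this thread, so the value handed to the inner read.csv is a factor level rather than a character path; as.character() is the fix. A sketch with made-up file names:

```r
# Under the old stringsAsFactors = TRUE default, file names read
# from a file come back as a factor, not character strings:
paths <- data.frame(V4 = c("a.csv", "b.csv"), stringsAsFactors = TRUE)
is.factor(paths$V4)                 # TRUE: the "paths" are factor levels
fname <- as.character(paths$V4[1])  # convert before handing to read.csv
```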
Here is my little scriptlet:
optdata =
read.csv("K:\\MerchantData\\RiskModel\\AutomatedRiskModel\\soptions.dat",
header = FALSE, na.strings="")
attach(optdata)
library(MASS)
setwd("K:\\MerchantData\\RiskModel\\AutomatedRiskModel")
for (i in 1:length(V4) ) {
x = read.csv(as.character(V4[[i]]),
PM, Gabor Grothendieck
[EMAIL PROTECTED] wrote:
Put the data in an R data frame and use dbWriteTable() to
write it to your MySQL database directly.
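A sketch of that suggestion (the connection parameters, table name, and result columns are illustrative):

```r
library(RMySQL)
con <- dbConnect(MySQL(), dbname = "merchants2")  # credentials assumed
# One row of made-up model output
results <- data.frame(mid = 1, year = 2008, week = 42, rate = 0.13)
# append = TRUE adds rows to an existing table instead of replacing it
dbWriteTable(con, "risk_output", results, append = TRUE, row.names = FALSE)
dbDisconnect(con)
```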
On Wed, Oct 15, 2008 at 9:34 PM, Ted Byers [EMAIL PROTECTED] wrote:
Here is my little scriptlet:
optdata =
read.csv("K:\\MerchantData\\RiskModel
Thank you Prof. Ripley.
I appreciate this.
Have a good day.
Ted
On Thu, Oct 16, 2008 at 12:20 AM, Prof Brian Ripley
[EMAIL PROTECTED] wrote:
On Wed, 15 Oct 2008, Ted Byers wrote:
Thanks Jim,
I hadn't seen the distinction between the commandline in RGui and what
happens within my code
dbWriteTable(..., append = TRUE)
On Wed, Oct 15, 2008 at 11:54 PM, Ted Byers [EMAIL PROTECTED] wrote:
Thanks Gabor,
I get how to make a frame using existing vectors. In my example, the
following puts my first three columns into a frame (and displays it):
testframe <- data.frame(mid=V1, year=V2, week=V3
I thought I was finished, having gotten everything to work as intended. This
is a model of risk, and the short term forecasts look very good, given the
data collected after the estimates are produced (this model is intended to
be executed daily, to give a continuing picture of our risk). But
Define better.
Really, it depends on what you need to do (are all your data appropriately
represented in a 2D array?) and what resources are available. If all your
data can be represented using a 2D array, then Excel is probably your best
bet for the near term. If not, you might as well bite
There are tradeoffs no matter what route you take.
I worked on a project a few years ago, repairing an MS Access DB that had
been constructed, data entry forms and all, by one of the consulting
engineers. They supported that development because they found that even
with all the power and
most of those things.
Gabor Grothendieck wrote:
On Tue, Oct 21, 2008 at 3:18 PM, Ted Byers [EMAIL PROTECTED] wrote:
There are tradeoffs no matter what route you take.
You can do validation in Access as you can in Excel, but Excel is not
designed to manage data where Access is, and both
intended market: nothing more can be implied.
Rolf Turner-3 wrote:
On 22/10/2008, at 8:18 AM, Ted Byers wrote:
snip
... even with all the power and utility of Excel ...
snip
Is this some kind of joke?
cheers,
Rolf Turner
Ah, OK. That is new since I used Excel last.
Thanks
On Tue, Oct 21, 2008 at 5:52 PM, Gabor Grothendieck
[EMAIL PROTECTED] wrote:
You can create data entry forms without VB in Excel too.
On Tue, Oct 21, 2008 at 5:09 PM, Ted Byers [EMAIL PROTECTED] wrote:
I wasn't suggesting
For a model I am working on, I have samples organized by year and week of
the year. For this model, the data (year and week) comes from the basic
sample data, but I require a value representing the amount of time since the
sample was taken (actually, for the purpose of the model, it is sufficient
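One common way to get such an elapsed-time value is difftime; a minimal sketch (the sampling date is made up):

```r
sample_date <- as.Date("2008-03-15")   # made-up sampling date
# Weeks elapsed between the sample date and today
elapsed_weeks <- as.numeric(difftime(Sys.Date(), sample_date,
                                     units = "weeks"))
```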
Thanks Patrick.
On Tue, Jan 27, 2009 at 2:03 PM, Patrick Connolly
p_conno...@slingshot.co.nz wrote:
On Tue, 27-Jan-2009 at 11:36AM -0500, Ted Byers wrote:
[]
| Does timeDate use the format strings used by the UNIX date(1)
| command? If so, then can I safely assume timeDate
I wasn't even aware I was using midnightStandard. You won't find it in my
script.
Here is the relevant loop:
date1 = timeDate(charvec = Sys.Date(), format = "%Y-%m-%d")
date1
dow = 3;
for (i in 1:length(V4) ) {
x = read.csv(as.character(V4[[i]]), header = FALSE, na.strings="");
y = x[,1];
Hi Yohan, Thanks.
On Wed, Jan 28, 2009 at 4:57 AM, Yohan Chalabi chal...@phys.ethz.ch wrote:
TB == Ted Byers r.ted.by...@gmail.com
on Tue, 27 Jan 2009 16:00:27 -0500
TB I wasn't even aware I was using midnightStandard. You won't
TB find it in my
TB script.
TB
TB Here
Hi Yohan,
On Wed, Jan 28, 2009 at 10:28 AM, Yohan Chalabi chal...@phys.ethz.chwrote:
TB == Ted Byers r.ted.by...@gmail.com
on Wed, 28 Jan 2009 09:30:58 -0500
TB It is certain that all entries have the same format, but I'm
TB starting to
TB think that the error message is something
R version 2.12.0, 64 bit on Windows.
Here is a short script that illustrates the problem:
library(tseries)
library(xts)
setwd('C:\\cygwin\\home\\Ted\\New.Task\\NKs-01-08-12\\NKs\\tests')
x = read.table("quotes_h.2.dat", header = FALSE, sep="\t", skip=0)
str(x)
y <-
Hi Joshua,
Thanks.
I had used irts because I thought I had to. The tick data I have has some
minutes in which there is no data, and others when there are hundreds, or
even thousands. If xts supports irregular data, then that is one less step
for me to worry about.
Alas, your suggestion didn't
Thanks Joshua,
That did it.
Cheers,
Ted
--
View this message in context:
http://r.789695.n4.nabble.com/plotOHLC-alpha3-Error-in-plotOHLC-alpha3-x-is-not-a-open-high-low-close-time-series-tp4283217p4286963.html
Sent from the R help mailing list archive at Nabble.com.
Here is a relatively simple script (with comments as to the logic
interspersed):
# Some of these libraries are probably not needed here, but leaving them in
place harms nothing:
library(tseries)
library(xts)
library(quantmod)
library(fGarch)
library(fTrading)
library(ggplot2)
# Set the