http://cran.r-project.org/web/packages/data.table/index.html
On Wed, May 22, 2013 at 12:31 PM, ivo welch ivo.we...@anderson.ucla.edu wrote:
I have a couple of large data sets, on the order of 4GB. they come in .csv
files, with about 50 columns and lots of rows. a couple have weird NA
man cron
or something more robust:
http://jenkins-ci.org/
-Whit
On Mon, Feb 11, 2013 at 1:51 PM, Christofer Bogaso
bogaso.christo...@gmail.com wrote:
Hello again,
My query may look quite generic, however at this point of time I just
want to explain my problem. I am hopeful that somebody can
Why don't you use one of the existing MCMC packages? There are many to
choose from...
On Wed, Dec 12, 2012 at 10:49 PM, Chenyi Pan cp...@virginia.edu wrote:
Dear all
I am now running an MCMC iteration in R, but it always gets
stuck in some loop. This causes big problems for my
In addition to Michael's suggestions, you can also check out this
tutorial which shows how to use lapply into EC2.
http://www.rinfinance.com/agenda/2012/workshop/WhitArmstrong.pdf
Unfortunately, rzmq is not available on windows, so this may not be
the best solution for your setup.
-Whit
On
I don't work for Amazon, but here is one of their promo pieces on
using 'spot' instances:
http://youtu.be/WD9N73F3Fao
at about 2:15, they cite University of Melbourne and Universitat de
Barcelona as customers...
My interest in all this cloud talk is that I'll be presenting a
tutorial on R in the
You should think about the cloud as a serious alternative.
I completely agree with Barry. Unless you will utilize your machines
(and by utilize, I mean 100% cpu usage) all the time (including
weekends) you will probably better use your funds to purchase blocks
of machines when you need to run
Is putting a variable into a list a deep copy (and is tracemem the
correct way to confirm)?
warmstrong@krypton:~/dvl/R.packages$ R
> x <- rnorm(1000)
> tracemem(x)
[1] "<0x3214c90>"
> x.list <- list(x.in.list=x)
tracemem[0x3214c90 -> 0x2af0a20]:
Is it possible to put a variable into a list without
it.
Thanks,
Whit
On Thu, Apr 12, 2012 at 1:00 AM, Bert Gunter gunter.ber...@gene.com wrote:
On Wed, Apr 11, 2012 at 8:12 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
On Wed, Apr 11, 2012 at 10:17 PM, Whit Armstrong
armstrong.w...@gmail.com wrote:
I must admit I'm a little ashamed to have
I must admit I'm a little ashamed to have been using R for so long,
and still lack a sound understanding of deferred calls, eval, deparse,
substitute, and friends.
I'm attempting to make a deferred call to a function which has default
arguments in the following way:
call.foo <- function(f) {
/2.13/rzmq’
The downloaded packages are in
‘/tmp/RtmpoTdDMm/downloaded_packages’
Warning message:
In install.packages("rzmq", dependencies = TRUE) :
installation of package 'rzmq' had non-zero exit status
Thank you for your help!
Ben
On Wed, Dec 7, 2011 at 7:00 PM, Whit
I don't know where to start because, it looks like rzmq is not available for
Windows and it looks like AWS.tools and deathstar depend on rzmq, so by
Hence my reference to work. patches welcome.
Will using a
local Windows box continue to be an issue as I progress with R and EC2? I've
run
subscribe to R-hpc.
and check out these:
https://github.com/armstrtw/rzmq
https://github.com/armstrtw/AWS.tools
https://github.com/armstrtw/deathstar
and this:
http://code.google.com/p/segue/
If you're willing to work, you can probably get deathstar to work
using a local windows box and remote
I don't think you can share dbi connections across different instances of R.
just have each of your helper functions open a local connection. or
alternatively, load a package on each instance which keeps a dbi
connection open.
and make sure you bump up your allowed number of connections in
not everything has to be done in R.
awk and sed are some of the best tools on a linux/unix box.
quick refs:
http://www.pement.org/awk/awk1line.txt
http://sed.sourceforge.net/sed1line.txt
-Whit
On Wed, Apr 13, 2011 at 12:07 AM, Chris Howden
ch...@trickysolutions.com.au wrote:
Hi Everyone,
I think Dirk has recently done some things w/ boost date time as an Rcpp
based project bdt.
http://cran.r-project.org/web/packages/RcppBDT/ChangeLog
-Whit
On Mon, Apr 11, 2011 at 10:11 AM, Jorge Nieves jorge.nie...@moorecap.com wrote:
Hi,
I was wondering if anyone could point me to the
There are better alternatives for big data than to revert to C.
http://code.google.com/p/pymc/
http://github.com/armstrtw/CppBugs (still alpha)
-Whit
On Mon, Mar 14, 2011 at 11:06 AM, nblarson nblar...@gmail.com wrote:
Has anybody had issues running MCMC (either BUGS or JAGS) on data sets of
index m as a vector and do the assignment in one step
i <- df$row + (df$col-1)*nrow(m)
m[i] <- df$a
or something along those lines.
-Whit
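The one-step assignment above relies on R storing matrices in column-major order, so element (row, col) sits at linear position row + (col-1)*nrow(m). A minimal sketch, assuming a small made-up data frame with columns named row, col and a (the values are illustrative only):

```r
# Hypothetical data: three (row, col, value) triples.
df <- data.frame(row = c(1, 3, 2), col = c(1, 1, 2), a = c(10, 20, 30))

# Target matrix, pre-filled with NA.
m <- matrix(NA_real_, nrow = 3, ncol = 2)

# Column-major linear index of each (row, col) pair.
i <- df$row + (df$col - 1) * nrow(m)
m[i] <- df$a

# Same result using a two-column index matrix, which base R also supports.
m2 <- matrix(NA_real_, nrow = 3, ncol = 2)
m2[cbind(df$row, df$col)] <- df$a
```

Both forms fill all cells in a single vectorized assignment; the index-matrix form avoids doing the arithmetic by hand.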
On Tue, Dec 7, 2010 at 1:31 PM, Cutler, Gene gcut...@amgen.com wrote:
I have a data frame with three columns, x, y, and a. I want to create a
matrix from
http://hudson-ci.org/
give hudson a try. It's incredibly easy to set up, and handles job
dependencies and notifications for job failures.
Its suggested use case is for automated software builds, but it fits
the role of scheduled jobs (and interjob dependencies) very well.
-Whit
On Fri, Nov
http://hudson-ci.org
On Wed, Oct 27, 2010 at 8:49 AM, david.jes...@ubs.com wrote:
Gabor
As someone trying to get the rest of my team using Subversion (which I have used
for a while, but more as a backup / record of changes), have you a neat /
automated way of building a package from a
I've looked at the Kim/Nelson gauss code before, and I applaud your
effort to convert it to R.
I'm happy to have a look at it for you if you are willing to share your example.
-Whit
On Tue, Oct 26, 2010 at 4:13 AM, Houge jb.ho...@gmail.com wrote:
Greetings fellow R enthusiasts!
We have some
Marc is exactly right about people having strong opinions.
R-forge is really the _only_ reason to consider using svn.
git is where the world is headed. This video is a little old:
http://www.youtube.com/watch?v=4XpnKHJAok8, but does a good job
getting the point across.
Hg is a good
http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010:nosql_interface
http://github.com/wactbprot/R4CouchDB
On Mon, Aug 16, 2010 at 7:40 AM, David Mitchell monch1...@gmail.com wrote:
Hello all,
I'm kind of surprised that searching the archives and Googling haven't given
me a
It isn't beautiful, but I use this package to write excel files from linux.
http://github.com/armstrtw/Rexcelpoi
the basic idea is that each element of a list is written as a separate
sheet, but if a list element is itself a list, then all the elements
of that list are written to the same sheet
?strptime
‘%B’ Full month name in the current locale. (Also matches
abbreviated name on input.)
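A small sketch of how %B might be applied to a string like "March-2009". A month and year alone do not pin down a day, so the example assumes it is acceptable to prepend a day of "01"; it also forces the C locale for LC_TIME so that English month names are matched, which is an assumption about the input.

```r
# %B matches month names in the current locale; use C (English) here.
invisible(Sys.setlocale("LC_TIME", "C"))

s <- "March-2009"

# Prepend an arbitrary day so the string describes a complete date.
d <- as.Date(paste0("01-", s), format = "%d-%B-%Y")

# Reformat as needed, e.g. year and month only.
ym <- format(d, "%Y-%m")
```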
On Wed, Jul 7, 2010 at 8:40 AM, Christofer Bogaso
bogaso.christo...@gmail.com wrote:
Dear all, I have a date related question. Suppose I have a character string
March-2009, how I can
http://github.com/armstrtw/unifieddbi
which I use on 64bit linux. you are welcome to test it for 64 bit
windows. are you able to compile yourself? or do you need a packaged
version?
-Whit
2010/6/27 顾小波 guxiaobo1...@gmail.com:
Hi,
I post this message to the general r-help list hoping
library(fts)
x <- fts(data=rnorm(1e6))
system.time(xrnk <- moving.rank(x,500))
   user  system elapsed
   0.68    0.00    0.68
you will have to disguise your data as a time series to use fts.
see below the exact implementation of rank that is used.
-Whit
template<typename ReturnType>
class
Pick up Rcpp, make your life easier.
http://dirk.eddelbuettel.com/code/rcpp.html
-Whit
On Fri, Mar 5, 2010 at 9:19 AM, alex46...@yahoo.com wrote:
Hope I can get quick help from here, I have a bunch of c, c++ included main
function and makefile. It works well on both UNIX and windows. I
?expand.grid
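One way expand.grid could be used for the question quoted below: give each variable a FALSE/TRUE switch, so that every row of the grid is one subset of the variables. This is only a sketch; the variable names x1..x8 and the response name y are hypothetical.

```r
vars <- paste0("x", 1:8)

# One FALSE/TRUE column per variable; each row selects a subset.
grid <- expand.grid(rep(list(c(FALSE, TRUE)), length(vars)))
names(grid) <- vars

# Keep subsets of size two or more (pairs, threes, fours, ...).
grid <- grid[rowSums(grid) >= 2, ]

# Build one model formula per subset; 'y' is a placeholder response.
formulas <- lapply(seq_len(nrow(grid)), function(i) {
  reformulate(vars[unlist(grid[i, ])], response = "y")
})
```

Each formula can then be passed to lm() along with a data frame holding the actual columns.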
On Fri, Jan 8, 2010 at 3:26 PM, Richardson, Patrick
patrick.richard...@vai.org wrote:
Let's say I have 8 variables and I want to generate all combinations of those
variables (in pairs, threes, fours, etc.) to run in multiple linear regression.
Is there a built-in function to do
We have been using pymc as an alternative to WinBUGS, and have been
very pleased with it. I've begun working on an R2Pymc package, but
don't have anything ready for sharing yet.
Here's the pymc page:
http://code.google.com/p/pymc/
and the repo is here:
http://github.com/pymc-devs/pymc
I've
warmstr...@research:~$ R
> strptime("12/9/2007","%m/%d/%Y")
[1] "2007-12-09"
> format(strptime("12/9/2007","%m/%d/%Y"),"%Y%m%d")
[1] "20071209"
On Tue, Jul 21, 2009 at 1:16 PM, liujb liujul...@yahoo.com wrote:
Hello,
I have a set of data that has a Date column looks like this:
12/9/2007
12/16/2007
I'm running a huge number of regressions in a loop, so I tried lm.fit
for a speedup. However, I would like to be able to calculate the
t-stats for the coefficients.
Does anyone have some functions for calculating the regression summary
stats of an lm.fit object?
Thanks,
Whit
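For reference, a sketch of how the t-stats could be recovered from an lm.fit result using only base R. It is not the code from the reply in this thread; it assumes a full-rank design matrix (so no column pivoting occurred) and uses simulated data.

```r
set.seed(1)
n <- 200
X <- cbind(1, rnorm(n), rnorm(n))            # design matrix with intercept
y <- drop(X %*% c(1, 2, -0.5) + rnorm(n))    # simulated response

fit <- lm.fit(X, y)

# Residual variance estimate.
sigma2 <- sum(fit$residuals^2) / fit$df.residual

# (X'X)^{-1} from the R factor of the QR decomposition stored in the fit.
p <- fit$rank
R <- qr.R(fit$qr)[seq_len(p), seq_len(p), drop = FALSE]
XtX.inv <- chol2inv(R)

# Standard errors, t statistics and two-sided p-values.
se <- sqrt(diag(XtX.inv) * sigma2)
tstat <- fit$coefficients / se
pval <- 2 * pt(abs(tstat), df = fit$df.residual, lower.tail = FALSE)
```

The result can be checked against summary(lm(y ~ X - 1)) on a single regression before using it inside a loop.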
Marc,
Thanks very much for your detailed reply. I'll give your code a try
and post back the time difference.
Cheers,
Whit
On Wed, Jul 8, 2009 at 10:50 AM, Marc Schwartz marc_schwa...@me.com wrote:
On Jul 8, 2009, at 8:51 AM, Whit Armstrong wrote:
I'm running a huge number of regressions
Seems strange. I can go all the way up to 50GB on our machine which
has 64GB as well. It starts swapping after that, so I killed the
process.
try this:
ans <- list()
for(i in 1:100) {
ans[[ i ]] <- numeric(2^30/2)
cat("iteration: ",i,"\n")
print(gc())
}
source("scripts/test.memory.r")
assuming you pull the data you want into x and y:
w...@ubuntu:~$ R
library(fts)
x <- fts()
y <- fts()
xy.cor.200 <- moving.cor(x,y,200)
tail(xy.cor.200)
[,1]
2012-03-12 -0.3009635
2012-03-13 -0.2923489
2012-03-14 -0.2824015
2012-03-15 -0.2662689
2012-03-16 -0.2566354
2012-03-17
you have a couple of options.
If you require specific R functions to do what you want, then you will
need to call R from C.
I believe that Dirk has been working on an RInside package that does this.
Alternatively, you can use my tslib package, which is a general time
series library written in
try littler:
warmstr...@linuxsvr2:/tmp$ export MYVALUE=`r -e 'cat(10)'`
warmstr...@linuxsvr2:/tmp$ env|grep MYVALUE
MYVALUE=10
warmstr...@linuxsvr2:/tmp$
On Wed, Apr 22, 2009 at 10:48 AM, Bierbryer, Andrew
abierbr...@klsdiversified.com wrote:
If I have an R script that I am executing from a
can you show the list a more specific example of what you are trying to do?
most of the database packages support writeTable commands (dbWriteTable in the DBI-based ones). So, if you
can represent the data you are trying to write in a dataframe, then
you can probably send it to the database with R.
-Whit
On Wed, Mar 18, 2009 at
if you don't find the solution you need, I have a package that uses
Apache POI to do this, but you will need to compile it yourself.
contact me if you want to go this route.
-Whit
On Mon, Mar 9, 2009 at 3:34 PM, Patrick Connolly
p_conno...@slingshot.co.nz wrote:
On Mon, 09-Mar-2009 at 02:34PM
you want:
ans <- intersect(data1,data2)
class(ans) <- c("POSIXt","POSIXct")
I personally think intersect should preserve the class of the object
(if both args have the same class), but I think r-core has a different
opinion.
-Whit
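A sketch of an alternative that does not depend on whether intersect keeps the class: intersect the underlying seconds-since-epoch values and rebuild the date-times afterwards. The two vectors and the UTC time zone are made up for illustration. Note that current versions of R order the class vector as c("POSIXct", "POSIXt"), so going through as.POSIXct avoids hard-coding it.

```r
# Hypothetical inputs: two overlapping runs of daily timestamps.
data1 <- as.POSIXct("2009-01-20", tz = "UTC") + 86400 * (0:3)
data2 <- as.POSIXct("2009-01-22", tz = "UTC") + 86400 * (0:3)

# Intersect the plain numeric values (seconds since 1970-01-01 UTC).
common <- intersect(as.numeric(data1), as.numeric(data2))

# Rebuild POSIXct objects from the numeric result.
ans <- as.POSIXct(common, origin = "1970-01-01", tz = "UTC")
```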
On Fri, Jan 23, 2009 at 9:02 AM, Tom La Bone boo...@gforcecable.com
I take a similar approach by storing my vcv's in a list w/ the date
stored as a character vector %y-%m-%d as the list names. That way
you can easily grab the vcv you need by casting your date to a string
and using it to index the list.
not sure if that will work for you.
hth,
Whit
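The list-of-matrices idea above might look roughly like this. The dates and matrices are placeholders, and the sketch uses a four-digit year format ("%Y-%m-%d") as its own choice for the names.

```r
# Hypothetical dates and covariance matrices.
dates <- as.Date(c("2008-12-01", "2008-12-02"))
vcvs <- list(diag(2), 2 * diag(2))

# Name each matrix by its date, formatted as a string.
names(vcvs) <- format(dates, "%Y-%m-%d")

# Look up the matrix for a given date by casting the date to a string.
lookup <- as.Date("2008-12-02")
v <- vcvs[[format(lookup, "%Y-%m-%d")]]
```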
On Tue, Dec
, 2008 at 01:16:46PM -0500, Whit Armstrong wrote:
I have a network of four machines set up. I'm having trouble spawning
my slaves on these machines.
All the examples I have found so far use makeCluster with type="MPI",
and I guess I'm missing some kind of cluster configuration in my
environment
,
On 19 December 2008 at 10:17, Whit Armstrong wrote:
| Does anyone know if these errors can be safely ignored?
|
| [linuxsvr.kls.corp:16242] mca: base: component_find: unable to open
| osc pt2pt: file not found (ignored)
|
| this is on RHEL5 w/ openMPI 1.2.7
Yes. Hao (of Rmpi fame) and I
if you want the speed, you can simply build an fts time series from
it, then apply the moving.sum function and throw away the dates.
this will probably be the fastest implementation of rolling applies
out there unless you do a cumsum difference function.
I got a sample timing of 2 seconds on 12m
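The "cumsum difference" approach mentioned above can be sketched in base R as follows. The function name rolling_sum is made up for this example; it returns one value per complete window of width k.

```r
# Rolling sum of width k via differences of the cumulative sum.
rolling_sum <- function(x, k) {
  stopifnot(k >= 1, k <= length(x))
  cs <- c(0, cumsum(x))
  # Sum of x[j..j+k-1] equals cs[j+k] - cs[j].
  cs[(k + 1):length(cs)] - cs[1:(length(cs) - k)]
}

x <- c(1, 2, 3, 4, 5)
rs <- rolling_sum(x, 3)   # windows: 1+2+3, 2+3+4, 3+4+5
```

For long floating-point series the cumulative sum can accumulate rounding error, which is one reason to prefer a dedicated moving-window routine.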
I have a network of four machines set up. I'm having trouble spawning
my slaves on these machines.
All the examples I have found so far use makeCluster with type="MPI",
and I guess I'm missing some kind of cluster configuration in my
environment variables because all my clusters are formed on the
for a simple example:
x <- list()
x[["a"]] <- list(a=c(1,2,3),b=c(3,4,5))
x[["b"]] <- list(a=c(6,7,8),b=c(9,10,11))
lapply(x,sum)
this fails w/
Error in FUN(X[[1L]], ...) : invalid 'type' (list) of argument
Just wondering if I have overlooked something obvious.
one can also do:
...@stats.ox.ac.uk wrote:
On Thu, 11 Dec 2008, Whit Armstrong wrote:
for a simple example:
x <- list()
x[["a"]] <- list(a=c(1,2,3),b=c(3,4,5))
x[["b"]] <- list(a=c(6,7,8),b=c(9,10,11))
lapply(x,sum)
this fails w/
Error in FUN(X[[1L]], ...) : invalid 'type' (list) of argument
Just wondering
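The error in the example arises because each element handed to sum is itself a list. Two base R workarounds, given here as a sketch and not as the answer from the thread: flatten each element before summing, or use rapply to sum every leaf vector.

```r
x <- list()
x[["a"]] <- list(a = c(1, 2, 3), b = c(3, 4, 5))
x[["b"]] <- list(a = c(6, 7, 8), b = c(9, 10, 11))

# One total per top-level element: flatten the inner list first.
totals <- lapply(x, function(e) sum(unlist(e)))

# One sum per leaf vector, keeping the nested list structure.
leaf.sums <- rapply(x, sum, how = "list")
```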
yes, that is correct. I was looking in text mode.
ok, thanks for your help.
-Whit
On Thu, Dec 11, 2008 at 4:02 PM, Prof Brian Ripley
rip...@stats.ox.ac.uk wrote:
On Thu, 11 Dec 2008, Whit Armstrong wrote:
Thanks, Gabor and Prof. Ripley.
Sorry for the oversight.
I grepped the lapply help
I've had a good experience with the ROracle driver. Any reason why
you need RODBC?
-Whit
On Mon, Dec 1, 2008 at 10:17 AM, Prof Brian Ripley
[EMAIL PROTECTED] wrote:
On Fri, 28 Nov 2008, Simon Collins wrote:
Hi
I'm presently trying to connect to Oracle through RODBC / UnixODBC on
linux
I know it's easy to write a simple loop to do this, but in the spirit
of lapply, I thought I would ask if there is a builtin to filter or
take a subset of a list based on a predicate in a similar way to the
Erlang lists:filter/2 function:
http://www.erlang.org/doc/man/lists.html#filter-2
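For comparison, base R has a Filter function that takes a predicate and a list, much like lists:filter/2. A minimal sketch with made-up data:

```r
x <- list(a = 1:3, b = 4:10, c = 11:12)

# Keep only the elements for which the predicate returns TRUE.
long <- Filter(function(e) length(e) > 2, x)

# The same subset written with sapply and logical indexing.
long2 <- x[sapply(x, function(e) length(e) > 2)]
```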
, Whit Armstrong
[EMAIL PROTECTED] wrote:
I know it's easy to write a simple loop to do this, but in the spirit
of lapply, I thought I would ask if there is a builtin to filter or
take a subset of a list based on a predicate in a similar way to the
Erlang lists:filter/2 function:
http
Anyone know a quick way to color one bar of a histogram?
I want to mark the bar in which the most recent observation falls.
So, for instance:
x <- rnorm(100)
latest.ob <- x[100]
hist(x)
## how do I mark the bucket that latest.ob falls into?
Thanks,
Whit
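One possible approach, sketched in base R and not taken from the reply in this thread: compute the histogram without plotting, find which bin the latest observation falls into, and pass a vector of colors to plot. The colors and seed are arbitrary.

```r
set.seed(42)
x <- rnorm(100)
latest.ob <- x[100]

# Compute the bins without drawing anything yet.
h <- hist(x, plot = FALSE)

# Bin index of the latest observation; hist's default bins are
# right-closed with the lowest value included, so cut is set to match.
idx <- cut(latest.ob, breaks = h$breaks, include.lowest = TRUE, labels = FALSE)

# One color per bar, with the bar of interest highlighted.
cols <- rep("grey80", length(h$counts))
cols[idx] <- "red"
plot(h, col = cols, main = "Histogram of x")
```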
That's great, Peter.
Thanks very much.
-Whit
On Tue, Oct 28, 2008 at 3:13 PM, Peter Dalgaard
[EMAIL PROTECTED] wrote:
Whit Armstrong wrote:
Anyone know a quick way to color one bar of a histogram?
I want to mark the bar in which the most recent observation falls.
So, for instance:
x
I'm wrapping boost date_time into an R package. I'll post it up to
cran shortly.
http://www.boost.org/doc/libs/1_36_0/doc/html/date_time.html
I'm not sure if that is what you are looking for, but there are a lot
of useful utilities in this library.
-Whit
On Thu, Sep 11, 2008 at 11:02 AM,
probably not pre-canned routines for that, but very easy to implement
with the tools provided in the library.
Looks like most of what you want to do is fairly simple and not worth
the trouble of involving c++.
but things like month_durations and year_durations make it clear that
the authors have
are the chunks on which you need to apply the function rolling windows?
do they overlap?
I have some c++ template utilities that I use for window functions (on
timeseries objects) which you are welcome to copy and modify to fit your
problem.
they are available here:
git://repo.or.cz/fts.git