
[R] dmnorm not meant for 1024-dimensional data?

2007-04-25 Thread Daniel Elliott

Hello,

I have some data generated by a simple mixture of Gaussians (more like
K-means) and (as a test) am using dmnorm to calculate the probability
of each data point coming from each Gaussian.  However, I get only
zero probabilities.

This code works in low dimensions (I have tried 2 and 3).  I have run
into many implementations that do not work in high dimensions, but I
thought I was safe with dmnorm because it has an option to compute the
log of the probability.

So, is dmnorm not intended to be used with data of such high dimensionality?

Thank you,

dan elliott
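
A minimal sketch of the likely cause, assuming identity-covariance components and equal mixing weights (both are assumptions; the post does not give the actual model). It uses base R's dnorm rather than dmnorm so that it runs without extra packages. In 1024 dimensions the density itself is far below the smallest representable double, so it prints as zero, while the log-density stays finite. Membership probabilities can then be computed entirely on the log scale with the log-sum-exp trick. The names means, log_dens and post are illustrative only.

```r
# Illustrative setup: two spherical Gaussians in d = 1024 dimensions.
set.seed(1)
d <- 1024
means <- list(rep(0, d), rep(0.5, d))   # hypothetical component means
x <- rnorm(d, mean = 0, sd = 1)         # one point drawn from the first component

# Log-density under each component: with identity covariance it is the
# sum of the univariate log-densities.
log_dens <- sapply(means, function(m) sum(dnorm(x, mean = m, sd = 1, log = TRUE)))

# Exponentiating underflows to zero in double precision.
dens <- exp(log_dens)

# Posterior membership probabilities via log-sum-exp (equal weights assumed).
m <- max(log_dens)
log_post <- log_dens - (m + log(sum(exp(log_dens - m))))
post <- exp(log_post)

print(log_dens)
print(dens)
print(post)
```

With dmnorm the same idea would apply: request the log-density and only exponentiate after subtracting the maximum across components.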

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Organisation of medium/large projects with multiple analyses

2006-10-28 Thread Daniel Elliott

Mark,

It sounds like your data/experiment storage and organization needs are more
complicated than mine, but I'll share my methodology...

> I'm still new to R, but have a fair experience with general programming.
> All of my data is stored in postgresql, and I have a number of R files
> that generate tables, results, graphs etc.  These are then available to
> be imported into powerpoint/latex etc.
>
> I'm using version control (subversion), and as with most small projects,
> now have an ever increasing number of R scripts, each with fairly
> specific features.

I only use version control for generic code.  For me, generic code is not
at the experiment level but at the algorithm level.  It is only code that
others would find useful - code that I hope to release to the R community.
I use object-oriented programming to simplify the more specific,
experiment-level scripts that I will describe later.  These objects include
plotting and data import/export among other things.

Like you, many of my experiments are variations on the same theme.  I have
attempted general functions that can run many different experiments with
changes only to parameters, but I have found this far too cumbersome.

I am now resigned to storing all code and input and generated output data
and graphs together in a single directory for each experiment with the
exception of my general libraries.  This typically consists of me copying
the scripts that ran other experiments into a new directory where they are
(hopefully only slightly) modified to fit the new experiment.  I wish I had
a cooler way to handle all of this, but this does make it very easy to rerun
stuff.  I even create new files, but not necessarily new directories, for
scripts that differ only in the parameters they used when calling functions
from my libraries.

> Do you go to the effort of creating a library that solves your
> particular problem, or only reserve that for more generic functionality?

I only use libraries and classes for code that is generic enough to be
usable by the rest of the R community.

> Do people keep all of their R scripts for a specific project separate,
> or in one big file?

Files for a particular project are kept in many different directories with
little structure.  Experiment logs (like informal lab reports) are used if I
need to revisit or rerun an experiment.  By the way, I back all of this
stuff onto tape drive or DVD.

> I can see advantages (knowing it all works) and
> disadvantages (time for it all to run after minor changes) in both
> approaches, but it is unclear to me which is better. I do know that
> I've set-up a variety of analyses, moved on to other things, only to
> find later on that old scripts have stopped working because I've changed
> some interdependency. Does anyone go as far as to use test suites to
> check for sane output (apart from doing things manually)?  Note I'm not
> asking about how to run R on all these scripts, as people have already
> suggested makefiles.

I try really, really hard to never change my libraries.  If I need to
modify one of the algorithms in a library, I create a new method within the
same library.  Since you use version control (which is totally awesome; do
you use it for your writing as well?), hopefully you will be able to quickly
figure out why an old script doesn't work (in theory, this should only be
caused by function name changes).
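
A minimal sketch of that habit, with purely hypothetical function names (center_data and center_data_median are made up for illustration and are not from any real library). The old function is left alone, the new behaviour gets its own name, and a couple of stopifnot checks act as a tiny sanity test that an old script can run before its output is trusted.

```r
# Hypothetical library function, left untouched so old experiment scripts
# keep producing the same results.
center_data <- function(x) x - mean(x)

# New variant added alongside it instead of editing center_data() in place.
center_data_median <- function(x) x - median(x)

# Small sanity checks on known input; stopifnot() halts with an error
# if any condition is false.
x <- c(1, 2, 3, 10)
stopifnot(isTRUE(all.equal(mean(center_data(x)), 0)))
stopifnot(median(center_data_median(x)) == 0)
```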

> I realise these are vague high-level questions, and there won't be any
> right or wrong answers, but I'm grateful to hear about different
> strategies in organising R analyses/files, and how people solve these
> problems? I've not seen this kind of thing covered in any of the
> textbooks. Apologies for being so verbose!

Not sure one could be TOO verbose here!  I am constantly looking for
bulletproof ways to manage these complex issues.  Sadly, in the past, I may
have done so to a fault.  I feel that the use of version control for generic
code and formal writing is very important.

Hope this helps.  Maybe we could someday come up with a metalanguage to
describe our experiments.

- dan elliott

[R] function to normalize vectors?

2006-10-26 Thread Daniel Elliott

Hello all.

I can find no function to compute norms (even the basic two-norm) of a
vector in the online help (within the GUI) or the downloadable
documentation.

Thanks.

- dan elliott
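
For reference, the two-norm can be written directly with base R arithmetic. This is a minimal sketch; the helper names two_norm and normalize are made up here and are not existing R functions.

```r
# Two-norm (Euclidean length) of a numeric vector.
two_norm <- function(x) sqrt(sum(x^2))

# Scale a vector to unit length. A zero vector has no direction,
# so stop with an error rather than divide by zero.
normalize <- function(x) {
  n <- two_norm(x)
  if (n == 0) stop("cannot normalize a zero vector")
  x / n
}

v <- c(3, 4)
two_norm(v)    # 5
normalize(v)   # 0.6 0.8
```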
