On 7 Feb 2008, at 19:22, Ted Dunning wrote:
There are many alternatives.
The simplest is tab-delimited files with a header line. That works
pretty well almost all of the time.
For instance, most of the UCI datasets are in pretty much that format.
Most of my data sets wind up in that format. Anything from a relational
database falls into that format pretty easily.
I'd say that is more or less the same thing as ARFF, except that ARFF
has a typed header, is comma-delimited, and can optionally be stored in
a sparse format.
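To make the comparison concrete, here is a minimal sketch of reading the
tab-delimited layout Ted describes: one header line with column names,
then one instance per line. The class name TabDelimited and the column
names in the usage note are made up for illustration. Unlike an ARFF
header, nothing here carries type information, so every value stays a
string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of any existing library.
class TabDelimited {
    final List<String> header;                        // column names from the first line
    final List<String[]> rows = new ArrayList<>();    // one String[] per instance

    TabDelimited(String text) {
        String[] lines = text.split("\r?\n");
        // limit -1 keeps trailing empty fields, so a row with a missing
        // last value still has as many columns as the header
        header = Arrays.asList(lines[0].split("\t", -1));
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                continue;                             // skip blank lines
            }
            rows.add(lines[i].split("\t", -1));
        }
    }
}
```

For example, new TabDelimited("sepal_length\tspecies\n5.1\tsetosa\n")
gives a two-column header and a single row. Typing the columns, as an
ARFF header would, is left to the caller in this sketch.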
The file format is not that important to me (and of course it should
be an interchangeable strategy); all I want is to get started on a data
access API. At this point I don't care what matrix or whatnot will be
used for speedy access; I want the API used to load the matrix with
data: a seekable instance enumerator working straight off the file
system. InstanceReader, InstanceWriter. It would allow me to get
started with preprocessing filters (resampling, discretization, etc.).
karl
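One way the InstanceReader and InstanceWriter idea above could look is
sketched below. Only the two interface names come from the message; every
method signature, the Tab* implementation classes, and the choice of a
tab-delimited backing file are assumptions made for illustration. The
reader scans the file once to record the byte offset of each line, so
seeking to an instance never loads the data into memory.

```java
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.RandomAccessFile;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical API: the method set is an assumption, not a proposal from the thread.
interface InstanceReader extends Closeable {
    int size();                                  // number of instances
    void seek(int index) throws IOException;     // position before instance `index`
    boolean hasNext();
    String[] next() throws IOException;          // attribute values of the next instance
}

interface InstanceWriter extends Closeable {
    void write(String[] values) throws IOException;
}

// Writes a header line followed by one tab-delimited line per instance.
class TabInstanceWriter implements InstanceWriter {
    private final Writer out;

    TabInstanceWriter(File target, String[] header) throws IOException {
        out = new OutputStreamWriter(new FileOutputStream(target), StandardCharsets.UTF_8);
        write(header);
    }

    public void write(String[] values) throws IOException {
        out.write(String.join("\t", values));
        out.write('\n');
    }

    public void close() throws IOException {
        out.close();
    }
}

// Reads instances straight from disk via an index of line offsets.
class TabInstanceReader implements InstanceReader {
    private final RandomAccessFile file;
    private final List<Long> offsets = new ArrayList<>();  // byte offset of each instance line
    private final String[] header;
    private int position = 0;

    TabInstanceReader(File source) throws IOException {
        file = new RandomAccessFile(source, "r");
        header = readRow();
        long offset = file.getFilePointer();
        while (file.readLine() != null) {        // every line after the header is an instance
            offsets.add(offset);
            offset = file.getFilePointer();
        }
    }

    String[] header() { return header.clone(); }

    public int size() { return offsets.size(); }

    public void seek(int index) {
        if (index < 0 || index > offsets.size()) {
            throw new IndexOutOfBoundsException("instance " + index);
        }
        position = index;
    }

    public boolean hasNext() { return position < offsets.size(); }

    public String[] next() throws IOException {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        file.seek(offsets.get(position++));
        return readRow();
    }

    public void close() throws IOException {
        file.close();
    }

    private String[] readRow() throws IOException {
        String raw = file.readLine();
        if (raw == null) {
            throw new IOException("unexpected end of file");
        }
        // readLine() maps each byte to one char, so re-decode the bytes as UTF-8
        String line = new String(raw.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
        return line.split("\t", -1);
    }
}
```

A resampling filter could then call seek() with random indices and copy
each next() result to an InstanceWriter. RandomAccessFile.readLine() is
unbuffered, so a serious implementation would likely want its own
buffering; the file format behind the two interfaces stays swappable.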
On 2/7/08 10:15 AM, "Karl Wettin" <[EMAIL PROTECTED]> wrote:
On 5 Feb 2008, at 00:51, Grant Ingersoll wrote:
I haven't used Weka much, is ARFF
I have never used anything but Weka much; are there any alternatives
to ARFF?