On 7 Feb 2008, at 19:22, Ted Dunning wrote:

There are many alternatives.

The simplest is tab-delimited files with a header line. That works pretty
well almost all of the time.

For instance, most of the UCI datasets are in pretty much that format. Most of my data sets wind up in that format. Anything from a relational database
falls into that format pretty easily.
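
To make the "tab-delimited with a header line" idea concrete, here is a minimal sketch in Java. The class name TabDelimitedTable and its fields are hypothetical, invented for illustration only; it assumes every row has exactly as many fields as the header and that fields contain no embedded tabs or newlines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of any existing library.
final class TabDelimitedTable {
    final List<String> header;       // column names from the first line
    final List<List<String>> rows;   // one list of raw string fields per data line

    private TabDelimitedTable(List<String> header, List<List<String>> rows) {
        this.header = header;
        this.rows = rows;
    }

    // Reads a header line followed by data lines, splitting each on tabs.
    static TabDelimitedTable read(Reader in) {
        try {
            BufferedReader reader = new BufferedReader(in);
            String first = reader.readLine();
            if (first == null) {
                throw new IllegalArgumentException("missing header line");
            }
            // A limit of -1 keeps trailing empty fields instead of dropping them.
            List<String> header = Arrays.asList(first.split("\t", -1));
            List<List<String>> rows = new ArrayList<>();
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                List<String> fields = Arrays.asList(line.split("\t", -1));
                if (fields.size() != header.size()) {
                    throw new IllegalArgumentException(
                        "expected " + header.size() + " fields but got " + fields.size());
                }
                rows.add(fields);
            }
            return new TabDelimitedTable(header, rows);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

All values stay as strings here; unlike ARFF, the header carries no type information, so any typing would have to be inferred or supplied separately.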

I'd say that is more or less the same thing as ARFF, except that ARFF has a typed header, is comma-delimited, and can optionally be stored in a sparse format.
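
As an illustration of the sparse format: an ARFF sparse data row lists only the non-zero values as zero-based "index value" pairs inside braces, for example {0 1.5, 3 2}. A minimal sketch that expands such a row into a dense array follows. The class and method names are hypothetical, and it assumes all attributes are numeric (no quoted strings or nominal values).

```java
// Hypothetical helper for illustration; handles numeric attributes only.
final class SparseArffRow {

    // Expands a sparse row such as "{0 1.5, 3 2}" into a dense array of
    // numAttributes values; attributes not listed in the row stay at 0.
    static double[] toDense(String row, int numAttributes) {
        String body = row.trim();
        if (!body.startsWith("{") || !body.endsWith("}")) {
            throw new IllegalArgumentException("not a sparse row: " + row);
        }
        double[] dense = new double[numAttributes]; // initialised to zeros
        body = body.substring(1, body.length() - 1).trim();
        if (body.isEmpty()) {
            return dense; // "{}" means every value is zero
        }
        for (String pair : body.split(",")) {
            // Each pair is "<zero-based attribute index> <value>".
            String[] parts = pair.trim().split("\\s+", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("malformed pair: " + pair);
            }
            dense[Integer.parseInt(parts[0])] = Double.parseDouble(parts[1]);
        }
        return dense;
    }
}
```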

The file format is not that important to me (and of course it should be an interchangeable strategy); all I want is to get started on a data access API. At this point I don't care which matrix or other structure will be used for speedy access; I want the API used to load the matrix with data: a seekable instance enumerator working straight off the file system, with an InstanceReader and an InstanceWriter. It would allow me to get started with preprocessing filters (resampling, discretization, etc.).
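
A rough sketch of what such an API could look like, to make the idea concrete. Only the names InstanceReader and InstanceWriter come from the text above; every method signature, the in-memory store (a stand-in for a real file-backed implementation), and the Resample helper are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical seekable enumerator over instances (rows of numeric values).
interface InstanceReader {
    boolean hasNext();
    double[] next();
    void seek(long index);  // jump so that next() returns the instance at index
    long position();        // index of the instance that next() would return
}

// Hypothetical sink for instances; the storage format is the implementation's choice.
interface InstanceWriter {
    void write(double[] instance);
}

// In-memory stand-in for a file-backed store, so the sketch runs without any I/O.
final class InMemoryInstanceStore implements InstanceWriter {
    private final List<double[]> instances = new ArrayList<>();

    @Override
    public void write(double[] instance) {
        instances.add(instance.clone()); // defensive copy
    }

    InstanceReader reader() {
        return new InstanceReader() {
            private int pos = 0;

            @Override
            public boolean hasNext() {
                return pos < instances.size();
            }

            @Override
            public double[] next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return instances.get(pos++).clone();
            }

            @Override
            public void seek(long index) {
                if (index < 0 || index > instances.size()) {
                    throw new IndexOutOfBoundsException("index " + index);
                }
                pos = (int) index;
            }

            @Override
            public long position() {
                return pos;
            }
        };
    }
}

// Example preprocessing filter written purely against the two interfaces:
// keeps every n-th instance, starting with the first.
final class Resample {
    static void everyNth(InstanceReader in, InstanceWriter out, int n) {
        for (long i = 0; in.hasNext(); i++) {
            double[] instance = in.next();
            if (i % n == 0) {
                out.write(instance);
            }
        }
    }
}
```

Because the filter only sees the reader and writer interfaces, the underlying format (tab-delimited, ARFF, or something else) stays an interchangeable strategy.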



  karl






On 2/7/08 10:15 AM, "Karl Wettin" <[EMAIL PROTECTED]> wrote:


On 5 Feb 2008, at 00:51, Grant Ingersoll wrote:

I haven't used Weka much, is ARFF

I have never used anything but Weka much; are there any alternatives
to ARFF?

