Re: File format question about Random forest.

Xiaobo Gu Fri, 15 Jul 2011 06:11:05 -0700

But if we use CSV files, how can we generate descriptors for datasets?

Cheers


Xiaobo Gu

On Thu, Jul 14, 2011 at 1:27 AM, deneche abdelhakim <[email protected]> wrote:
> I guess yes, as long as you don't use single or double quotes to enclose
> the fields.
>
> On Wed, Jul 13, 2011 at 2:58 PM, Xiaobo Gu <[email protected]> wrote:
>
>> So for simple datasets that only have numeric columns and character-label
>> category columns (labels without blanks), can we just use CSV tools to
>> save them as a standard CSV file without a header?
>>
>>
>> On Wed, Jul 13, 2011 at 3:53 AM, deneche abdelhakim <[email protected]> wrote:
>> > The current implementation doesn't support the ARFF format out of the
>> > box. As described in the wiki, you need to remove the header of the file
>> > and leave only the data. Actually, this implementation is fully
>> > compatible with UCI's datasets, which are comma-separated text files.
>> > You'll also need to call the dataset description tool (see the wiki) to
>> > generate a proper description file (it contains the nature of each
>> > attribute: Numerical or Categorical).
>> >
>> > Yes, you can use BuildForest and TestForest to generate and use Random
>> > Forest models from the command line.
>> >
>> > On Tue, Jul 12, 2011 at 2:19 PM, Xiaobo Gu <[email protected]> wrote:
>> >
>> >> Hi,
>> >>
>> >> The Random Forest partial implementation described at
>> >> https://cwiki.apache.org/confluence/display/MAHOUT/Partial+Implementation
>> >> uses the ARFF file format. Is ARFF the only supported file format when
>> >> using the BuildForest and TestForest programs, and are BuildForest and
>> >> TestForest the official tools for building Random Forest models
>> >> from the command line?
>> >>
>> >> Regards,
>> >>
>> >> Xiaobo Gu
>> >>
>> >
>>
>
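
For anyone preparing data along the lines discussed in this thread, here is a rough Python sketch of the two preprocessing steps mentioned: dropping the ARFF header so only the data rows remain, and guessing whether each column is numerical or categorical. This is not Mahout's dataset description tool and does not write its file format. The function names, the "N"/"C"/"L" letters, and the choice of the last column as the label are assumptions made for illustration. It also assumes every row has the same number of fields and that no field is quoted.

```python
import csv


def strip_arff_header(lines):
    """Return only the data rows of an ARFF file (the lines after @data)."""
    data_rows = []
    in_data = False
    for line in lines:
        stripped = line.strip()
        if not in_data:
            # Everything up to and including the @data marker is header.
            if stripped.lower().startswith("@data"):
                in_data = True
            continue
        # Skip blank lines and ARFF comment lines (they start with %).
        if stripped and not stripped.startswith("%"):
            data_rows.append(stripped)
    return data_rows


def _is_number(value):
    """True if the field parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def infer_attribute_types(rows, label_index=-1):
    """Guess a type per column: 'N' numerical, 'C' categorical, 'L' label.

    The letters are illustrative shorthand, not a confirmed file format.
    """
    parsed = list(csv.reader(rows))
    if not parsed:
        return []
    width = len(parsed[0])
    label = label_index % width  # allow negative indexes such as -1
    types = []
    for col in range(width):
        if col == label:
            types.append("L")
        elif all(_is_number(row[col]) for row in parsed):
            types.append("N")
        else:
            types.append("C")
    return types
```

The resulting list is only a starting point for the arguments you would hand to the real description tool; check the wiki for the syntax it actually expects.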
