On Mon, Jul 29, 2013 at 12:19 AM, Ross Boucher <[email protected]> wrote:
> Interesting, I've been using DictVectorizer (and one hot coded categorical
> data) with Random Forests and getting decent results. Is this just
> coincidental, and will I see better results if I combine the categorical
> data into a single column?
>
>
Can you give me a short example of using DictVectorizer with RandomForest?
What I do is read a CSV file line by line:
    train = csv_io.read_data(train_file)
    # set the training responses
    self.target = [x[0] for x in train]
    # set the training features
    self.train = [x[1:] for x in train]
and csv_io.read_data is:
    for line in f:
        sample = []
        line = line.strip().split(",")
        for x in line:
            try:
                sample.append(float(x))
            except ValueError:
                sample.append(str(x))
        samples.append(sample)
        #sample = [float(x) for x in line]
        #samples.append(sample)
    return samples
How would I use DictVectorizer for the string values above?
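From the docs, my guess is that each row has to become a dict first, so that DictVectorizer can one-hot encode the string values and pass the floats through. A minimal sketch of that idea follows. The column names (f0, f1, ...), the toy rows standing in for my CSV, the int cast on the response, and the forest settings are all assumptions for illustration, not something taken from my real code:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction import DictVectorizer

# Toy rows in the shape read_data returns: response first, then features.
# Numeric fields are floats, everything else stays a string.
train = [
    [1.0, 35.0, "red", "small"],
    [0.0, 22.0, "blue", "large"],
    [1.0, 41.0, "red", "large"],
    [0.0, 19.0, "green", "small"],
]

# Assumption: the response is a class label, so cast it to int.
target = [int(row[0]) for row in train]
features = [row[1:] for row in train]


def rows_to_dicts(rows):
    """Turn each feature list into a dict keyed by a made-up column name."""
    return [{"f%d" % i: value for i, value in enumerate(row)} for row in rows]


# String values become one indicator column per (name, value) pair;
# numeric values keep a single column each.
vectorizer = DictVectorizer(sparse=False)
X = vectorizer.fit_transform(rows_to_dicts(features))

forest = RandomForestClassifier(n_estimators=10, random_state=0)
forest.fit(X, target)

# New rows go through transform (not fit_transform) to reuse the same columns.
new_rows = [[30.0, "blue", "small"]]
predictions = forest.predict(vectorizer.transform(rows_to_dicts(new_rows)))
```

If that is right, a string value that never appeared in the training data would just be ignored by transform, which I would need to keep in mind for the test file.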
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general