I used several datasets, each with between 2200 and 3500 distinct words in the
tf matrix, for training the LDA. The data are lemmatized before being passed to
CountVectorizer.
From: Joel Nothman [joel.noth...@gmail.com]
Sent: Tuesday, 26 January 2016 23:35
To: scikit-lea
How many distinct words are in your dataset?
On 27 January 2016 at 00:21, Rockenkamm, Christian <
c.rockenk...@stud.uni-goettingen.de> wrote:
> Hello,
>
>
> I have a question concerning Latent Dirichlet Allocation. The results I
> get from using it are a bit confusing.
>
> At first I use about
Hi Christian.
Can you provide the data and code to reproduce?
Best,
Andy
On 01/26/2016 08:21 AM, Rockenkamm, Christian wrote:
Hello,
I have a question concerning Latent Dirichlet Allocation. The
results I get from using it are a bit confusing.
At first I use about 3000 documents. In the
On 01/26/2016 07:17 AM, Panos Louridas wrote:
> Hello,
>
> A few points on the documentation / examples in the scikit-learn site:
>
> * In the example that plots the decision surface of a decision tree on the
> Iris dataset
> (http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html#exa
It turns out that using "python setup.py develop" while using conda did
exactly what I wanted (it created the build files in two separate
directories). I was convinced I had tried that earlier, sorry about that.
2016-01-26 14:45 GMT+01:00 Jacob Vanderplas:
> > I don't see an easy way to maintain the
> I don't see an easy way to maintain the changes in two different
directories.
If both directories are Git repositories linked to a common remote, you
could commit the changes on a branch and then sync them that way.
Jake VanderPlas
Senior Data Science Fellow
Director of Research in Physical
Hello,
I have a question concerning Latent Dirichlet Allocation. The results I get
from using it are a bit confusing.
At first I use about 3000 documents. In the preparation with the
CountVectorizer I use the following parameters: max_df=0.95 and min_df=0.05.
For the LDA fit I use the batch le
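For readers following the thread, here is a rough sketch of how the quoted max_df=0.95 and min_df=0.05 settings shrink the vocabulary, which is what determines the number of distinct words in the tf matrix. It relies on the documented behaviour that CountVectorizer treats float values of these parameters as proportions of documents. It is plain Python, not scikit-learn's own code: the function name prune_vocabulary, the whitespace tokenizer and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter


def prune_vocabulary(docs, min_df=0.05, max_df=0.95):
    """Return the sorted terms whose document frequency lies within the bounds.

    min_df and max_df are proportions of the number of documents, as they
    are when floats are passed to CountVectorizer.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs at least once.
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    df = Counter(term for doc in docs for term in set(doc.lower().split()))
    low, high = min_df * n_docs, max_df * n_docs
    return sorted(term for term, count in df.items() if low <= count <= high)


# Hypothetical toy corpus, not the data discussed in the thread.
docs = [
    "the cat sat",
    "the dog sat",
    "the bird flew",
    "the fish swam",
]

default_vocab = prune_vocabulary(docs)  # thresholds quoted in the thread
strict_vocab = prune_vocabulary(docs, min_df=0.5, max_df=0.95)
print(len(default_vocab), default_vocab)
print(len(strict_vocab), strict_vocab)
```

With min_df=0.05 a term has to appear in at least 5% of the documents, so on about 3000 documents anything occurring in fewer than 150 of them is dropped before the LDA fit sees it.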
Well, I did not use Travis because I thought it was a little cumbersome to
have to push every little change I made to my GitHub repo; plus, a Travis
build takes ~15 min. I was looking for a way to keep the binaries for both
versions with the same source directory (I don't edit Cython files for
the
Hello,
A few points on the documentation / examples in the scikit-learn site:
* In the example that plots the decision surface of a decision tree on the Iris
dataset
(http://scikit-learn.org/stable/auto_examples/tree/plot_iris.html#example-tree-plot-iris-py),
the dataset is initially shuffled