This is indeed interesting. I didn't know that there were such big
differences between these approaches. I split the 18 documents into
sub-documents of 5 paragraphs each, so that I got around 10k of these
sub-documents. Now, scikit-learn and gensim deliver much better results,
quite similar to those f
I'm actually surprised that Gibbs sampling gave useful results with so
little data.
And splitting the documents results in very different data. It has a lot
more information.
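
For anyone following along, here is a minimal sketch of the kind of
splitting described above. It is not the code that was actually used: the
function name split_into_subdocuments is made up for illustration, and it
assumes that paragraphs are separated by blank lines.

```python
import re


def split_into_subdocuments(text, paragraphs_per_chunk=5):
    """Split one document into sub-documents of N paragraphs each.

    Hypothetical helper for illustration only.
    """
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Group consecutive paragraphs; the last chunk may be shorter.
    return [
        "\n\n".join(paragraphs[i:i + paragraphs_per_chunk])
        for i in range(0, len(paragraphs), paragraphs_per_chunk)
    ]


# Made-up sample: 12 paragraphs give chunks of 5, 5 and 2 paragraphs.
sample = "\n\n".join(f"Paragraph {n}." for n in range(1, 13))
chunks = split_into_subdocuments(sample)
```

Each chunk can then be passed to the vectorizer as its own document, so the
topic model sees many short documents instead of a handful of long ones.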
How many topics did you use?
Also: PR for docs welcome!
On 09/19/2017 04:26 AM, Markus Konrad wrote: