Re: [Scikit-learn-general] Latent Semantic Indexing (LSI) with Sklearn

2013-01-03 Thread Karsten Jeschkies
Hi, I used gensim in combination with Scikit-learn for two months for my master's thesis. It is quite simple and straightforward. 1. Learn your model on a corpus such as Wikipedia or Reuters-21578 (offline step) 2. Convert your document(s) to a feature space with your model. The result is a numpy

Re: [Scikit-learn-general] Latent Semantic Indexing (LSI) with Sklearn

2013-01-03 Thread Andreas Mueller
On 01/04/2013 12:41 AM, Vlad Niculae wrote: > Maybe my mind is not in its right place but how is that different from > using the PCA transformer? > Not sure about my mind either but Lars talked about the SVD of X, not the covariance matrix. [is that right?] --
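
The distinction raised in this reply can be checked numerically. The following is a minimal NumPy sketch, not code from the thread: it assumes a small random matrix `X` with non-zero column means (a made-up example) and compares the SVD of `X` itself with the eigendecomposition of its covariance matrix, which is what PCA is usually understood to compute.

```python
import numpy as np

# Hypothetical toy data: 6 samples, 4 features, with clearly non-zero means.
rng = np.random.default_rng(0)
X = rng.random((6, 4)) + 5.0

# SVD of X itself (no centering), as in the LSI suggestion.
_, s_raw, Vt_raw = np.linalg.svd(X, full_matrices=False)

# PCA view: eigenvalues of the covariance matrix of X, sorted descending.
cov = np.cov(X, rowvar=False)            # uses the n - 1 denominator
eigvals = np.linalg.eigvalsh(cov)[::-1]

# The same eigenvalues come from the SVD of the *centered* matrix.
Xc = X - X.mean(axis=0)
_, s_cen, Vt_cen = np.linalg.svd(Xc, full_matrices=False)
pca_var = s_cen ** 2 / (X.shape[0] - 1)

print("covariance eigenvalues:", eigvals)
print("from centered SVD:     ", pca_var)
print("singular values of raw X:", s_raw)
```

In this sketch the two approaches agree only after the column means are subtracted, so the difference comes down to centering. For sparse term-document matrices, centering would generally destroy sparsity, which is one plausible reason to work with the SVD of X directly.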

Re: [Scikit-learn-general] Latent Semantic Indexing (LSI) with Sklearn

2013-01-03 Thread Vlad Niculae
Maybe my mind is not in its right place but how is that different from using the PCA transformer? On Thu, Jan 3, 2013 at 10:48 PM, Lars Buitinck wrote: > 2013/1/3 Jack Alan : > > I'm working in document classification and I wonder if there is a way of > > having the feature vector calculated ba

Re: [Scikit-learn-general] Latent Semantic Indexing (LSI) with Sklearn

2013-01-03 Thread Lars Buitinck
2013/1/3 Jack Alan : > I'm working in document classification and I wonder if there is a way of > having the feature vector calculated based on Latent Semantic Indexing (LSI) > instead of tf or tf-idf. As you know with LSI or Latent Dirichlet Allocation > (LDA), semantic features are captured. LSI
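
Going by the follow-up replies, the suggestion here involves an SVD of the document-term matrix X. Below is a minimal NumPy sketch of LSI under that assumption; the tiny count matrix, the choice `k = 2`, and the helper `fold_in` are made up for illustration and are not from the thread.

```python
import numpy as np

# Hypothetical document-term counts: rows are documents, columns are terms.
X = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
], dtype=float)

k = 2  # number of latent dimensions to keep (an arbitrary choice here)

# Thin SVD: X = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# LSI document features: coordinates in the top-k latent space.
docs_lsi = U[:, :k] * s[:k]


def fold_in(x_new):
    """Project new document-term rows onto the same k latent dimensions."""
    return np.atleast_2d(x_new) @ Vt[:k].T


print(docs_lsi.shape)
print(fold_in([1, 1, 0, 0]))
```

The resulting `docs_lsi` array is an ordinary dense NumPy array, so it could be passed to a classifier in place of tf or tf-idf features. In practice a tf-idf weighted sparse matrix and a truncated solver would likely replace the dense `np.linalg.svd` call used here.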

Re: [Scikit-learn-general] Latent Semantic Indexing (LSI) with Sklearn

2013-01-03 Thread Andreas Mueller
Hi Jack. All sklearn estimators work on numpy arrays or sparse matrices. I guess the easiest way would be to just use gensim for the feature extraction and then feed the resulting features into sklearn. Hth, Andy On 01/03/2013 10:02 PM, Jack Alan wrote: Hi all, I'm working in document classi

[Scikit-learn-general] Latent Semantic Indexing (LSI) with Sklearn

2013-01-03 Thread Jack Alan
Hi all, I'm working in document classification and I wonder if there is a way of having the feature vector calculated based on Latent Semantic Indexing (LSI) instead of tf or tf-idf. As you know with LSI or Latent Dirichlet Allocation (LDA), semantic features are captured. I found an online Pytho

[Scikit-learn-general] Book: Shalizi, Advanced Data Analysis from an Elementary Point of View

2013-01-03 Thread denis
Folks, many of you may know this book already: http://www.stat.cmu.edu/~cshalizi/ADAfaEPoV has PDF of a 571-page book draft / course notes for "advanced undergraduate students" at CMU. (http://www.stat.cmu.edu/~cshalizi is witty, http://www.cscs.umich.edu/~crshalizi/weblog amazing.) cheers --

Re: [Scikit-learn-general] a weird problem when I use nibabel and skimage

2013-01-03 Thread soft.join Huang
Oh, I'm sorry :) On Thu, Jan 3, 2013 at 11:04 PM, Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > Hi Lijie, > > Maybe this is the wrong mailing list, and you were wanting to send this > to the skimage mailing list? > > Cheers, > > Gaël > > On Thu, Jan 03, 2013 at 11:03:35PM +0800, s

Re: [Scikit-learn-general] a weird problem when I use nibabel and skimage

2013-01-03 Thread Gael Varoquaux
Hi Lijie, Maybe this is the wrong mailing list, and you were wanting to send this to the skimage mailing list? Cheers, Gaël On Thu, Jan 03, 2013 at 11:03:35PM +0800, soft.join Huang wrote: > Hi, all, >   > I may have found where the problem is. >   > When I load nifti data using nibabel, the da

Re: [Scikit-learn-general] a weird problem when I use nibabel and skimage

2013-01-03 Thread soft.join Huang
Hi, all, I may have found where the problem is. When I load nifti data using nibabel, the data is stored in Fortran-contiguous order, whereas numpy stores data in C-contiguous order by default. The function skimage.morphology.is_local_maximum() would give different results for the two differe
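
The memory-layout difference described in this message can be inspected with NumPy alone. This is a minimal sketch using a made-up array rather than real nifti data, and it does not call nibabel or skimage. Whether converting the layout resolves the reported behaviour is an assumption, not something the message confirms.

```python
import numpy as np

# Hypothetical 3-D volume; NumPy creates arrays in C order by default.
a_c = np.arange(24, dtype=np.int16).reshape(2, 3, 4)

# The same values laid out in Fortran (column-major) order.
a_f = np.asfortranarray(a_c)

print(a_c.flags["C_CONTIGUOUS"], a_c.flags["F_CONTIGUOUS"])
print(a_f.flags["C_CONTIGUOUS"], a_f.flags["F_CONTIGUOUS"])

# Element-wise the two arrays are identical; only the memory layout differs.
same_values = np.array_equal(a_c, a_f)

# Possible workaround (an assumption): force C order before calling code
# that may depend on the layout.
a_fixed = np.ascontiguousarray(a_f)
print(same_values, a_fixed.flags["C_CONTIGUOUS"])
```

Checking `arr.flags` on the loaded array is a quick way to confirm which layout a given loader produced before comparing results.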