Thanks for all your answers. The problem might be that I am using resampled data extracted on the cortical surface as well as coordinates in subcortical ROIs. This is similar to grayordinates in HCP, but it is not HCP data, so I don't really have a way to go back to voxel space. Resampling occurs anyway during preprocessing, so I thought it would be OK.
I have not, per se, correlated the number of neighbors within the radius with the searchlight results, but just displaying it raises concerns about potential bias, as I know classifiers are sensitive to the dimension of the feature space. The distribution of the number of features is really widespread due to the surface constraint: for example, a 6 mm radius gives nfeat from 6 to 118, which definitely has an influence on classification. I will try to find a solution to that.

cheers
basile

On Fri, May 15, 2015 at 10:55 AM, Christopher J Markiewicz <[email protected]> wrote:
> On 05/15/2015 09:25 AM, basile pinsard wrote:
> > Hi MVPA experts,
> >
> > I have a theoretical question that arose from a recent analysis using
> > searchlight (either surface or voxel based):
> >
> > What is the most sensible feature selection strategy between:
> > - a radius with a variable number of features included, which will make
> > the different classifiers be trained on different numbers of dimensions;
> > - a fixed number of closest voxels/surface_nodes that would represent
> > a different surface/volume/spatial_extent depending on the location.
>
> From my reading, the more common is the former. This is probably
> because, without evidence that your results particularly correlate with
> searchlight size, the more interpretable figure is one in which each
> voxel represents a statistic taken over a fixed spatial extent.
>
> A fixed number of voxels (I agree with Jo that one should always use
> voxels; even if you are using surface nodes to define a neighborhood,
> these should be mapped back to voxels to avoid smoothing and resampling)
> is beneficial if you are using an error metric that is sensitive to
> dimensionality, such as mean squared error.
>
> With a surface searchlight of radius 9mm, I get a distribution of
> searchlight sizes (in one subject) that's approximately normal(66, 8).
> I have not found that the cross-validation training error of classifiers
> (linear SVM, mostly) is particularly sensitive to searchlight size. On
> the other hand, attempting to use the same searchlights with regression
> problems produces results that correlate strongly (positively or
> negatively, depending on regression algorithm) with number of voxels.
>
> > I had the examples with surfaces, for which I used a spherical template
> > (similar to 32k surfaces in HCP dataset) transformed into subject space.
> > I computed the number of neighbors for each node with a fixed radius and
> > noted a differential sampling resolution in the brain, which somewhat
> > overlaps with my network of interest (motor), hence my concerns.
>
> Do your preliminary results correlate with searchlight size across
> several regions? That would be my primary indication that this is a
> concern.
>
> > With voxel-based searchlight, depending on masking, voxels on the borders
> > of the mask will have fewer neighbors in a fixed-radius sphere.
>
> Could you smear the mask with your searchlight, i.e. extend it in all
> directions? You'll still be including (presumably) uninformative voxels,
> but at least it won't be dimensionality itself that gets you.
>
> > PyMVPA has only this strategy for now, but I read many papers with a fixed
> > number of features in Searchlight.
> >
> > What do you think?
> >
> > I did an ugly modification to have a temporary fixed feature number
> > (closest) on surface but it should be optimized:
> >
> > https://github.com/bpinsard/PyMVPA/commit/1af58ea8a57882ed57059491c19d83bed43e0bce
>
> If I'm reading this right (I haven't dug into the PyMVPA surface
> searchlight implementation), this is selecting a maximum number of
> surface nodes, and then mapping that to voxels, and will still end up
> with variable numbers of voxels, depending on the density of surface nodes.
>
> What about sorting nodes based on distance, mapping to voxels, and then
> taking the first max_features voxels?
>
> --
> Christopher J Markiewicz
> Ph.D. Candidate, Quantitative Neuroscience Laboratory
> Boston University
>
>
> _______________________________________________
> Pkg-ExpPsy-PyMVPA mailing list
> [email protected]
> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa
>

--
Basile Pinsard
PhD candidate,
Laboratoire d'Imagerie Biomédicale, UMR S 1146 / UMR 7371, Sorbonne Universités, UPMC, INSERM, CNRS
Brain-Cognition-Behaviour Doctoral School, ED3C, UPMC, Sorbonne Universités
Biomedical Sciences Doctoral School, Faculty of Medicine, Université de Montréal
CRIUGM, Université de Montréal
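
P.S. To make the two strategies from this thread concrete, here is a rough sketch in plain Python. It is not taken from PyMVPA or from the commit linked above. The function names, the node_to_voxel list and the toy coordinates are all made up for illustration, and distances are plain Euclidean rather than geodesic distances on the surface. The first function is the fixed-radius neighborhood (variable number of features); the second follows the suggestion of sorting nodes by distance, mapping them to voxels, and keeping the first max_features distinct voxels.

```python
import math


def radius_neighbors(center, coords, radius):
    """Indices of all points within `radius` of `center` (variable count)."""
    return [i for i, p in enumerate(coords) if math.dist(center, p) <= radius]


def fixed_size_voxel_neighborhood(center, node_coords, node_to_voxel, max_features):
    """Sort nodes by distance to `center`, map them to voxels, and keep the
    first `max_features` distinct voxels, closest first."""
    order = sorted(range(len(node_coords)),
                   key=lambda i: math.dist(center, node_coords[i]))
    voxels = []
    seen = set()
    for i in order:
        v = node_to_voxel[i]          # hypothetical node -> voxel index mapping
        if v not in seen:             # several nodes can fall in the same voxel
            seen.add(v)
            voxels.append(v)
            if len(voxels) == max_features:
                break
    return voxels


# Toy data (made up): nodes densely sampled at one end, sparsely at the other.
node_coords = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0),
               (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
node_to_voxel = [0, 0, 1, 2, 3, 4]   # the first two nodes share voxel 0

# Same radius, different number of features depending on local node density.
n_dense = len(radius_neighbors(node_coords[0], node_coords, 2.0))
n_sparse = len(radius_neighbors(node_coords[5], node_coords, 2.0))

# Fixed number of voxels, at the cost of a variable spatial extent.
fixed_dense = fixed_size_voxel_neighborhood(node_coords[0], node_coords, node_to_voxel, 3)
fixed_sparse = fixed_size_voxel_neighborhood(node_coords[5], node_coords, node_to_voxel, 3)

print(n_dense, n_sparse, fixed_dense, fixed_sparse)
```

With the fixed radius, the densely sampled node gets more features than the sparsely sampled one; with the fixed count, both get the same number of voxels, but the sparse neighborhood reaches much further. A real implementation would need the actual surface geometry and node-to-voxel mapping, which this sketch only stands in for.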

