Hello again! I'm still stuck on my analysis using multiple diffusion-derived parameter values for the same voxel. Thanks to your help, I think I've got the dataset together, but I run into an error when attempting the final searchlight approach.
This is what I've got by now for setting up the dataset (defined as a function):

from mvpa2.suite import *  # fmri_dataset, Dataset, hstack, vstack, ...
import numpy as np

def ds_setup(sub_list, param_list, data_path):
    """Set up the dataset: load data, collect attributes etc."""
    # make space:
    dss = []
    modality_all = []
    modality_idx_all = []
    voxel_idx_all = []
    subject_all = []
    targets_all = []
    chunks_all = []
    for sub_index, sub in enumerate(sub_list):
        if sub.startswith('sub1'):
            learn = 1
        elif sub.startswith('sub0'):
            learn = 0
        else:
            raise ValueError("Do not know what to do with %s" % sub)
        # ds_param will contain datasets for different features for the same subject
        ds_param = []
        # idea: collect the feature attributes and glue it all together in the end
        modality_param = []
        modality_idx_param = []
        subject_param = []
        targets_param = []
        chunks_param = []
        voxel_idx_param = []
        for suf_index, suf in enumerate(param_list):
            ds = fmri_dataset(data_path + '/%s/%s_%s.nii.gz' % (sub, sub, suf))
            ds.fa['modality'] = [suf]  # for each feature, set the modality attribute
            ds.fa['modality_index'] = [suf_index]  # this as numeric might come handy for searchlights later
            ds.fa['subject'] = [sub]  # needed for the extend() below
            ds.fa['targets'] = [learn]
            ds.fa['chunks'] = [sub_index]
            modality_param.extend(ds.fa['modality'].value)
            modality_idx_param.extend(ds.fa['modality_index'].value)
            subject_param.extend(ds.fa['subject'].value)
            targets_param.extend(ds.fa['targets'].value)
            chunks_param.extend(ds.fa['chunks'].value)
            voxel_idx_param.extend(ds.fa['voxel_indices'].value)
            ds_param.append(ds)
        ds_subj = hstack(ds_param)
        # collect future feature attributes:
        modality_all.append(modality_param)
        modality_idx_all.append(modality_idx_param)
        voxel_idx_all.append(voxel_idx_param)
        # collect future sample attributes:
        targets_all.append(learn)
        chunks_all.append(sub_index)
        subject_all.append(sub)
        dss.append(ds_subj)
    dsall = vstack(dss)
    # create the actual dataset
    DS = Dataset(dsall)
    DS.fa['modality'] = modality_all[0]
    DS.fa['modality_idx'] = modality_idx_all[0]
    DS.fa['voxel_indices'] = voxel_idx_all[0]
    DS.sa['targets'] = targets_all
    DS.sa['chunks'] = chunks_all
    DS.sa['subject'] = subject_all
    return DS

##################################

This gives me the following: DS is a dataset with dimensions n_subjects * (n_voxels * n_parameters). Importantly, each feature has the corresponding voxel indices associated with it -- since the different parameters are just concatenated, the sequence of these index arrays repeats once per parameter.

I've simulated a small two-group toy dataset with perfect separation (= 100% SNR) and fed the corresponding dataset into an SVM leave-one-out cross-validation like this:

clf = LinearCSVMC()
cvte = CrossValidation(clf, NFoldPartitioner(),
                       errorfx=lambda p, t: np.mean(p == t),
                       enable_ca=['stats'])
cv_results = cvte(DS)

The result is indeed perfect -- classification performance is 100%. :-) So far, so good.

However, when I want to use this dataset for a searchlight approach, I run into an error. Here's the code:

sl = sphere_searchlight(cvte, radius=3, postproc=mean_sample())
res = sl(DS)
print res

ValueError: IndexQueryEngine has insufficient information about the dataset spaces. It is required to specify an ROI generator for each feature space in the dataset (got: ['voxel_indices'], #describable: 125000, #actual features: 375000).

So the code picks up on the fact that there are multiple instances of the same voxel index array within one sample (three parameters = three times the same index array). I wrongly expected that it would take all values at the corresponding indices into account for the MVPA. How do I "glue" these multiple values per sample for the same voxel index together such that the searchlight algorithm works? Am I on the right track at all, or did I get carried away?

Thanks so much for your help!
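To double-check my own understanding of the layout (and of what the "glue" would have to do), here is a plain-NumPy toy sketch -- all sizes and names are made up and no PyMVPA is involved:

```python
import numpy as np

# toy stand-in for my dataset (sizes invented): 4 subjects,
# 5 voxels, 3 diffusion parameters per voxel
n_subjects, n_voxels, n_params = 4, 5, 3

# hstack within subject, vstack across subjects -> one row per subject
dsall = np.vstack([
    np.hstack([np.random.randn(1, n_voxels) for _ in range(n_params)])
    for _ in range(n_subjects)
])

# after hstacking, the voxel-index sequence repeats once per parameter
voxel_ids = np.tile(np.arange(n_voxels), n_params)       # [0..4, 0..4, 0..4]
modality_ids = np.repeat(np.arange(n_params), n_voxels)  # [0 x5, 1 x5, 2 x5]

# a hypothetical searchlight sphere containing voxels 1 and 2
sphere = [1, 2]

# the "glue" I am after: pick every feature whose voxel id lies in the
# sphere, across all modalities at once
roi_mask = np.in1d(voxel_ids, sphere)
roi = dsall[:, roi_mask]

print(dsall.shape)  # (4, 15)
print(roi.shape)    # (4, 6): 2 voxels x 3 parameters per subject
```

If this is the right mental model, then I suppose the query engine would somehow have to be told about the modality axis as well, so that each voxel_indices sphere expands across all three parameters.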
Best,
Ulrike

----- Original Message -----
From: "Yaroslav Halchenko" <deb...@onerussian.com>
To: "pkg-exppsy-pymvpa" <pkg-exppsy-pymvpa@lists.alioth.debian.org>
Sent: Thursday, 5 November, 2015 15:52:55
Subject: Re: [pymvpa] Dataset with multidimensional feature vector per voxel

On Thu, 05 Nov 2015, Ulrike Kuhl wrote:

> Thanks again for your super quick and super helpful reply!
> There is still one thing that I don't quite get at the moment:
> As expected, 'dsall' is a numpy.ndarray of dimensions
> 1*(numberVoxels*numberSubjects*numberParameters) -- so just a really long
> vector.
> Since this is 'only' a vector, it does not have any attributes or other
> information associated with it. This information is still contained in 'dss',
> which is a list of size (numberSubjects*numberParameters) containing the
> respective datasets.
> I don't see right now how to bridge this information to the 'dsall' array.
> After all, 'sphere_searchlight()' should also be called with a dataset, using
> this dataset's coordinates to determine the local neighborhoods, right?
> So I guess I still need the 'glue' to get it all together... ;-)
> Can you give me a hint on that?

my bad -- I overlooked this by-subject construct...
you need to vstack those across subjects while hstacking features within subject:

sub_list = [list of subjects]
param_list = [list of parameters]

dss = []
for sub_index, sub in enumerate(sub_list):
    # yoh: will contain datasets for different features for the same subject
    dss_subj = []
    for suf_index, suf in enumerate(param_list):
        ds = fmri_dataset('/path/to/image/file', mask='/path/to/mask/file')
        # yoh: no need to broadcast -- should do automagically
        ds.fa['modality'] = [suf]  # for each feature, set the modality attribute
        ds.fa['modality_index'] = [suf_index]  # this as numeric might come handy for searchlights later
        dss_subj.append(ds)
    ds_subj = hstack(dss_subj)
    if sub.startswith('L'):
        learn = 1
    elif sub.startswith('N'):
        learn = 0
    else:
        # yoh: make it explicit, otherwise you would assign the previous learn
        # to the next one not L or N
        raise ValueError("Do not know what to do with %s" % sub)
    # yoh: seems it was incorrectly indented into elif
    ds_subj.sa['targets'] = [learn]
    ds_subj.sa['chunks'] = [sub_index]
    dss.append(ds_subj)
dsall = vstack(dss)

####################################

> Cheers,
> Ulrike

> ----- Original Message -----
> From: "Yaroslav Halchenko" <deb...@onerussian.com>
> To: "pkg-exppsy-pymvpa" <pkg-exppsy-pymvpa@lists.alioth.debian.org>
> Sent: Thursday, 5 November, 2015 15:14:24
> Subject: Re: [pymvpa] Dataset with multidimensional feature vector per voxel

> On Thu, 05 Nov 2015, Ulrike Kuhl wrote:
> > Dear Yaroslav,
> > thanks a lot for your reply.
> > With your snippet it was really easy for me to set up the matrix as you
> > described it.
:-)

> > For those interested, this is the code that works for me:
> >
> > sub_list = [list of subjects]
> > param_list = [list of parameters]
> > dss = []
> > for sub_index, sub in enumerate(sub_list):
> >     for suf_index, suf in enumerate(param_list):
> >         ds = fmri_dataset('/path/to/image/file', mask='/path/to/mask/file')
> >         ds.fa['modality'] = [suf] * ds.nfeatures  # for each feature, set the modality attribute
> >         ds.fa['modality_index'] = [suf_index] * ds.nfeatures  # this as numeric might come handy for searchlights later
> >         if sub.startswith('L'):
> >             learn = 1
> >         elif sub.startswith('N'):
> >             learn = 0
> >         ds.sa['targets'] = [learn]
> >         ds.sa['chunks'] = [sub_index]
> >         dss.append(ds)
> > dsall = hstack(dss)

> Here is my tune up with # yoh: comments

> sub_list = [list of subjects]
> param_list = [list of parameters]
> dss = []
> for sub_index, sub in enumerate(sub_list):
>     for suf_index, suf in enumerate(param_list):
>         ds = fmri_dataset('/path/to/image/file', mask='/path/to/mask/file')
>         # yoh: no need to broadcast -- should do automagically
>         ds.fa['modality'] = [suf]  # for each feature, set the modality attribute
>         ds.fa['modality_index'] = [suf_index]  # this as numeric might come handy for searchlights later
>         if sub.startswith('L'):
>             learn = 1
>         elif sub.startswith('N'):
>             learn = 0
>         else:
>             # yoh: make it explicit, otherwise you would assign the previous
>             # learn to the next one not L or N
>             raise ValueError("Do not know what to do with %s" % sub)
>         # yoh: seems it was incorrectly indented into elif
>         ds.sa['targets'] = [learn]
>         ds.sa['chunks'] = [sub_index]
>         dss.append(ds)
> dsall = hstack(dss)

> cheers
--
Yaroslav O.
Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik

_______________________________________________
Pkg-ExpPsy-PyMVPA mailing list
Pkg-ExpPsy-PyMVPA@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa

--
Max Planck Institute for Human Cognitive and Brain Sciences
Department of Neuropsychology (A219)
Stephanstraße 1a
04103 Leipzig
Phone: +49 (0) 341 9940 2625
Mail: k...@cbs.mpg.de
Internet: http://www.cbs.mpg.de/staff/kuhl-12160