Apologies if multiple copies are received. Call for Papers:

----------------------------------------------------------------------
NIPS 2008 WORKSHOP on LEARNING FROM MULTIPLE SOURCES
http://web.mac.com/davidrh/LMSworkshop08/
http://nips.cc/
----------------------------------------------------------------------

BACKGROUND

`While the machine learning community has primarily focused on`

`analysing the output of a single data source, there have been relatively`

`few attempts to develop a general framework, or heuristics, for`

`analysing several data sources in terms of a shared dependency`

`structure. Learning from multiple data sources (or alternatively, the`

`data fusion problem) is a timely research area. Due to the increasing`

`availability and sophistication of data recording techniques and`

`advances in data analysis algorithms, there exist many scenarios in`

`which it is necessary to model multiple, related data sources, e.g. in`

`fields such as bioinformatics, multimodal signal processing,`

`information retrieval, etc. This research area is`

`inspired by the human brain's ability to integrate five different`

`sensory input streams into a coherent representation of its environment.`

`The open question is to find approaches to analyse data which consists`

`of more than one set of observations (or view) of the same phenomenon.`

`In general, existing methods use a discriminative approach, where a`

`set of features for each data set is found in order to explicitly`

`optimise some dependency criterion. Existing approaches include`

`canonical correlation analysis (Hotelling, 1936), a standard`

`statistical technique for modeling two data sources, and its multiset`

`variation (Kettenring, 1971) which find linearly correlated features`

`between data sets, and kernel variants (Lai and Fyfe, 2000; Bach and`

`Jordan, 2002; Hardoon et al., 2004) and approaches that optimise the`

`mutual information between extracted features (Becker, 1996; Chechik`

`et al., 2003). However, discriminative approaches may be ad hoc,`

`may require regularisation to ensure that erroneous shared features are`

`not discovered, and make it difficult to incorporate prior knowledge about`

`the shared information. Generative probabilistic approaches address`

`this problem by jointly modeling each data stream as a sum of a shared`

`component and a 'private' component that models the within-set`

`variation (Bach and Jordan, 2005; Leen and Fyfe, 2006; Klami and`

`Kaski, 2006).`

`These approaches assume a simple relationship between (two) data`

`sources, i.e. a so-called 'flat' data structure in which the data`

`consist of N independent pairs of related data variables, whereas in`

`practice, related data sources may exhibit extremely complex`

`co-variation (for instance, audio and visual streams related to the same`

`video). A potential solution to this problem could be a fully`

`probabilistic approach, which could be used to impose structured`

`variation within and between data sources. Additional methodological`

`challenges include determining what is the 'useful' information we are`

`trying to learn from the multiple sources and building models for`

`predicting one data source given the others. As well as the`

`unsupervised learning of multiple data sources detailed above, there`

`is the closely related problem of multitask learning (Bickel et al.,`

`2008), or transfer learning, where a task is learned from other`

`related tasks.`
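As a rough illustration of two ideas mentioned above (a shared-plus-private generative view of two data sources, and linear canonical correlation analysis), the following Python sketch draws two views from one shared latent variable and recovers their first canonical correlation. It is a minimal sketch under stated assumptions, not the method of any of the cited papers: the function names `make_two_views` and `linear_cca`, the loading values, the isotropic Gaussian noise standing in for the 'private' component, and the small ridge term `reg` are all illustrative choices.

```python
import numpy as np


def make_two_views(n=2000, noise=0.5, seed=0):
    """Draw two views that share one latent variable z (illustrative model).

    Each view is a linear function of z plus independent Gaussian noise,
    which stands in for the view-specific ('private') variation.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, 1))              # shared component
    A = np.array([[1.0, -0.5, 0.3]])             # hypothetical loadings, view 1
    B = np.array([[0.8, 0.2, -0.6, 0.4]])        # hypothetical loadings, view 2
    X = z @ A + noise * rng.standard_normal((n, 3))
    Y = z @ B + noise * rng.standard_normal((n, 4))
    return X, Y, z


def linear_cca(X, Y, reg=1e-6):
    """Return the first canonical correlation and projection directions."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariances; a small ridge term keeps them well conditioned.
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)
    # Whiten each view with its Cholesky factor: T = Lx^{-1} Cxy Ly^{-T}.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    T = np.linalg.solve(Lx, Cxy)
    T = np.linalg.solve(Ly, T.T).T
    # Singular values of T are the canonical correlations.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    # Map the leading singular vectors back to the original coordinates.
    wx = np.linalg.solve(Lx.T, U[:, 0])
    wy = np.linalg.solve(Ly.T, Vt[0])
    return s[0], wx, wy


X, Y, z = make_two_views()
rho, wx, wy = linear_cca(X, Y)
print("first canonical correlation:", rho)
```

In this toy setting the projection `X @ wx` should track the shared variable `z` up to sign and scale. With real data the ridge term would typically need tuning, which is the regularisation issue raised above.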

WORKSHOP

`The aim of the workshop is to promote discussion amongst leading`

`machine learning and applied researchers about learning from multiple,`

`related sources of data, with a focus on both methodological issues`

`and applied research problems.`

Topics of the workshop include (but are not limited to):

`- unsupervised learning (generative / discriminative modeling) of`

`multiple related data sources`

- canonical correlation analysis-type methods

`- data fusion for real world applications, such as bioinformatics,`

`sensor networks, multimodal signal processing, information retrieval`

- multitask / transfer learning

- multiview learning

INVITED SPEAKERS

Prof. Michael Jordan
University of California, Berkeley
http://www.cs.berkeley.edu/~jordan/

Dr. Francis Bach
École normale supérieure
http://www.di.ens.fr/~fbach/

Dr. Tobias Scheffer
Max-Planck-Institut für Informatik
http://www.mpi-inf.mpg.de/~scheffer/

ORGANISERS

David R. Hardoon (University College London)
Gayle Leen (Helsinki University of Technology)
Samuel Kaski (Helsinki University of Technology)
John Shawe-Taylor (University College London)

PROGRAM COMMITTEE

Andreas Argyriou (University College London)
Tom Dieithe (University College London)
Colin Fyfe (University of the West of Scotland)
Jaakko Peltonen (Helsinki University of Technology)

SUBMISSIONS

`We invite the submission of high quality extended abstracts (2 to 4`

`pages) in the NIPS style http://nips.cc/PaperInformation/StyleFiles.`

`Abstracts should be sent (in .pdf/.ps) to [EMAIL PROTECTED], [EMAIL PROTECTED].`

`A selection of the submitted abstracts will be accepted as either an`

`oral presentation or poster presentation. The best abstracts will be`

`considered for extended versions in the workshop proceedings, and`

`possibly as a special issue of a journal.`

IMPORTANT DATES

24 Oct 08 Submission deadline for extended abstracts
28 Oct 08 Notification of acceptance
13 Dec 08 Workshop at NIPS 08, Whistler, Canada

REFERENCES

`BACH, F.R., & JORDAN, M.I. 2002. Kernel Independent Component`

`Analysis. Journal of Machine Learning Research, 3, 1-48.`

`BACH, F.R., & JORDAN, M.I. 2005. A Probabilistic Interpretation of`

`Canonical Correlation Analysis. Tech. rept. 688. Dept of Statistics,`

`University of California.`

`BECKER, S. 1996. Mutual Information Maximization: models of cortical`

`self-organisation. Network: Computation in Neural Systems, 7, 7-31.`

`BICKEL, S., BOGOJESKA, J., LENGAUER, T., & SCHEFFER, T. 2008. Multi-task`

`learning for HIV therapy screening. ICML 2008.`

`CHECHIK, G., GLOBERSON, A., TISHBY, N., & WEISS, Y. 2003. Information`

`Bottleneck for Gaussian variables. Pages 1213-1220 of: THRUN, S.,`

`SAUL, L.K., & SCHÖLKOPF, B. (eds), Advances in Neural Information`

`Processing Systems, vol. 16.`

`HARDOON, D. R., SZEDMAK, S., & SHAWE-TAYLOR, J. 2004. Canonical`

`Correlation Analysis: An Overview with Application to Learning`

`Methods. Neural Computation, 16(12), 2639-2664.`

`HOTELLING, H. 1936. Relations between two sets of variates.`

`Biometrika, 28, 321-377.`

`KETTENRING, J. R. 1971. Canonical analysis of several sets of`

`variables. Biometrika, 58(3), 433-451.`

`KLAMI, A., & KASKI, S. 2006. Generative models that discover`

`dependencies between two data sets. Pages 123-128 of: MCLOONE, S.,`

`ADALI, T., LARSEN, J., HULLE, M. VAN, ROGERS, A., & DOUGLAS, S.C.`

`(eds), Machine Learning for Signal Processing XVI. IEEE.`

`LAI, P. L., & FYFE, C. 2000. Kernel and Nonlinear Canonical`

`Correlation Analysis. International Journal of Neural Systems, 10(5),`

`365-377.`

`LEEN, G., & FYFE, C. 2006. A Gaussian Process Latent Variable Model`

`Formulation of Canonical Correlation Analysis. Pages 413-418 of:`

`Proceedings of the 14th European Symposium on Artificial Neural`

`Networks (ESANN).`
