*Post-doc position: Information retrieval for medical scientific publications*

*Advisors:*

- M. Constant (U. of Lorraine). Website: https://perso.atilf.fr/mconstant/

- M. Clausel (U. of Lorraine). Website: https://sites.google.com/site/marianneclausel/

- R.S. Stoica (U. of Lorraine). Website: https://sites.google.com/site/radustefanstoica/


*Other partners of the project:* C. Francois (INIST), P. Oudet (Cancéropôle Est), F. Schaffner (Cancéropôle Est), N. Thouvenin (INIST).

*Location:* University of Lorraine, Nancy (France)

*Keywords:* Natural language processing, word embeddings, biomedical text mining, graph matching


*Context:*

The Cancéropôle Est is one of the 7 Cancéropôles created by the first national cancer plan in 2003. Its mission is to organize, coordinate, and strengthen cancer research in partnership with academic and clinical institutions, bringing together researchers, healthcare professionals, industry partners and patients.


The aim of the project is to map the scientific research in oncology in the two French regions Grand Est and Bourgogne-Franche-Comté, using the full text of the scientific publications of each research team in the two regions.


*Description of the position:*

This position is funded by AMIES, University of Lorraine and Cancéropôle Est. With this position, we would like to use text mining techniques to extract characteristics related to the scientific content of the publications of each research team in Grand Est and Bourgogne-Franche-Comté.


The recruited person will work on the following points:

- Preprocessing of the data. The data will be provided by the Cancéropôle Est and will consist of several full texts in XML or PDF format.

- Learning of an oncology embedding (see e.g. [1]). INIST will provide training data to learn the embedding and the ontology.

- Extraction of characteristics related to the scientific content of publications for each research team.

- Combination of these characteristics with the collaboration graph of each team (see e.g. [2]) to provide general characteristics for each team.

- Integration in a visualization tool.


The recruited person will benefit from the expertise of Cancéropôle Est, INIST and University of Lorraine in text mining and statistical learning.


We would ideally like to recruit an 11-month post-doc with the following preferred skills:

- Knowledgeable in natural language processing, text mining and word embeddings

- Knowledgeable in machine learning

- Good programming skills in Python (classical NLP libraries, scikit-learn, PyTorch and/or TensorFlow)

- Very good English (understanding and writing)


Candidates should send a CV, the names of two referees and a cover letter to the researchers mentioned above ([email protected], [email protected], [email protected]). The selected candidates will be interviewed in February for an expected start in March/April 2019.


*Bibliography*

[1] J. Lee et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Ed. Jonathan Wren. Bioinformatics (2019).

[2] Q. Laporte-Chabasse et al. Morpho-statistical description of networks through graph modelling and Bayesian inference. Preprint (2019).

