Apologies for multiple copies of this message.
Please circulate this CFP among your colleagues and students.
----------------------------------------------------------
Call for Papers
DWA 2016
1st International Workshop on Data Wrangling Automation
Barcelona, Spain, December 12, 2016
(satellite workshop of ICDM 2016)
http://www.dsic.upv.es/~flip/DWA2016/
-----------------------------------------------------------
- Aims and Scope -
It is well known that a great proportion of the time devoted to data
mining and, especially, data science projects is spent on data
acquisition, integration, transformation, cleansing and other highly
tedious tasks. These tasks are tedious basically because they are
repetitive and, hence, automatable. As a consequence, progress in the
automation of this process can lead to a dramatic reduction of the cost
and duration of data-oriented projects. Recently, inductive programming
in general (and the learning of declarative rules and programs from a
few user interaction examples in particular) has shown a large potential
for this automation. The release of FlashFill
(http://www.lifehacker.com.au/2012/07/excel-flash-fill-is-a-brilliant-time-saver/)
as a plug-in inductive programming tool for Microsoft Excel and
ConvertFrom-String
(https://www.sepago.com/blog/2015/01/16/powershell-convertfrom-string-the-new-way-of-extracting-data)
as a PowerShell command on Windows 10 are impressive demonstrations that
inductive programming research has matured to the point where commercial
applications have become feasible.
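To make "learning programs from a few examples" concrete, the following is a minimal sketch of programming by example. It is not how FlashFill or ConvertFrom-String work internally; the tiny search space (split on a delimiter, pick a token, change case) and all names in it are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical mini-DSL: split on a delimiter, pick one token, apply a case function.
DELIMITERS = [" ", ",", "-", "@", "."]
INDICES = [0, 1, 2, -1]
CASE_FUNCS = {"identity": lambda s: s, "upper": str.upper, "lower": str.lower}


def make_program(delim, index, case_name):
    """Build a callable for one point in the DSL search space."""
    case = CASE_FUNCS[case_name]

    def program(text):
        tokens = text.split(delim)
        return case(tokens[index])  # raises IndexError if the token is missing

    program.description = f"split on {delim!r}, take token {index}, apply {case_name}"
    return program


def synthesize(examples):
    """Return the first DSL program consistent with every (input, output) pair."""
    for delim, index, case_name in product(DELIMITERS, INDICES, CASE_FUNCS):
        program = make_program(delim, index, case_name)
        try:
            if all(program(src) == dst for src, dst in examples):
                return program
        except IndexError:
            continue  # this candidate does not apply to one of the inputs
    return None  # no program in the search space explains the examples


# Two user-provided examples are enough to pin down a program in this small space.
examples = [("Ada Lovelace", "LOVELACE"), ("Alan Turing", "TURING")]
program = synthesize(examples)
result = program("Grace Hopper")
```

Here the search settles on "split on a space, take the second token, uppercase it", so the learned program maps "Grace Hopper" to "HOPPER". Real systems use far richer languages and ranking strategies; exhaustive enumeration is used here only because the space is tiny.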
The aim of this workshop is to gather practitioners and researchers
around the use of inductive programming techniques, programming by
example and other learning techniques to automate the data wrangling
process.
We welcome regular papers, demo papers about benchmarks or tools, and
position papers, and encourage discussions over a broad list of topics
(not exhaustive):
* Automation applied to data cleaning, data transformation, and data
acquisition.
* Visual interfaces to accelerate the automation of data wrangling.
* Domain-specific languages for data wrangling vs general-purpose
languages.
* Explanation of data wrangling rules in natural language.
* Automation in ETL (Extraction/Transformation/Load) tools.
* Learning actionable rules automating other parts of the KDD process:
model evaluation and deployment.
* Abstraction mechanisms from inductive programming for metadata
creation and handling.
* Data wrangling showcases.
Apart from the technical sessions, a demo session and the direct
involvement of companies such as Microsoft and BigML
(https://bigml.com/team), we are planning to have a panel on "Data
Wrangling Automation Challenges", a lively discussion on inductive
programming tools vs writing scripts.
- Invited Speakers -
(To be announced)
- Submission -
We solicit submissions reporting on:
A- Original research contributions
B- Applications and experiences
C- Surveys, comparisons, and state-of-the-art reports
D- Tool papers
E- Position papers related to the topics mentioned above
F- Work-in-progress papers
Submitted papers should be limited to a maximum of eight (8) pages, in
the IEEE 2-column format, including the bibliography and any possible
appendices. Submissions longer than 8 pages will be rejected without a
review. All papers must be formatted according to the IEEE Computer
Society proceedings manuscript style, following IEEE ICDM 2016
submission guidelines.
Manuscripts must be submitted electronically through the IEEE ICDM
CyberChair system. We do not accept email submissions.
Authors of accepted papers will be asked to give a presentation
(short or long) at the workshop. Pre-proceedings containing all
accepted papers will be included in the IEEE ICDM 2016 Workshops
Proceedings volume published by IEEE Computer Society Press, and will
also be included in the IEEE Xplore Digital Library. After the workshop,
contributing authors will be invited to submit a paper to a special
issue (journal to be announced).
- Important dates -
Submission deadline: August 12, 2016
Notification of acceptance: September 13, 2016
Camera-ready: September 20, 2016
Workshop: December 12, 2016
- Program chairs -
Ben Zorn Microsoft Research (Washington)
Cèsar Ferri Technical University of Valencia (Spain)
Atakan Cetinsoy BigML (Oregon)
Gustavo Soares Berkeley (California)
Fernando Martínez-Plumed Technical University of Valencia (Spain)
- Program committee -
Luc De Raedt Katholieke Universiteit Leuven
Peter Flach University of Bristol
José Hernández-Orallo Technical University of Valencia
Bongshin Lee Microsoft Research
Ute Schmid University of Bamberg
Mary Roth IBM Research
Armando Solar-Lezama Massachusetts Institute of Technology
Rishabh Singh Microsoft Research
Gemma C. Garriga Allianz SE
Janis Voigtländer University of Bonn
Ricardo Aler Mur University Carlos III
Umair Z. Ahmed Indian Institute of Technology
- Contact Person -
Cèsar Ferri
(web) http://www.dsic.upv.es/~cferri/
(email) [email protected]