Monday
June 2
4:00 - 4:50 PM
Kelley 1001
Christopher Scaffidi
Ph.D. Candidate
School of Computer Science
Carnegie Mellon University
Topes: Reusable Abstractions for Validating and Reformatting Data
Programmers often omit input validation when inputs can appear in many
different formats or when validation criteria cannot be precisely
specified. To enable validation in these situations, we present a new
technique that puts valid inputs into a consistent format and that
identifies "questionable" inputs which might be valid or invalid, so
that these values can be double-checked by a person or a program. Our
technique relies on the concept of a "tope", which is an application-
independent abstraction describing how to recognize and transform
values in a category of data. We present our definition of topes and
describe a development environment that supports the implementation
and use of topes. Experiments with web application and spreadsheet
data indicate that using our technique improves the accuracy and
reusability of validation code and also improves the effectiveness of
subsequent data cleaning such as duplicate identification.
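
The idea in the abstract can be illustrated with a minimal Python sketch.
This is not the speaker's implementation: the names (Format, validate,
PHONE_FORMATS), the phone-number category, the 0.5 score, and the
"leading 0 or 1 is unusual" rule are all assumptions made for
illustration. The sketch only shows the two ideas described above: each
format of a data category has a recognizer that scores an input as valid,
invalid, or questionable, and every recognized input is rewritten into one
consistent format.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Format:
    """One format of a data category (hypothetical structure)."""
    # Returns 0.0 for invalid, 1.0 for valid, anything between for questionable.
    recognize: Callable[[str], float]
    # Rewrites a recognized value into the category's consistent format.
    to_canonical: Callable[[str], str]


def _digits(value: str) -> str:
    return re.sub(r"\D", "", value)


def _recognizer(pattern: str) -> Callable[[str], float]:
    compiled = re.compile(pattern)

    def recognize(value: str) -> float:
        if not compiled.fullmatch(value.strip()):
            return 0.0
        # Assumed soft constraint: a leading 0 or 1 is unusual, so the
        # value is flagged as questionable instead of being rejected.
        return 0.5 if _digits(value)[0] in "01" else 1.0

    return recognize


def _canonical(value: str) -> str:
    d = _digits(value)
    return f"{d[0:3]}-{d[3:6]}-{d[6:10]}"


# Assumed example category: ten-digit phone numbers in three formats.
PHONE_FORMATS: Dict[str, Format] = {
    "dashed": Format(_recognizer(r"\d{3}-\d{3}-\d{4}"), _canonical),
    "parenthesized": Format(_recognizer(r"\(\d{3}\) ?\d{3}-\d{4}"), _canonical),
    "dotted": Format(_recognizer(r"\d{3}\.\d{3}\.\d{4}"), _canonical),
}


def validate(
    value: str, formats: Dict[str, Format] = PHONE_FORMATS
) -> Tuple[str, Optional[str]]:
    """Return (status, canonical value), using the best-scoring format."""
    best_name, best_score = None, 0.0
    for name, fmt in formats.items():
        score = fmt.recognize(value)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None:
        return ("invalid", None)
    status = "valid" if best_score == 1.0 else "questionable"
    return (status, formats[best_name].to_canonical(value))
```

In this sketch, a caller would accept "valid" results, reject "invalid"
ones, and route "questionable" ones to a person or a second program for
double-checking, while storing the reformatted value in every accepted
case.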
Biography
Christopher Scaffidi is a fourth-year Ph.D. student in the software
engineering program at Carnegie Mellon University. Before graduate
school, he worked as a professional web application developer for six
years, so he is well acquainted with the pains of validating data in
real applications.