On 3/13/2013 9:33 AM, Tom Morris wrote:

> Just to point out the opposing view, there is the school that says
> humans should just enter things in whatever format they want and it's up
> to the machines to figure it out.  As an example, ISBN 10 vs 13, dashes
> vs no dashes is trivial for a machine to deal with.  Still and all,
> better standards/documentation would be useful for the areas where it
> matters.

Yes, but ...

Data that gets persisted into a data store should be normalized so that
later consumers know exactly what they are getting and can process it
using common algorithmic techniques. Yes, machine processing should be
used to convert common input forms into a normalized form, but where
there is no automated way to normalize the data, one has to rely on
users to normalize it before entry, and users are virtually guaranteed
to do that unreliably.
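
To make the ISBN case concrete, here is a minimal sketch of what
"normalize before storing" could look like. It assumes the stored form
is a bare 13-digit ISBN and that ISBN-10 input is mapped onto the 978
prefix; the function names are mine for illustration, not anything in
the OL code base.

```python
import re


def isbn10_check_digit(first9: str) -> str:
    """Check character for the first nine digits of an ISBN-10."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check_digit(first12: str) -> str:
    """Check digit for the first twelve digits of an ISBN-13."""
    # Weights alternate 1, 3, 1, 3, ... from the left.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def normalize_isbn(raw: str) -> str:
    """Return a bare 13-digit ISBN, or raise ValueError if input is invalid."""
    # Accept whatever punctuation the user typed: drop spaces and dashes.
    s = re.sub(r"[\s-]", "", raw).upper()

    if re.fullmatch(r"[0-9]{9}[0-9X]", s):
        # Validate the ISBN-10, then convert it to the 978-prefixed form.
        if isbn10_check_digit(s[:9]) != s[9]:
            raise ValueError(f"bad ISBN-10 check digit: {raw!r}")
        body = "978" + s[:9]
        return body + isbn13_check_digit(body)

    if re.fullmatch(r"[0-9]{13}", s):
        if isbn13_check_digit(s[:12]) != s[12]:
            raise ValueError(f"bad ISBN-13 check digit: {raw!r}")
        return s

    raise ValueError(f"not an ISBN: {raw!r}")


print(normalize_isbn("0-306-40615-2"))
print(normalize_isbn("978 0 306 40615 7"))
```

Both calls should yield the same stored value, so a later consumer never
has to care which form was typed. Input that fails the check is rejected
rather than stored as-is.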

It seems that OL's strategy in the past has been to accept whatever gets 
input as canonical, which has led to the current state of the database. 
Standardization and documentation are not for the benefit of Jane Doe,
who is entering the data in a web form, but for the benefit of the
programmer who is (1) validating the data before accepting it or (2)
normalizing the data before storing it. If OL is unwilling to accept
responsibility for (1) and (2), then attempting to solve the problem
through documentation /will/ fail.

But having standards to guide our processes will at least tell us when
/we/ have failed, even if we can't expect the world at large to
understand the standards.


