Hi, folks. There currently seems to be a buzz around "data contracts". From what I can tell, the proposals are mainly cultural/organisational. But could big data tools instead be used to enforce these contracts?
My questions really are: are there any plans to implement data constraints in Spark (e.g., an integer must be between 0 and 100, or the date in column X must be before the date in column Y)? And if not, is there an appetite for them? One idea would be to associate constraints with the schema metadata and enforce them in the implementation of FileFormatDataWriter.

Just throwing it out there and wondering what other people think. It's an area that interests me, since over half of my problems at the day job come down to dodgy data. To make the idea a bit more concrete, a very rough sketch follows.
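This is purely illustrative and only uses public APIs that exist today; the "constraint" metadata key and the helper names are things I've invented for the example, and a real implementation would presumably live inside the write path rather than being a separate pre-write check:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.{MetadataBuilder, StructField, StructType}

object ConstraintSketch {

  // Attach a constraint, expressed as a SQL boolean expression, to a field's
  // metadata. The "constraint" key is made up for this sketch.
  def withConstraint(field: StructField, expr: String): StructField =
    field.copy(metadata = new MetadataBuilder()
      .withMetadata(field.metadata)
      .putString("constraint", expr)
      .build())

  // Check every constraint declared in the schema against a DataFrame and
  // fail if any row violates one. In the proposal this would happen as part
  // of the write itself.
  def validate(df: DataFrame, schema: StructType): Unit =
    schema.fields
      .filter(_.metadata.contains("constraint"))
      .foreach { f =>
        val constraint = f.metadata.getString("constraint")
        val violations = df.filter(s"NOT ($constraint)").count()
        require(violations == 0,
          s"Constraint '$constraint' on column '${f.name}' violated by $violations row(s)")
      }
}
```

Usage would then be something like building the schema with `withConstraint(StructField("score", IntegerType), "score BETWEEN 0 AND 100")` or `withConstraint(StructField("start_date", DateType), "start_date < end_date")`, and calling `validate(df, schema)` before `df.write`. The point isn't this particular API, just that the constraints travel with the schema and get enforced by the writer rather than by convention.

Regards, Phillip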