----------------------------------------
> To: tutor@python.org
> From: alan.ga...@btinternet.com
> Date: Wed, 26 Aug 2015 17:29:08 +0100
> Subject: Re: [Tutor] value range checker
>
> On 26/08/15 14:19, Albert-Jan Roskam wrote:
>
>> I have written a function that checks the validity of values.
>> The ranges of valid values are stored in a database table.
>
> That's an unusual choice because:
>
> 1) Using a database normally only makes sense in the case
> where you are already using the database to store the
> other data. But in that case you would normally get
> validation done using a database constraint.
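For reference, here is a minimal sketch of such a database-level constraint using Python's built-in sqlite3 module (the table, column and constraint names are invented for illustration, and the exact error text depends on the SQLite version):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE measurements (
        category TEXT NOT NULL,
        value    INTEGER,
        -- naming the constraint makes a failure easier to trace
        CONSTRAINT value_in_range CHECK (value BETWEEN 0 AND 120)
    )
""")

try:
    conn.execute("INSERT INTO measurements VALUES ('age', 999)")
except sqlite3.IntegrityError as e:
    # Recent SQLite versions report the constraint name, e.g.
    # "CHECK constraint failed: value_in_range"; older versions
    # only name the table.
    print(e)

With one named constraint per column, a recent SQLite at least tells you which check failed, which bears on the question about exceptions below.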
The other data are indeed also stored in the database. But CHECK constraints seem an attractive mechanism to get validation done. It should be possible to define a CHECK constraint with CASE statements in the CREATE TABLE definition. This page even describes how to do a modulus-11 check, which I also need (a sketch of one common mod-11 variant follows at the end of this message):
https://www.simple-talk.com/sql/learn-sql-server/check-your-digits/
Will the Python exceptions be clear enough if a record is rejected because one or more constraints are not met? I mean, if I only get a ValueError or a TypeError because *one* of the 100 or so columns is invalid, that would be annoying.

> 2) For small amounts of data the database introduces
> a significant overhead. Databases are good for handling
> large amounts of data.
>
> 3) A database is rather inflexible since you need to
> initialise it, create it, etc. Which limits the number
> of environments where it can be used.
>
>> Such a table contains three columns: category, min and max. ...
>> a category may be spread out over multiple records.
>
> And searching multiple rows is even less efficient.
>
>> Would YAML be a better choice? Some of the tables are close to 200 records.
>
> Mostly I wouldn't use a data format per se (except for
> persistence between sessions). I'd load the limits into
> a Python set and let the validation be a simple member-of check.
>
> Unless you are dealing with large ranges rather than sets
> of small ranges. Even with complex options I'd still
> opt for a two-tier data structure. But mostly I'd query
> any design that requires a lot of standalone data validation.
> (Unless its function is to be a bulk data loader or similar.)
> I'd probably be looking at having the data stored as
> objects that did their own validation at creation/modification
> time.

The data are collected electronically, but also by paper and pencil. With a web page you can check all kinds of things right at the beginning, but that's not true for paper-and-pencil data collection. Bottom line is that *only* electronic data collection would make things easier.

> If I was doing a bulk data loader/checker I'd probably create
> a validation function for each category and add it to a
> dictionary. So I'd write a make_validator() function that
> took the validation data and created a specific validator
> function for that category. Very simple example:
>
> def make_validator(min, max, *values):
>     def validate(value):
>         # valid if inside [min, max] or explicitly listed
>         return (min <= value <= max) or (value in values)
>     return validate

This looks simple and therefore attractive! Thank you!

> lookup = {}
> for category in categories:
>     # min, max and valueList come from the limits table for this category
>     lookup[category] = make_validator(min, max, *valueList)
> ...
> if lookup[category](my_value):
>     # process valid value
> else:
>     raise ValueError
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
> _______________________________________________
> Tutor maillist - Tutor@python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
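P.S. As promised above, a minimal sketch of a modulus-11 check in Python. This uses ISBN-10-style weights (each digit weighted by its position, counting down from len(number) to 1, with the weighted sum required to be divisible by 11); the scheme in the simple-talk article, or the one your data actually needs, may weight the digits differently:

def mod11_ok(number: str) -> bool:
    # Weight each digit by its position, counting down from
    # len(number) to 1, and require the weighted sum to be
    # divisible by 11.  Assumes an all-numeric string; ISBN-10's
    # 'X' check digit is not handled.
    digits = [int(ch) for ch in number]
    weights = range(len(digits), 0, -1)
    return sum(w * d for w, d in zip(weights, digits)) % 11 == 0

A function like this drops straight into Alan's lookup dictionary alongside the range validators, e.g. lookup['id_number'] = mod11_ok (the category name is made up for illustration).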