On Tue, Feb 28, 2017 at 5:56 PM, Michel Desmoulin <desmoulinmic...@gmail.com> wrote:

> Me, I have to deal with SOAP government systems, mongodb based API built by
> teenagers, geographer data set exports and FTP + CSV in marina systems
> (which I happen to work on right now).
>
> 3rd party CSV, XML and JSON processing are just a hundred lines of
> try/except on indexing because they have many listings, data position
> is important and a lot of systems got it wrong, giving you inconsistent
> output with missing data and terrible labeling.

I feel your pain -- data munging is often a major mess!

> And because life is unfair, the data you can extract is often a mix of
> heterogeneous mappings and lists / tuples. And your tool must manage the
> various versions of the data format they send to you, some with
> additional fields, or missing ones. Some named, others found by position.

If I were dealing with a mix of mappings and index-able data, and the
index-able data were often poorly formed (items missing), I think I'd put
it all in dicts -- some of which happened to have integers as keys. Or just
put a None everywhere there should be a value in a sequence that is
missing.

If data is coming in from a "schema-less" system, then what CAN you do with
a sequence that is inconsistent? How can you possibly know which items are
missing if the sequence is too short? If it is always the last N items,
then it's not hard to pad the sequence. If it's not -- then what you have
is a mapping that happens to have integers as keys.

Not trying to be harsh here -- I'm just not at all sure that adding a get()
to sequences is the right solution to these problems.

Maybe someone else will chime in with more "I'd really have a use for this"
examples.

-CHB

> This summer, I had to convert a data set provided by polls in Africa
> through an Android form, generated from an XML schema,

Actually, I'm surprised that the XML schema step didn't enforce that the
data be well formed. Isn't that the whole point of an XML schema?

-- but your point is well taken -- data are often not well formed.

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
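
A minimal sketch of the two options described in the reply above: padding a short sequence with None, or treating the data as a mapping that happens to have integers as keys. It assumes that any missing items are always the trailing ones. The helper names `pad` and `as_mapping` and the sample record are hypothetical, made up for illustration only.

```python
def pad(seq, length, fill=None):
    """Return a list padded at the end with `fill` up to `length` items.

    Assumes the missing items are always the last ones. A sequence that
    is already `length` items or longer is returned as an unchanged copy.
    """
    items = list(seq)
    return items + [fill] * (length - len(items))


def as_mapping(seq):
    """Turn a sequence into a dict keyed by integer position."""
    return dict(enumerate(seq))


# Hypothetical record that should have three fields but arrived with two.
row = ["Alice", "Marseille"]

padded = pad(row, 3)          # plain indexing now works for position 2
by_index = as_mapping(row)    # dict.get() already exists for mappings

third_from_padded = padded[2]
third_from_mapping = by_index.get(2, "unknown")
```

With the mapping form, the existing `dict.get()` covers the "missing position" case, which is the gist of the argument that sequences may not need a `get()` of their own.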
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/