Re: msgpack handling lists with elements of different types

2018-10-17 Thread Jean-Claude Cote
I see what you mean when you say a schema only gets you so far. Your Fred Flintstone example shows how you almost need the ability to apply a transformation at the reader level (instead of at the projection level) to properly read such data files. I think I agree with Charles Givre. I've always liked the tag

Re: msgpack handling lists with elements of different types

2018-10-17 Thread Paul Rogers
Hi JC, Bingo, you just hit the core problem with schema-on-read: there is no "right" rule for how to handle ambiguous or inconsistent schemas. Take your string/binary example. You determined that the binary fields were actually strings (encoded in what, UTF-8? ASCII? Host's native codeset?)

msgpack handling lists with elements of different types

2018-10-17 Thread Jean-Claude Cote
I'm writing a msgpack reader and have encountered datasets where an array contains different types, for example a VARCHAR and a BINARY. It turns out the BINARY is actually a string. I know this is probably just not modeled correctly in the first place, but I'm still going to modify the reading of list
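
For illustration only, here is a minimal Python sketch of the kind of reader-level normalization discussed in this thread. It is not the actual Drill reader code; the function name `normalize_mixed_list` and the default of UTF-8 are assumptions made for the example. It takes an already-decoded list and turns binary elements into strings only when they decode cleanly in the chosen encoding, leaving genuinely binary values untouched.

```python
from typing import Any, List


def normalize_mixed_list(values: List[Any], encoding: str = "utf-8") -> List[Any]:
    """Hypothetical helper: decode binary elements that are really text.

    Elements that are not bytes are passed through unchanged. Bytes that
    fail to decode in `encoding` are kept as bytes rather than guessed at.
    """
    result = []
    for value in values:
        if isinstance(value, (bytes, bytearray)):
            try:
                result.append(bytes(value).decode(encoding))
            except UnicodeDecodeError:
                # Not valid text in the assumed encoding: keep it binary.
                result.append(bytes(value))
        else:
            result.append(value)
    return result


# A list mixing a string, a "binary" that is really text, and true binary.
row = ["fred", b"wilma", b"\xff\xfe"]
print(normalize_mixed_list(row))
```

The choice of encoding is exactly the ambiguity raised in the reply: the same bytes may decode under one codeset and fail under another, so a reader would likely need it as an explicit option rather than a hard-coded rule.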