> I don't think we can change the behavior of the "default" key. Otherwise,
> older readers would use the wrong value.

This is true, but the "human-readable default" feature is inherently
incompatible with older readers. My hope was that giving an invalid type for
the default would cause an error when older readers try to parse it, but
that's not the case and you're right. There would still always be an issue
with specially crafted record types.

> I suggest that we add an optional key, like "default-as-string", that is
> used to fill in a missing "default" key if there is a reasonable
> conversion.

So then if an older reader reads a schema field with "default-as-string" used
instead of "default", it will decide that field has no default? I don't
really like that, but it's better than using the wrong value (e.g. "default"
+ "default-parser") or erroring on most data reads (changing the "default"
field to an object).

I don't think we can make old readers fail properly, since they would have to
already have the future knowledge that there is supposed to be a default
value. Someone correct me if I'm wrong on this. (Generically it should be
possible if we included schema spec versions in schemas.)

What would be your criteria for there being a reasonable conversion? Field
type and logical type?

> On write, the write schema would convert to the normal "default" field for
> backward-compatibility.

Good idea - this should be generically possible no matter how human-readable
defaults are implemented in the spec.

> On read, you can supply only the string default to use that instead of the
> binary one. I think we could take care of this entirely in the schema
> parser.

On the same page here.

- Bridger Howell

On Wed, Oct 18, 2017 at 9:56 AM, Ryan Blue <[email protected]> wrote:

> I don't think we can change the behavior of the "default" key. Otherwise,
> older readers would use the wrong value.
>
> I suggest that we add an optional key, like "default-as-string", that is
> used to fill in a missing "default" key if there is a reasonable
> conversion.
> On write, the write schema would convert to the normal
> "default" field for backward-compatibility. On read, you can supply only
> the string default to use that instead of the binary one. I think we could
> take care of this entirely in the schema parser.
>
> rb
>
> On Tue, Oct 17, 2017 at 11:53 PM, Bridger Howell <[email protected]> wrote:
>
> > I really like the idea of having support for human-readable default
> > values.
> >
> > I think I prefer to keep the way defaults are interpreted separate from
> > logical types, since logical types are basically optional. I would
> > be surprised if my language of choice could understand an ISO-8601
> > formatted local-date for a field default based on logical type, but I
> > still had to interface with a numeric value in my code.
> >
> > If this doesn't conflict too much with the default value for record
> > fields (?), I would suggest having an object syntax with a "parser" or
> > "type" field in addition to the default property.
> >
> > A sample record:
> > {
> >   "type": "record",
> >   "name": "Foo",
> >   "fields": [
> >     {
> >       "name": "body",
> >       "type": "bytes",
> >       "default": {
> >         "value": "aGVsbG8gd29ybGQ",
> >         "parser": "base64",
> >         "doc": "'hello world' as a base64-encoded string"
> >       }
> >     }
> >   ]
> > }
> >
> > If changing the "default" property like that has too many issues, I
> > suppose a parallel "default-parser" property would do the trick too.
> >
> > I think this type of approach keeps us neatly separated from logical
> > types, so that having a parser for a default value doesn't require a
> > logical type, and maybe makes it clearer which procedure is being
> > performed on the JSON data to convert it to the base field type.
> >
> > -Bridger Howell
> >
> > On Tue, Oct 17, 2017 at 9:57 AM, Ryan Blue <[email protected]>
> > wrote:
> >
> > > I think that the parsing canonical form of a schema
> > > <https://avro.apache.org/docs/1.8.2/spec.html#Parsing+Canonical+Form+for+Schemas>
> > > doesn't include the default. I think that makes sense because the
> > > canonical form is what's needed to read encoded data. Anyone with more
> > > context: is that correct?
> > >
> > > In my opinion, that makes how we handle defaults a bit more flexible
> > > because schemas with different defaults are "the same". I'd support
> > > adding a new default field that handles values more naturally. We've
> > > always had a problem with binary as well and I'd like to see us use
> > > base64 encoded values instead of the current strategy.
> > >
> > > rb
> > >
> > > On Tue, Oct 17, 2017 at 8:16 AM, Zoltan Ivanfi <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I would like to start a discussion about making default values and
> > > > values in general human-readable for logical types.
> > > >
> > > > Currently default values for logical types have to be specified in a
> > > > JSON string as the binary representation of the backing primary type
> > > > (e.g., "\u0000"). Some users intuitively try to specify a
> > > > human-readable logical value in this string instead (e.g., "0.00").
> > > > This is of course a valid byte sequence and as such is accepted, but
> > > > it results in unexpected behaviour (a different default value than
> > > > intended). Apart from being error prone, specifying default values
> > > > this way is also tedious. To keep this e-mail brief, I won't list
> > > > specific examples here, please see AVRO-2087
> > > > <https://issues.apache.org/jira/browse/AVRO-2087> for details instead.
> > > >
> > > > The problem of non-human-readable values applies to JSON encoding of
> > > > actual data as well.
> > > > One reason for using JSON is that it is human readable and
> > > > therefore easy to debug. Seeing "\u00018" in a JSON file is not too
> > > > intuitive and this specific example is actually quite misleading as
> > > > well (it can be easily misread as "\u0018").
> > > >
> > > > Introducing a new default value field (called human-readable-default
> > > > or logical-default for example) would allow easier specification of
> > > > default values. (It doesn't solve the problem of accidentally misusing
> > > > the existing field though.) It is, however, not backwards compatible.
> > > > An older Avro library would ignore the new field and use a different
> > > > default value.
> > > >
> > > > Introducing human-readable values in the JSON encoding is even more
> > > > clearly a breaking change. (Although for JSON we could add the
> > > > human-readable value as a separate extra field that gets ignored when
> > > > reading. Problem is, users may be tempted to change the value and be
> > > > surprised. It's a pity that JSON does not allow comments.)
> > > >
> > > > In your opinions, what would be the best way to deal with this
> > > > problem?
> > > >
> > > > Thanks,
> > > >
> > > > Zoltan
> > >
> > > --
> > > Ryan Blue
> > > Software Engineer
> > > Netflix
> >
> > --
> >
> > The information contained in this email message is PRIVATE and intended
> > only for the personal and confidential use of the recipient named above.
> > If the reader of this message is not the intended recipient or an agent
> > responsible for delivering it to the intended recipient, you are hereby
> > notified that you have received this message in error and that any
> > review, dissemination, distribution or copying of this message is
> > strictly prohibited. If you have received this communication in error,
> > please notify us immediately by email, and delete the original message.
>
> --
> Ryan Blue
> Software Engineer
> Netflix

--
Bridger Howell
Software Engineer
