I'm confused. TimestampStatistics uses integers, not strings.

.. Owen
On Mon, Jun 5, 2017 at 9:53 PM, Dain Sundstrom <d...@iq80.com> wrote:

> > On Dec 12, 2016, at 4:48 PM, Dain Sundstrom <d...@iq80.com> wrote:
> >
> >> On Dec 12, 2016, at 4:36 PM, Owen O'Malley <omal...@apache.org> wrote:
> >>
> >>> I think this should also be documented in the statistics section, which
> >>> also uses UTF-16 BE, which is at least consistent, but still annoying
> >>> for everything other than Java.
> >>
> >> Yes, it should be documented and we should replace it with UTF-8.
> >> (Although changes to the serialized form are always painful.)
> >
> > I think we can do something similar to the bloom filter code, where we
> > add a StringUtf8Stats object and have a transition period where we can
> > produce both.
>
> I was looking at the proto changes to TimestampStatistics, and I think
> the same thing could work here. We add:
>
>   optional string minimumUtf8 = 4;
>   optional string maximumUtf8 = 5;
>
> and then update the writer to write just the UTF-8 version (or both
> during a transition).
>
> -dain
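
For anyone following the thread, here is a rough sketch of the "write both during a transition" idea. It is not ORC code: a plain dict stands in for the statistics message, `min_max` and `build_stats` are hypothetical helpers, and the legacy field names `minimum` / `maximum` are assumed; only `minimumUtf8` / `maximumUtf8` come from the proposal above. It also assumes the two variants differ in how strings are ordered (UTF-16 BE code units versus UTF-8 bytes), which is one reason the two pairs of values could disagree.

```python
def min_max(values, encoding):
    """Return (min, max) of values, ordered by their bytes in `encoding`."""
    def key(s):
        return s.encode(encoding)
    return min(values, key=key), max(values, key=key)


def build_stats(values, write_legacy=True):
    """Build a dict standing in for the statistics message (hypothetical)."""
    stats = {}
    # Proposed fields from the thread: ordering by UTF-8 bytes.
    stats["minimumUtf8"], stats["maximumUtf8"] = min_max(values, "utf-8")
    if write_legacy:
        # Assumed legacy fields: ordering by UTF-16 BE code units.
        stats["minimum"], stats["maximum"] = min_max(values, "utf-16-be")
    return stats


# U+FF5E is a BMP character; U+1F600 needs a surrogate pair in UTF-16.
values = ["apple", "\uff5e", "\U0001f600"]

both = build_stats(values)                          # transition period
utf8_only = build_stats(values, write_legacy=False)  # after the transition
```

With these inputs the two orderings pick different maxima: the surrogate pair for U+1F600 starts with 0xD83D, which sorts below 0xFF5E in UTF-16, while its UTF-8 encoding starts with 0xF0, which sorts above the 0xEF lead byte of U+FF5E. A reader that only understands the old fields keeps working during the transition, and a new reader can prefer the UTF-8 fields when they are present.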