Sorry, it looks like the message was sent multiple times. Let's use this thread for the discussion!
On June 13, 2025 8:11:21 AM GMT+02:00, Sem <ssinche...@apache.org> wrote:
>Hello!
>
>At the moment, DataType in the format spec is an enumeration:
>
>```
>enum DataType {
>  BOOL = 0;
>  INT32 = 1;
>  INT64 = 2;
>  FLOAT = 3;
>  DOUBLE = 4;
>  STRING = 5;
>  LIST = 6;
>  DATE = 7;
>  TIMESTAMP = 8;
>  TIME = 9;
>};
>```
>
>But this leaves it unclear what the element type of a LIST can be. In
>practice, LIST is written as `list<>` in the output YAML:
>
>```
>  - properties:
>      - name: feature
>        data_type: list<float>
>        is_primary: false
>```
>
>but, from my point of view, this does not match the format
>specification.
>
>I would like to propose updating the format definition so that each
>possible DataType is a message instead of an enum value. Something like:
>
>```
>message BOOL {
>  string name = 1;
>};
>message INT32 {
>  string name = 1;
>};
>message INT64 {
>  string name = 1;
>};
>...
>message LIST {
>  string name = 1;
>  // Field names and numbers below are placeholders; numbering starts
>  // at 2 so it does not collide with `name = 1`.
>  oneof element_type {
>    BOOL bool_type = 2;
>    INT32 int32_type = 3;
>    INT64 int64_type = 4;
>    ...;
>  }
>}
>```
>
>This assumes we are not going to support nested collections.
>
>In a real YAML file it would look like:
>
>```
>  - properties:
>      - name: feature
>        data_type:
>          name: list
>          element_type:
>            name: float
>        is_primary: false
>```
>
>Motivation for the proposed change: the current approach leaves the
>handling of nested types to each implementation (for example, the C++
>implementation writes it as `list<float>`). We should define this in
>the standard spec instead!
>
>
>If there is no negative feedback, I will open a formal VOTE.
>
>Best regards,
>Sem
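
To make the difference between the two representations concrete, here is a rough sketch of how a reader could map the current string form (`list<float>`) onto the nested form proposed above. This is only an illustration: the helper name `parse_data_type` and the lowercase type names in `PRIMITIVE_TYPES` are assumptions made for this example, not part of the GraphAr spec or of any existing implementation.

```python
# Hypothetical helper for illustration only; not part of GraphAr.
# Assumed lowercase spellings of the non-list DataType values.
PRIMITIVE_TYPES = {
    "bool", "int32", "int64", "float", "double",
    "string", "date", "timestamp", "time",
}


def parse_data_type(text):
    """Map a legacy type string such as 'list<float>' to the nested form."""
    text = text.strip()
    if text in PRIMITIVE_TYPES:
        return {"name": text}
    if text.startswith("list<") and text.endswith(">"):
        element = text[len("list<"):-1].strip()
        # The proposal excludes nested collections, so the element
        # type must be one of the primitive types.
        if element not in PRIMITIVE_TYPES:
            raise ValueError(f"unsupported list element type: {element!r}")
        return {"name": "list", "element_type": {"name": element}}
    raise ValueError(f"unknown data type: {text!r}")
```

With this sketch, `parse_data_type("list<float>")` yields a mapping with `name: list` and a nested `element_type` whose `name` is `float`, while something like `list<list<int32>>` is rejected, in line with the "no nested collections" assumption.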