With the goal of finalizing a spec for V1, I've created JIRAs on the format component. The list is below. Please discuss on the individual JIRAs if you have comments.
- [ARROW-253] restrict ints to only power of 2 #bytes (8, 16, 32, 64)
- [ARROW-254] remove Bit as we use Boolean for nullability array (validity vector)
- [ARROW-252] add implementation guidelines
- [ARROW-255] dictionary encoding spec
- [ARROW-258] need to clarify Buffer.{page,offset} in mem sharing and RPC/file contexts
- [ARROW-256] add format version
- [ARROW-257] add types vector to union type (to enable using type ids instead of child offset)

Full list: https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20component%20%3D%20Format%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC

-- Julien