pitrou commented on code in PR #41958:
URL: https://github.com/apache/arrow/pull/41958#discussion_r1627814694


##########
docs/source/format/Columnar.rst:
##########
@@ -70,21 +70,131 @@ concepts, here is a small glossary to help disambiguate.
   without taking into account any value semantics. For example, a
   32-bit signed integer array and 32-bit floating point array have the
   same layout.
-* **Parent** and **child arrays**: names to express relationships
-  between physical value arrays in a nested type structure. For
-  example, a ``List<T>``-type parent array has a T-type array as its
-  child (see more on lists below).
+* **Data type**: An application-facing semantic value type that is
+  implemented using some physical layout. For example, Decimal128
+  values are stored as 16 bytes in a fixed-size binary
+  layout. A timestamp may be stored as 64-bit fixed-size layout.
 * **Primitive type**: a data type having no child types. This includes
   such types as fixed bit-width, variable-size binary, and null types.
 * **Nested type**: a data type whose full structure depends on one or
   more other child types. Two fully-specified nested types are equal
   if and only if their child types are equal. For example, ``List<U>``
   is distinct from ``List<V>`` iff U and V are different types.
-* **Logical type**: An application-facing semantic value type that is
-  implemented using some physical layout. For example, Decimal
-  values are stored as 16 bytes in a fixed-size binary
-  layout. Similarly, strings can be stored as ``List<1-byte>``. A
-  timestamp may be stored as 64-bit fixed-size layout.
+* **Parent** and **child arrays**: names to express relationships
+  between physical value arrays in a nested type structure. For
+  example, a ``List<T>``-type parent array has a T-type array as its
+  child (see more on lists below).
+* **Parametric type**: a type which requires additional parameters
+  for full determination of its semantics. For example, all nested types
+  are parametric by construction. A timestamp is also parametric as it needs
+  a unit (such as microseconds) and a timezone.

Review Comment:
   Ahah, good question. In the context of the Flatbuffers serialization, yes. 
But there's only a small fixed number of possible parameters, so this is less 
obvious than for timestamps or nested types, where there's a virtual infinity 
of possibilities.
   
   As @jorisvandenbossche rightfully pointed out below, the Flatbuffers 
serialization in the IPC format is only one possible representation of the type 
system; others could work differently; as a matter of fact, the [C Data 
Interface](https://arrow.apache.org/docs/format/CDataInterface.html#data-type-description-format-strings)
 uses entirely different format codes for these types.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to