It's "arbitrary" from Arrow's point of view, because Arrow itself cannot represent this data (except as a binary blob). Though, as Micah said, this may change at some point.
Instead of extending Arrow to fit this use case, perhaps it would be better to write a separate library that sits atop Arrow for your purposes? (A rough sketch of what such a layer could look like follows the quoted message below.)

Regards

Antoine.

On 26/04/2019 at 04:20, Shawn Yang wrote:
> Hi Antoine,
> It's not an arbitrary data type; the types are similar to the data types in
> https://spark.apache.org/docs/latest/sql-reference.html#data-types and
> https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/sql.html#data-types.
> Our framework is similar to Flink streaming, but is written in
> C++/Java/Python. Data needs to be transferred from a Java process to a
> Python process over TCP, or through shared memory if they are on the
> same machine. For example, one case is online learning: the features are
> generated in Java streaming, and the training data is then transferred
> to a Python TensorFlow worker for training. In systems such as Flink,
> data flows row by row, not in columnar form, so a serialization
> framework is needed to serialize data row by row in a
> language-independent way across C++/Java/Python.
>
> Regards
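To illustrate the suggestion above, here is a minimal sketch of what such a layer could look like, in Python with pyarrow. The RowBatchWriter class, its method names, and the example schema are invented for this sketch and are not part of any existing library; the only real pieces are the standard pyarrow IPC stream APIs, and the resulting stream can be consumed by the Java and C++ Arrow implementations as well.

import pyarrow as pa

# Hypothetical row-oriented layer on top of Arrow: buffer incoming rows,
# pivot them into columnar RecordBatches, and ship them over the Arrow IPC
# stream format (usable over TCP, files or shared memory).
class RowBatchWriter:
    def __init__(self, sink, schema, batch_size=1024):
        self.schema = schema
        self.batch_size = batch_size
        self.rows = []
        self.writer = pa.ipc.new_stream(sink, schema)

    def write_row(self, row):
        # Accept one row as a dict keyed by column name.
        self.rows.append(row)
        if len(self.rows) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.rows:
            return
        # Pivot the buffered rows into one Arrow array per column.
        columns = [pa.array([row[field.name] for row in self.rows],
                            type=field.type)
                   for field in self.schema]
        self.writer.write_batch(
            pa.RecordBatch.from_arrays(columns, names=self.schema.names))
        self.rows = []

    def close(self):
        self.flush()
        self.writer.close()

# Example: feature records as in the online-learning case described above.
schema = pa.schema([
    ("user_id", pa.int64()),
    ("features", pa.list_(pa.float32())),
])

sink = pa.BufferOutputStream()  # a socket or shared-memory file also works
writer = RowBatchWriter(sink, schema, batch_size=2)
writer.write_row({"user_id": 1, "features": [0.1, 0.2]})
writer.write_row({"user_id": 2, "features": [0.3, 0.4]})
writer.close()

# The consumer (Java, C++ or Python) reads the batches back; on the Python
# side a row-by-row view can be recovered like this:
reader = pa.ipc.open_stream(sink.getvalue())
for batch in reader:
    for row in batch.to_pylist():
        print(row)

Buffering rows into batches this way keeps Arrow's columnar efficiency on the wire while exposing the row-at-a-time API a Flink-style streaming framework expects.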