Not so much about sharing between applications, but rather between multiple frameworks within one application; still related: https://cs.stanford.edu/~matei/papers/2017/cidr_weld.pdf
On Sun, Dec 25, 2016 at 8:12 PM, Kazuaki Ishizaki <ishiz...@jp.ibm.com> wrote:
> Here is an interesting discussion about sharing data in columnar storage
> between two applications:
> https://github.com/apache/spark/pull/15219#issuecomment-265835049
>
> One of the ideas is to prepare interfaces (or traits) for read-only or
> write-only access. Each application can then implement only the class for
> what it wants to do (e.g. read or write). For example, FiloDB wants to
> provide a columnar storage that can be read from Spark. In that case, it is
> easy to implement only the read APIs for Spark, and the two classes can be
> prepared separately.
> However, this may lead to incompatibility in ColumnarBatch. ColumnarBatch
> keeps a set of ColumnVector instances that can be read or written, so the
> ColumnVector class should have both read and write APIs. How can we plug in
> a new ColumnVector with only read APIs? Here is an example showing the
> incompatibility:
> https://gist.github.com/kiszk/00ab7d0c69f0e598e383cdc8e72bcc4d
>
> Another possible idea is that both applications support Apache Arrow APIs.
> Other approaches could also work.
>
> What approach would be good for all of these applications?
>
> Regards,
> Kazuaki Ishizaki
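
To make the read/write split from the quoted message concrete, here is a rough sketch of how the interface hierarchy could look. All names here (ReadOnlyColumnVector, WritableColumnVector, IntArrayColumn) are illustrative, not Spark's actual API; a storage engine like FiloDB would implement only the read side, while batch-consuming code accepts the read-only type:

```java
// Hypothetical split of column access into a read-only interface and a
// writable sub-interface, so that a read-only storage engine can be
// plugged into code that only reads.

interface ReadOnlyColumnVector {
    int numRows();
    int getInt(int rowId);
}

// Writers get the read API plus mutation methods.
interface WritableColumnVector extends ReadOnlyColumnVector {
    void putInt(int rowId, int value);
}

// A read-only backing store implements just the read interface.
final class IntArrayColumn implements ReadOnlyColumnVector {
    private final int[] data;
    IntArrayColumn(int[] data) { this.data = data; }
    public int numRows() { return data.length; }
    public int getInt(int rowId) { return data[rowId]; }
}

public class ColumnVectorSketch {
    // Read-only consumers (e.g. a scan over a ColumnarBatch) declare the
    // read-only type, so they accept either kind of vector.
    static long sumColumn(ReadOnlyColumnVector col) {
        long sum = 0;
        for (int i = 0; i < col.numRows(); i++) {
            sum += col.getInt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        ReadOnlyColumnVector col = new IntArrayColumn(new int[]{1, 2, 3});
        System.out.println(sumColumn(col)); // prints 6
    }
}
```

The incompatibility the message describes shows up when a batch container hard-codes the writable type: a method taking WritableColumnVector cannot accept an IntArrayColumn, so the container itself would need to be declared against the read-only interface.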