Yes, this is part of Matei's current research, for which code is not yet publicly available at all, much less in a form suitable for production use.
On Mon, Dec 26, 2016 at 2:29 AM, Evan Chan <vel...@gmail.com> wrote:
> Looks pretty interesting, but might honestly take a while.
>
> On Dec 25, 2016, at 5:24 PM, Mark Hamstra <m...@clearstorydata.com> wrote:
>
> Not so much about sharing between applications, rather between multiple
> frameworks within an application, but still related:
> https://cs.stanford.edu/~matei/papers/2017/cidr_weld.pdf
>
> On Sun, Dec 25, 2016 at 8:12 PM, Kazuaki Ishizaki <ishiz...@jp.ibm.com>
> wrote:
>
>> Here is an interesting discussion about sharing data in columnar storage
>> between two applications.
>> https://github.com/apache/spark/pull/15219#issuecomment-265835049
>>
>> One of the ideas is to prepare interfaces (or traits) only for read or
>> write. Each application can then implement only the class it needs (e.g.
>> read or write). For example, FiloDB wants to provide columnar storage
>> that can be read from Spark; in that case, it is easy to implement only
>> the read APIs for Spark. These two classes can be prepared separately.
>> However, it may lead to incompatibility in ColumnarBatch. ColumnarBatch
>> keeps a set of ColumnVectors that can be read or written, so the
>> ColumnVector class must have both read and write APIs. How can we plug in
>> a new ColumnVector that has only read APIs? Here is an example of the
>> resulting incompatibility:
>> https://gist.github.com/kiszk/00ab7d0c69f0e598e383cdc8e72bcc4d
>>
>> Another possible idea is that both applications support the Apache Arrow
>> APIs.
>> Other approaches are also possible.
>>
>> What approach would be good for all applications?
>>
>> Regards,
>> Kazuaki Ishizaki
>>
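For readers following the quoted thread, the sketch below illustrates the "separate read and write interfaces" idea: a reader-only trait that an engine can scan, and a writer trait layered on top, so a source that only produces data for others to read (the FiloDB case) never has to implement write methods. All names here (ReadableColumnVector, WritableColumnVector, SimpleIntVector, ReadOnlyBatch) are hypothetical and are not Spark's actual ColumnVector/ColumnarBatch API; this is a minimal illustration of the design under those assumptions, not a proposed implementation.

// Hypothetical sketch of splitting read and write APIs into separate traits.
// None of these names are Spark's actual API.

// A read-only view of a column: enough for an engine (e.g. Spark) to scan.
trait ReadableColumnVector {
  def numRows: Int
  def isNullAt(row: Int): Boolean
  def getInt(row: Int): Int
}

// Write operations live in a separate trait, so a read-only source never
// has to implement (or stub out) any put methods.
trait WritableColumnVector extends ReadableColumnVector {
  def putInt(row: Int, value: Int): Unit
  def putNull(row: Int): Unit
}

// A minimal in-memory implementation of both traits, for illustration only.
final class SimpleIntVector(capacity: Int) extends WritableColumnVector {
  private val values = new Array[Int](capacity)
  private val nulls  = new Array[Boolean](capacity)

  override def numRows: Int = capacity
  override def isNullAt(row: Int): Boolean = nulls(row)
  override def getInt(row: Int): Int = values(row)
  override def putInt(row: Int, value: Int): Unit = { values(row) = value; nulls(row) = false }
  override def putNull(row: Int): Unit = { nulls(row) = true }
}

// A batch that only needs to read holds the narrower type, so a read-only
// vector coming from an external store plugs in without any write stubs.
final class ReadOnlyBatch(columns: Array[ReadableColumnVector]) {
  def column(i: Int): ReadableColumnVector = columns(i)
}

object Demo extends App {
  val v = new SimpleIntVector(3)
  v.putInt(0, 42); v.putNull(1); v.putInt(2, 7)

  // A writable vector is still usable wherever only reads are needed.
  val batch = new ReadOnlyBatch(Array[ReadableColumnVector](v))
  println(batch.column(0).getInt(0))   // 42
  println(batch.column(0).isNullAt(1)) // true
}

The split also makes the ColumnarBatch concern explicit: whether a batch requires the writable trait or only the readable one is essentially the compatibility question raised in the quoted message and in the linked gist.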