Hi all,

Just to clarify: yes, Arrow intends to define network protocols. The file format is merely the network messages written to a file. We are also looking into IPC (inter-process communication) using shared memory.
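To make the "file is just the network messages" idea concrete, here is a minimal Python sketch. It is not Arrow's actual wire format: the 4-byte length prefix, the placeholder payloads, and the names write_message and read_messages are all assumptions made up for this illustration. It only shows that one framing routine can serve both a file and a socket.

```python
import io
import socket
import struct


def write_message(sink, payload: bytes) -> None:
    """Frame one message as a 4-byte little-endian length plus the payload.

    This framing is hypothetical, not the Arrow format.
    """
    sink.write(struct.pack("<I", len(payload)))
    sink.write(payload)


def read_messages(source):
    """Yield payloads until the source runs out of complete headers."""
    while True:
        header = source.read(4)
        if len(header) < 4:
            return
        (length,) = struct.unpack("<I", header)
        yield source.read(length)


# Placeholder payloads standing in for a schema and two record batches.
messages = [b"schema", b"record batch 0", b"record batch 1"]

# "File": the framed messages written back to back into a byte buffer.
file_like = io.BytesIO()
for m in messages:
    write_message(file_like, m)
file_like.seek(0)
from_file = list(read_messages(file_like))

# "Network": the same framed messages sent over a local socket pair.
sender, receiver = socket.socketpair()
with sender.makefile("wb") as out:
    for m in messages:
        write_message(out, m)
sender.close()  # signals end of stream to the reader
with receiver.makefile("rb") as inp:
    from_socket = list(read_messages(inp))
receiver.close()
```

Both readers recover the same list of messages, which is the property described above: the reader does not care whether the bytes came from a file or from the network.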
On Thu, Nov 3, 2016 at 5:51 AM, Donald Foss <[email protected]> wrote:
> Abdulrahman, your schema diagram did not come through, at least not in a
> way I could view it in Mac Mail. Looking at the message source, I don’t
> see the specified Content ID [cid] or inline data element for the graphic.
>
> Generally speaking, I believe the Arrow project defines data structures,
> file formats, and in-memory processing methods and various corresponding
> properties. As a project follower who eagerly tests nightlies and comments
> on new release candidates, I definitely do not speak for anyone other than
> myself. That said, I do not believe that Apache Arrow intended to include
> network and messaging protocols. There are a large number of those
> available, from the ever popular 0MQ, to what is becoming my new favorite,
> Cap’n Proto (https://github.com/sandstorm-io/capnproto), along with its
> Java compatibility repository (https://github.com/sandstorm-io/capnproto).
> Note that I have no relationship with that project except technical
> jealousy.
>
> Side note: FWIW, even though I don’t know exactly what you’re doing, if
> it’s streaming, I generally go with Flink.
>
> -- Donald
>
> > On Nov 3, 2016, at 4:10 AM, Abdulrahman Kaitoua <[email protected]> wrote:
> >
> > Dears,
> >
> > I would like to get more information from you in order for me to use
> > Arrow and be able to contribute in the near future.
> >
> > What I see in Arrow is that I can read and write Arrow files (from the
> > vector test classes), but I did not see tests for sending data over a
> > network. As I understood from the project proposal (correct me if I am
> > wrong), I can write an Arrow Array from somewhere and read it from
> > somewhere else. This means that Arrow would be a centralised server
> > that holds state: engines would connect to it to write Arrow Arrays and
> > other engines would read them (like in the picture below). How far is
> > Arrow from having this centralised system? Where are we now?
> >
> > I am working on an application that moves data while changing the
> > schema between the source and the destination, for example moving data
> > from Apache Spark to Apache Flink and changing the schema in between.
> >
> > Regards,
> >
> > ------------------------------------------------------
> > Abdulrahman Kaitoua
> >
> > Ph.D. Candidate at Politecnico di Milano
> >
> > Department of Electronics, Information and Bioengineering
> > Piazza Leonardo da Vinci 32 - 20133 Milano, Italy
> >
> > Tel. Lab: +39 02 2399 3631

-- 
Julien
