Hi Johan,

I'm also very excited to see the possibilities of using Arrow with FPGAs. Would you be interested in adding this project to the Powered By page (http://arrow.apache.org/powered_by/)? If so, feel free to submit a pull request into the site/ portion of the project.

Best,
Wes

On Sun, Feb 11, 2018 at 6:42 AM, Uwe L. Korn <[email protected]> wrote:
> Dear Johan,
>
> this is an exciting use case for Arrow. Nice to hear about the benefits that
> Arrow brings to the world of FPGAs.
>
> Greetings
>
> Uwe
>
> On Fri, Feb 9, 2018, at 10:11 PM, Johan Peltenburg - EWI wrote:
>> Dear community,
>>
>> As a follow-up to the e-mail below, we have made public our repository
>> that contains our framework called Fletcher: A framework to integrate
>> FPGA accelerators with Apache Arrow.
>>
>> https://github.com/johanpel/fletcher
>>
>> With this framework you are able to provide an Arrow schema from which
>> an easy-to-use hardware interface for FPGAs is generated, reaping all
>> the benefits that Arrow already offers. On top of that it increases the
>> programmability of any acceleration project you'd want to build on top
>> of Arrow. During run-time, you simply pass your Arrow table to the
>> run-time part of the framework and your hardware will be able to read
>> from it by using row index ranges, receiving streams of data in the form
>> of the type you've defined through the schema.
>>
>> Currently there is an example project that does regular expression
>> matching on an Arrow table with strings, running on the Amazon EC2 F1
>> platform. We are not sponsored by Amazon, but as anyone can launch an
>> instance with an FPGA there, we thought it would be a good starting
>> point to hopefully gain some interest, even if you don't have an FPGA
>> card yourself.
>>
>> FPGA accelerators can be so fast that more often than not serialization
>> kills a relatively large part of the performance. Our measurements in
>> this (relatively simple) example show that by using Arrow to prevent
>> serialization, we sometimes get up to 6X improvement in performance over
>> not using Arrow, especially if we start in languages that run on JVMs,
>> for example. (Thanks everyone!)
>>
>> We look forward to people with a little bit of FPGA experience trying
>> it out, and to receiving their thoughts, comments, etc. Please drop me
>> an e-mail.
>>
>> With kind regards,
>>
>> Johan Peltenburg
>> Computer Engineering Lab
>> Delft University of Technology
>> ________________________________________
>> From: Johan Peltenburg [[email protected]]
>> Sent: Tuesday, November 28, 2017 16:29
>> To: [email protected]
>> Subject: Development of an FPGA Accelerator framework around Apache Arrow
>>
>> Dear community,
>>
>> Over the last year we have been looking into integration of FPGA
>> accelerators with big data frameworks such as Spark. Before Arrow took
>> off, we experienced many issues like serialization overhead but also
>> garbage collection issues, as well as language interoperability issues
>> with our low-level stack. These are all problems that Arrow is now
>> already solving for us in a very nice manner.
>>
>> We see growing support from infrastructure providers such as Amazon,
>> which already offer instances with FPGA resources. Also, we see very
>> rapid advancements from the hardware technology side, where soon enough
>> accelerators can (cache-coherently) be attached to host memory (for
>> example in OpenCAPI), allowing accelerators to work in the same virtual
>> address space as the host process.
>>
>> We believe that a somewhat standardized format for data in-memory like
>> Arrow can help us generalize big data processing in FPGAs tremendously.
>> At the same time, it is known to us that FPGAs are notorious for their
>> high development time and low programmability. Therefore, to alleviate
>> some of these burdens put upon an accelerator developer, we are building
>> a generalized framework around Arrow that abstracts away a very
>> cumbersome aspect of FPGA design: interfacing with the data.
>>
>> The framework takes Arrow Schemas as input, and generates a layer that
>> on one side interfaces with whatever the host platform provides to
>> access host memory (our initial framework will target support for AXI
>> and OpenCAPI), and on the other side interfaces with the user kernel.
>>
>> The user can express requests for access to the data in terms of row
>> index ranges. The generated layer will then provide data streams to the
>> user, which the user may read using some kernel that they designed using
>> high-level synthesis (for example they could write the kernel in
>> OpenCL). Thus, they no longer need to go into the specifics of the Arrow
>> in-memory format, bother with creating hardware constructs to deal with
>> index buffers and validity buffers, interface with the host-side bus,
>> implement FIFOs, etc. Hopefully this will be beneficial to faster
>> deployment of FPGA accelerated applications based on data represented in
>> the Arrow format.
>>
>> Currently the framework supports schemas of primitive data types,
>> (nested) lists and structs. The major challenge here was to be able to
>> generate hardware structures from the many forms of schemas that users
>> may provide, but these challenges have been solved. We are in the
>> process of testing the framework in simulation, and will soon move to a
>> test on real FPGA systems. With a bit of luck we hope to initially
>> release our framework in January.
>>
>> We will fully open-source this framework and will attempt to make it as
>> vendor independent as possible. Initially we hope to provide some
>> example applications that demonstrate some of the benefits of using our
>> framework in terms of productivity and the benefits of using FPGAs for
>> specific problems in big data in general.
>>
>> We are reaching out for your comments, questions, suggestions, etc.
>> Please give us your thoughts about this. Thank you in advance.
>>
>> With kind regards,
>>
>> Johan Peltenburg
>> Computer Engineering Lab
>> Delft University of Technology
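
For readers unfamiliar with the index (offsets) and validity buffers mentioned in the thread, here is a minimal sketch in plain Python of how an Arrow-style string column is laid out and how a consumer could read a row index range from it. This is only an illustration under stated assumptions: the function names `build_string_column` and `read_rows` are hypothetical, buffer padding and alignment are omitted, and none of this is Fletcher's API or generated hardware interface.

```python
import struct


def build_string_column(values):
    """Encode a list of str/None into validity, offsets and data buffers.

    Hypothetical helper for illustration; padding and alignment are omitted.
    """
    validity = bytearray((len(values) + 7) // 8)  # one bit per row
    offsets = [0]
    data = bytearray()
    for i, v in enumerate(values):
        if v is not None:
            validity[i // 8] |= 1 << (i % 8)  # least-significant-bit numbering
            data += v.encode("utf-8")
        offsets.append(len(data))  # a null row repeats the previous offset
    # Offsets are little-endian signed 32-bit integers: row count + 1 entries.
    offsets_buf = struct.pack("<%di" % len(offsets), *offsets)
    return bytes(validity), offsets_buf, bytes(data)


def read_rows(validity, offsets_buf, data, first, last):
    """Yield the rows in [first, last), None for null rows."""
    for i in range(first, last):
        if not (validity[i // 8] >> (i % 8)) & 1:
            yield None
            continue
        # Row i spans data[offsets[i]:offsets[i + 1]].
        start, end = struct.unpack_from("<2i", offsets_buf, 4 * i)
        yield data[start:end].decode("utf-8")


validity, offsets_buf, data = build_string_column(["fpga", None, "arrow", ""])
rows = list(read_rows(validity, offsets_buf, data, 1, 4))
```

The bookkeeping in `read_rows` (checking a validity bit, fetching two adjacent offsets, slicing the data buffer) is the kind of work the thread describes the generated layer taking off the kernel developer's hands.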
