It could be useful to cite some related work. There is a project called Weld <https://blog.acolyer.org/2017/01/16/weld-a-common-runtime-for-high-performance-data-analytics/> (source code <https://github.com/weld-project/weld>) in development at Databricks. It aims to provide a common runtime underneath a number of Python libraries (NumPy, TensorFlow, Spark, Pandas, etc.). When these libraries share a common runtime (e.g., common data structures), there is no need to copy data between them. Weld goes a step further than avoiding data transfer: it can also optimize expressions across these libraries.
I think there are many projects that aim to provide "common ground" between a set of related libraries. Intel's Nervana Graph <https://www.intelnervana.com/intel-nervana-graph-preview-release/> (source code <https://github.com/NervanaSystems/ngraph>) is another that comes to mind, for ML frameworks. What sets FluoBytes apart is that it is low-level and well-scoped, which makes it modular and easy to use in any Java project. FluoBytes does not try to boil the ocean by optimizing all kinds of expressions on bytes; it is concerned only with the storage and transfer of bytes.

On Wed, Aug 9, 2017 at 11:54 AM, Christopher <[email protected]> wrote:
> Apparently, you can't fork and create pull requests until the repo is
> non-empty. So, I pushed a commit with just the LICENSE and NOTICE files.
> Feel free to sanity check those. Everything else I'll put in a proper PR
> for review.
>
> On Wed, Aug 9, 2017 at 10:12 AM Keith Turner <[email protected]> wrote:
>
> > On Tue, Aug 8, 2017 at 7:50 PM, Christopher <[email protected]> wrote:
> > > On Tue, Aug 8, 2017 at 1:33 PM Keith Turner <[email protected]> wrote:
> > >
> > >> On Fri, Aug 4, 2017 at 4:42 PM, Christopher <[email protected]> wrote:
> > >> > Fluoers,
> > >> >
> > >> > I created a fluo-bytes repository in GitBox[1], so we can try to create a
> > >>
> > >> This is great. I will take a stab at putting together the initial PR
> > >> for the repo unless someone else was interested.
> > >>
> > >>
> > > It was my intention to put some effort into this this week, but I don't
> > > mind collaborating. I just don't want to be stuck doing only reviews. :)
> >
> > I'll wait for your initial PR to the repo. Let me know if you want
> > help with anything before then.
> >
> > >
> > >> > dependency-free, standalone implementation of the basic Bytes features we
> > >> > need, based on Keith's observations in his blog post[2].
> > >>
> > >> At some point we need to circle back to the openjdk discuss list and
> > >> let them know we are working on this. There were a few people there
> > >> who expressed interest in a project like this. Maybe we can do that
> > >> after we get the basic readme and initial import up. Reading this post
> > >> made me think of the readme a lot.
> > >>
> > >>
> > > +1; do you have a link to that discuss thread? I'm not familiar with this,
> > > and was not a participant.
> >
> > http://mail.openjdk.java.net/pipermail/discuss/2016-November/004062.html
> >
> > >
> > >> >
> > >> > Over the next few weeks, I'd like to try to start using it to create a
> > >> > small library of the following:
> > >> >
> > >> > * A ByteSequence interface (analogous to CharSequence)
> > >> > * BytesBuilder (analogous to StringBuilder)
> > >> > * an immutable Bytes implementation of ByteSequence (analogous to String)
> > >> >
> > >> > Maybe later, we can add useful InputStream and OutputStream implementations
> > >> > and other useful tools, but it should always be a small library with a
> > >> > narrow focus on manipulating byte sequences.
> > >> >
> > >> > The idea is that this will be semver, but will very strongly prefer to avoid
> > >> > ever going to a breaking 2.0 change, instead ensuring it will be backwards
> > >> > compatible for a *LONG* time, making it safe for use in other projects'
> > >> > APIs.
> > >>
> > >> When creating the readme, it would be good to try to explain the
> > >> rationale for avoiding dropping methods. I attempted that in my blog
> > >> post, but not sure how well I did at getting the point across. I
> > >> think it would be best shown with an example that shows how it can be
> > >> hard to use two projects where one uses newer methods and another uses
> > >> older dropped methods. Couple that with both projects having the
> > >> library in their API and it's a huge headache for any users of both
> > >> libraries.
> > >>
> > >>
> > > +1
> > >
> > >> >
> > >> > I think this library would be useful not only for Fluo's API, but as a
> > >> > separate dependency-free library, it could be easily reused by many other
> > >>
> > >> We will also need to explain why dependencies are so important. If
> > >> there were dependencies they would also need to follow very strict API
> > >> guarantees. Having Java standard libs as the only dep is good because
> > >> Java itself is very rigorous about its API.
> > >>
> > >> Another thing that will need to be explained well in the readme is the
> > >> benefit of multiple APIs using the same immutable type: it avoids
> > >> copies when going between APIs.
> > >>
> > >>
> > > +1; it sounds like you've already got the README half written ;)
> >
> > An extremely rough outline is half written.
> >
> > >
> > >> > projects, such as Accumulo (and anybody else).
> > >> >
> > >> > [1]: https://github.com/apache/fluo-bytes
> > >> > [2]: https://fluo.apache.org/blog/2016/11/10/immutable-bytes/
> > >> >
> > >
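
To make the copy-avoidance point from the quoted thread a bit more concrete, here is a rough Java sketch of the three types listed there (ByteSequence, Bytes, BytesBuilder). Only the three type names come from the thread; every method name and signature below is my own assumption for illustration and is not the fluo-bytes API, which does not exist yet. It depends only on the Java standard library, in keeping with the dependency-free goal.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// All method names below are hypothetical illustrations, not the actual fluo-bytes API.

/** Read-only view of a run of bytes, analogous to CharSequence. */
interface ByteSequence {
  int length();
  byte byteAt(int index);
  ByteSequence subSequence(int start, int end);
}

/** Immutable implementation, analogous to String. */
final class Bytes implements ByteSequence {
  private final byte[] data;
  private final int offset;
  private final int length;

  private Bytes(byte[] data, int offset, int length) {
    this.data = data;
    this.offset = offset;
    this.length = length;
  }

  /** Copies the input once so later writes to the caller's array cannot leak in. */
  static Bytes copyOf(byte[] src, int off, int len) {
    return new Bytes(Arrays.copyOfRange(src, off, off + len), 0, len);
  }

  static Bytes of(byte[] src) {
    return copyOf(src, 0, src.length);
  }

  static Bytes of(String s) {
    byte[] utf8 = s.getBytes(StandardCharsets.UTF_8); // fresh array, no second copy needed
    return new Bytes(utf8, 0, utf8.length);
  }

  @Override public int length() {
    return length;
  }

  @Override public byte byteAt(int index) {
    if (index < 0 || index >= length) {
      throw new IndexOutOfBoundsException("index " + index);
    }
    return data[offset + index];
  }

  /** Shares the backing array; safe only because the array is never modified. */
  @Override public Bytes subSequence(int start, int end) {
    if (start < 0 || end > length || start > end) {
      throw new IndexOutOfBoundsException(start + ".." + end);
    }
    return new Bytes(data, offset + start, end - start);
  }

  @Override public String toString() {
    return new String(data, offset, length, StandardCharsets.UTF_8);
  }
}

/** Mutable accumulator, analogous to StringBuilder. */
final class BytesBuilder {
  private byte[] buf = new byte[16];
  private int len = 0;

  private void ensureCapacity(int needed) {
    if (needed > buf.length) {
      buf = Arrays.copyOf(buf, Math.max(needed, buf.length * 2));
    }
  }

  BytesBuilder append(byte b) {
    ensureCapacity(len + 1);
    buf[len++] = b;
    return this;
  }

  BytesBuilder append(ByteSequence seq) {
    ensureCapacity(len + seq.length());
    for (int i = 0; i < seq.length(); i++) {
      buf[len++] = seq.byteAt(i);
    }
    return this;
  }

  /** Snapshots the current contents; the builder can keep changing afterwards. */
  Bytes toBytes() {
    return Bytes.copyOf(buf, 0, len);
  }
}
```

Because a Bytes value can never change after construction, two libraries that both accept Bytes in their APIs can pass the same instance back and forth without defensive copies; with a raw byte[] each side would have to copy to protect itself. Sharing the backing array in subSequence is one possible design choice here, and it trades memory retention for fewer copies.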
