On Wed, Aug 9, 2017 at 11:26 PM, Dylan Hutchison <[email protected]> wrote:
> It could be useful to cite some related work. There is a project called
> Weld
> <https://blog.acolyer.org/2017/01/16/weld-a-common-runtime-for-high-performance-data-analytics/>
> (source code <https://github.com/weld-project/weld>) in development at
> Databricks. They aim to create a common runtime underneath a bunch of
> python libraries (NumPy, Tensorflow, Spark, Pandas, etc). When these
> different libraries use a common runtime (e.g., common data structures),
> there is no need to copy data between them. Weld goes a step further in
> the ability to optimize expressions across these libraries, rather than
> just data transfer.
>
> I think there are many projects that aim to provide "common ground" between
> a set of related libraries. Intel's Nervana Graph
> <https://www.intelnervana.com/intel-nervana-graph-preview-release/> (source
> code <https://github.com/NervanaSystems/ngraph>) is another that comes to
> mind for ML frameworks. The difference of FluoBytes is that it is
> low-level and well-scoped, making it modular and easy to use in any Java
It's currently well-scoped; I have been wondering how it can be kept
that way going forward...

> project. FluoBytes does not aim to boil the ocean by optimizing all kinds
> of expressions on bytes; it only relates to the storage and transfer of
> bytes.
>
> On Wed, Aug 9, 2017 at 11:54 AM, Christopher <[email protected]> wrote:
>
>> Apparently, you can't fork and create pull requests until the repo is
>> non-empty. So, I pushed a commit with just the LICENSE and NOTICE files.
>> Feel free to sanity check those. Everything else I'll put in a proper PR
>> for review.
>>
>> On Wed, Aug 9, 2017 at 10:12 AM Keith Turner <[email protected]> wrote:
>>
>> > On Tue, Aug 8, 2017 at 7:50 PM, Christopher <[email protected]> wrote:
>> > > On Tue, Aug 8, 2017 at 1:33 PM Keith Turner <[email protected]> wrote:
>> > >
>> > >> On Fri, Aug 4, 2017 at 4:42 PM, Christopher <[email protected]> wrote:
>> > >> > Fluoers,
>> > >> >
>> > >> > I created a fluo-bytes repository in GitBox[1], so we can try to
>> > >> > create a
>> > >>
>> > >> This is great. I will take a stab at putting together the initial PR
>> > >> for the repo unless someone else was interested.
>> > >>
>> > >>
>> > > It was my intention to put some effort into this this week, but I don't
>> > > mind collaborating. I just don't want to be stuck doing only reviews. :)
>> >
>> > I'll wait for your initial PR to the repo. Let me know if you want
>> > help with anything before then.
>> >
>> > >
>> > >
>> > >
>> > >> > dependency-free, standalone implementation of the basic Bytes
>> > >> > features we need, based on Keith's observations in his blog post[2].
>> > >>
>> > >> At some point we need to circle back to the openjdk discuss list and
>> > >> let them know we are working on this. There were a few people there
>> > >> who expressed interest in a project like this. Maybe we can do that
>> > >> after we get the basic readme and initial import up. Reading this post
>> > >> made me think of the readme a lot.
>> > >>
>> > >>
>> > > +1; do you have a link to that discuss thread? I'm not familiar with
>> > > this, and was not a participant.
>> >
>> > http://mail.openjdk.java.net/pipermail/discuss/2016-November/004062.html
>> >
>> > >
>> > >
>> > >> >
>> > >> > Over the next few weeks, I'd like to try to start using it to
>> > >> > create a small library of the following:
>> > >> >
>> > >> > * A ByteSequence interface (analogous to CharSequence)
>> > >> > * BytesBuilder (analogous to StringBuilder)
>> > >> > * an immutable Bytes implementation of ByteSequence (analogous to
>> > >> >   String)
>> > >> >
>> > >> > Maybe later, we can add useful InputStream and OutputStream
>> > >> > implementations and other useful tools, but it should always be a
>> > >> > small library with a narrow focus on manipulating byte sequences.
>> > >> >
>> > >> > The idea is that this will be semver, but will very strongly prefer
>> > >> > to avoid ever going to a breaking 2.0 change, instead ensuring it
>> > >> > will be backwards compatible for a *LONG* time, making it safe for
>> > >> > use in other projects' APIs.
>> > >>
>> > >> When creating the readme, it would be good to try to explain the
>> > >> rationale for avoiding dropping methods. I attempted that in my blog
>> > >> post, but not sure how well I did at getting the point across. I
>> > >> think it would be best shown with an example that shows how it can be
>> > >> hard to use two projects where one uses newer methods and another uses
>> > >> older dropped methods. Couple that with both projects having the
>> > >> library in their API and it's a huge headache for any users of both
>> > >> libraries.
>> > >>
>> > >>
>> > > +1
>> > >
>> > >
>> > >> >
>> > >> > I think this library would be useful not only for Fluo's API, but
>> > >> > as a separate dependency-free library, it could be easily reused by
>> > >> > many other
>> > >>
>> > >> We will also need to explain why dependencies are so important.
>> > >> If there were dependencies they would also need to follow very strict
>> > >> API guarantees. Having Java standard libs as the only dep is good
>> > >> because Java itself is very rigorous about its API.
>> > >>
>> > >> Another thing that will need to be explained well in the readme is the
>> > >> benefit of multiple APIs using the same immutable type: it avoids
>> > >> copies when going between APIs.
>> > >>
>> > >>
>> > > +1; it sounds like you've already got the README half written ;)
>> >
>> > An extremely rough outline is half written.
>> >
>> > >
>> > >
>> > >> > projects, such as Accumulo (and anybody else).
>> > >> >
>> > >> > [1]: https://github.com/apache/fluo-bytes
>> > >> > [2]: https://fluo.apache.org/blog/2016/11/10/immutable-bytes/
>> > >>
>> > >>
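
For readers following the design quoted above, the three proposed types (a
ByteSequence interface, an immutable Bytes, and a BytesBuilder) could fit
together roughly as follows. This is only a minimal sketch: every method
name and signature below is an assumption made for illustration, not the
fluo-bytes API, which had not been written at the time of this thread. It
depends only on the Java standard library, in line with the
dependency-free goal discussed in the thread.

```java
// Hypothetical sketch: none of these signatures come from the fluo-bytes repository.

/** Read-only view of a sequence of bytes, analogous to CharSequence. */
interface ByteSequence {
    int length();
    byte byteAt(int index);
    ByteSequence subSequence(int start, int end);
}

/** Immutable implementation of ByteSequence, analogous to String. */
final class Bytes implements ByteSequence {
    private final byte[] data;   // never modified after construction
    private final int offset;
    private final int length;

    private Bytes(byte[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    /** Copies the caller's array once, so later changes to it cannot leak in. */
    static Bytes of(byte[] source) {
        return copyOf(source, 0, source.length);
    }

    static Bytes copyOf(byte[] source, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > source.length) {
            throw new IndexOutOfBoundsException("offset: " + offset + ", length: " + length);
        }
        byte[] copy = java.util.Arrays.copyOfRange(source, offset, offset + length);
        return new Bytes(copy, 0, length);
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public byte byteAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return data[offset + index];
    }

    /** Shares the backing array instead of copying; safe because it is never mutated. */
    @Override
    public Bytes subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException("start: " + start + ", end: " + end);
        }
        return new Bytes(data, offset + start, end - start);
    }

    /** Returns a fresh copy so callers cannot reach the internal array. */
    byte[] toArray() {
        return java.util.Arrays.copyOfRange(data, offset, offset + length);
    }
}

/** Mutable builder that produces Bytes, analogous to StringBuilder. */
final class BytesBuilder {
    private byte[] buffer = new byte[16];
    private int size = 0;

    BytesBuilder append(byte b) {
        ensureCapacity(size + 1);
        buffer[size++] = b;
        return this;
    }

    BytesBuilder append(ByteSequence seq) {
        ensureCapacity(size + seq.length());
        for (int i = 0; i < seq.length(); i++) {
            buffer[size++] = seq.byteAt(i);
        }
        return this;
    }

    /** Snapshots the current contents; the builder can keep changing afterwards. */
    Bytes toBytes() {
        return Bytes.copyOf(buffer, 0, size);
    }

    private void ensureCapacity(int min) {
        if (min > buffer.length) {
            buffer = java.util.Arrays.copyOf(buffer, Math.max(min, buffer.length * 2));
        }
    }
}
```

In this sketch, Bytes.of(...) copies the caller's array once, and
subSequence(...) can then share the backing array because nothing can
mutate it. That is the "avoids copies when going between APIs" benefit
mentioned in the thread: two libraries that both accept the same immutable
type can hand a value to each other without defensive copying.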
