Re: About fluo-bytes

Dylan Hutchison Wed, 09 Aug 2017 20:27:22 -0700

It could be useful to cite some related work.  There is a project called
Weld
<https://blog.acolyer.org/2017/01/16/weld-a-common-runtime-for-high-performance-data-analytics/>
(source code <https://github.com/weld-project/weld>) in development at
Databricks.  It aims to provide a common runtime underneath a number of
Python libraries (NumPy, TensorFlow, Spark, Pandas, etc.).  When these
libraries share a common runtime (e.g., common data structures),
there is no need to copy data between them.  Weld goes a step beyond
data transfer: it can also optimize expressions across these
libraries.


I think there are many projects that aim to provide "common ground" between
a set of related libraries.  Intel's Nervana Graph
<https://www.intelnervana.com/intel-nervana-graph-preview-release/> (source
code <https://github.com/NervanaSystems/ngraph>) is another that comes to
mind, for ML frameworks.  FluoBytes differs in that it is
low-level and well-scoped, which makes it modular and easy to use in any Java
project.  FluoBytes does not aim to boil the ocean by optimizing all kinds
of expressions on bytes; it is concerned only with the storage and transfer of
bytes.
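
To make the "storage and transfer" scope concrete, here is a minimal Java
sketch of two APIs sharing one immutable type, so that a value can pass
between them without a copy.  ByteSequence and Bytes are the names proposed
in the thread below; everything else (Bytes.of, StoreApi, TransferApi,
checksum) is hypothetical and only for illustration, not the actual
fluo-bytes API.

```java
import java.util.Arrays;

public class BytesSketch {

  /** Read-only view of a byte sequence, analogous to CharSequence. */
  public interface ByteSequence {
    int length();
    byte byteAt(int index);
  }

  /** Immutable implementation, analogous to String (hypothetical shape). */
  public static final class Bytes implements ByteSequence {
    private final byte[] data;

    private Bytes(byte[] data) {
      this.data = data;
    }

    /** Copies once on the way in, so later writes to src cannot leak through. */
    public static Bytes of(byte[] src) {
      return new Bytes(Arrays.copyOf(src, src.length));
    }

    @Override
    public int length() {
      return data.length;
    }

    @Override
    public byte byteAt(int index) {
      return data[index];
    }
  }

  /** Stand-in for one library's API that stores a value (hypothetical). */
  public static final class StoreApi {
    private Bytes last;

    public void put(Bytes row) {
      last = row; // no defensive copy needed: Bytes cannot change
    }

    public Bytes get() {
      return last; // safe to hand out the same instance
    }
  }

  /** Stand-in for a second library's API that reads a value (hypothetical). */
  public static final class TransferApi {
    public int checksum(ByteSequence bs) {
      int sum = 0;
      for (int i = 0; i < bs.length(); i++) {
        sum += bs.byteAt(i) & 0xff;
      }
      return sum;
    }
  }

  public static void main(String[] args) {
    byte[] raw = {1, 2, 3};
    Bytes row = Bytes.of(raw);
    raw[0] = 42; // mutating the source array does not affect row

    StoreApi store = new StoreApi();
    store.put(row);

    // The same instance flows from one API to the other without copying.
    System.out.println(store.get() == row);
    System.out.println(new TransferApi().checksum(store.get()));
  }
}
```

The only copy happens when the mutable array enters Bytes.of; after that,
both APIs can hold and return the same instance safely.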

On Wed, Aug 9, 2017 at 11:54 AM, Christopher <[email protected]> wrote:

> Apparently, you can't fork and create pull requests until the repo is
> non-empty. So, I pushed a commit with just the LICENSE and NOTICE files.
> Feel free to sanity check those. Everything else I'll put in a proper PR
> for review.
>
> On Wed, Aug 9, 2017 at 10:12 AM Keith Turner <[email protected]> wrote:
>
> > On Tue, Aug 8, 2017 at 7:50 PM, Christopher <[email protected]> wrote:
> > > On Tue, Aug 8, 2017 at 1:33 PM Keith Turner <[email protected]> wrote:
> > >
> > >> On Fri, Aug 4, 2017 at 4:42 PM, Christopher <[email protected]> wrote:
> > >> > Fluoers,
> > >> >
> > >> > I created a fluo-bytes repository in GitBox[1], so we can try to create a
> > >>
> > >> This is great. I will take a stab at putting together the initial PR
> > >> for the repo unless someone else was interested.
> > >>
> > >>
> > > It was my intention to put some effort into this this week, but I don't
> > > mind collaborating. I just don't want to be stuck doing only reviews. :)
> >
> > I'll wait for your initial PR to the repo.  Let me know if you want
> > help with anything before then.
> >
> > >
> > >
> > >
> > >> > dependency-free, standalone implementation of the basic Bytes features we
> > >> > need, based on Keith's observations in his blog post[2].
> > >>
> > >> At some point we need to circle back to the openjdk discuss list and
> > >> let them know we are working on this.  There were a few people there
> > >> who expressed interest in a project like this.  Maybe we can do that
> > >> after we get the basic readme and initial import up. Reading this post
> > >> made me think of the readme a lot.
> > >>
> > >>
> > > +1; do you have a link to that discuss thread? I'm not familiar with this,
> > > and was not a participant.
> >
> > http://mail.openjdk.java.net/pipermail/discuss/2016-November/004062.html
> >
> > >
> > >
> > >> >
> > >> > Over the next few weeks, I'd like to try to start using it to create a
> > >> > small library of the following:
> > >> >
> > >> > * A ByteSequence interface (analogous to CharSequence)
> > >> > * BytesBuilder (analogous to StringBuilder)
> > >> > * an immutable Bytes implementation of ByteSequence (analogous to String)
> > >> >
> > >> > Maybe later, we can add useful InputStream and OutputStream implementations
> > >> > and other useful tools, but it should always be a small library with a
> > >> > narrow focus on manipulating byte sequences.
> > >> >
> > >> > The idea is that this will follow semver, but we will very strongly prefer to avoid
> > >> > ever going to a breaking 2.0 change, instead ensuring it will be backwards
> > >> > compatible for a *LONG* time, making it safe for use in other projects'
> > >> > APIs.
> > >>
> > >> When creating the readme, it would be good to try to explain the
> > >> rationale for avoiding dropping methods.  I attempted that in my blog
> > >> post, but I'm not sure how well I did at getting the point across.  I
> > >> think it would be best shown with an example of how hard it can be
> > >> to use two projects where one uses newer methods and the other uses
> > >> older, dropped methods.  Couple that with both projects having the
> > >> library in their API and it's a huge headache for any users of both
> > >> libraries.
> > >>
> > >>
> > > +1
> > >
> > >
> > >> >
> > >> > I think this library would be useful not only for Fluo's API, but as a
> > >> > separate dependency-free library, it could be easily reused by many other
> > >>
> > >> We will also need to explain why avoiding dependencies is so important.  If
> > >> there were dependencies, they would also need to follow very strict API
> > >> guarantees.  Having the Java standard libs as the only dep is good because
> > >> Java itself is very rigorous about its API.
> > >>
> > >> Another thing that will need to be explained well in the readme is the
> > >> benefit of multiple APIs using the same immutable type: it avoids
> > >> copies when going between APIs.
> > >>
> > >>
> > > +1; it sounds like you've already got the README half written ;)
> >
> > An extremely rough outline is half written.
> >
> > >
> > >
> > >> > projects, such as Accumulo (and anybody else).
> > >> >
> > >> > [1]: https://github.com/apache/fluo-bytes
> > >> > [2]: https://fluo.apache.org/blog/2016/11/10/immutable-bytes/
> > >>
> >
>
