Re: About fluo-bytes

Keith Turner Fri, 11 Aug 2017 10:15:35 -0700

On Wed, Aug 9, 2017 at 11:26 PM, Dylan Hutchison
<[email protected]> wrote:
> It could be useful to cite some related work.  There is a project called
> Weld
> <https://blog.acolyer.org/2017/01/16/weld-a-common-runtime-for-high-performance-data-analytics/>
>  (source code <https://github.com/weld-project/weld>) in development at
> Databricks.  They aim to create a common runtime underneath a bunch of
> Python libraries (NumPy, TensorFlow, Spark, Pandas, etc.).  When these
> different libraries use a common runtime (e.g., common data structures),
> there is no need to copy data between them.  Weld goes a step further in
> the ability to optimize expressions across these libraries, rather than
> just data transfer.
>
> I think there are many projects that aim to provide "common ground" between
> a set of related libraries.  Intel's Nervana Graph
> <https://www.intelnervana.com/intel-nervana-graph-preview-release/> (source
> code <https://github.com/NervanaSystems/ngraph>) is another that comes to
> mind for ML frameworks.  The difference with FluoBytes is that it is
> low-level and well-scoped, making it modular and easy to use in any Java


It's currently well-scoped; I have been wondering how it can be kept
that way going forward...

> project.  FluoBytes does not aim to boil the ocean by optimizing all kinds
> of expressions on bytes; it only relates to the storage and transfer of
> bytes.
>
> On Wed, Aug 9, 2017 at 11:54 AM, Christopher <[email protected]> wrote:
>
>> Apparently, you can't fork and create pull requests until the repo is
>> non-empty. So, I pushed a commit with just the LICENSE and NOTICE files.
>> Feel free to sanity check those. Everything else I'll put in a proper PR
>> for review.
>>
>> On Wed, Aug 9, 2017 at 10:12 AM Keith Turner <[email protected]> wrote:
>>
>> > On Tue, Aug 8, 2017 at 7:50 PM, Christopher <[email protected]> wrote:
>> > > On Tue, Aug 8, 2017 at 1:33 PM Keith Turner <[email protected]> wrote:
>> > >
>> > >> On Fri, Aug 4, 2017 at 4:42 PM, Christopher <[email protected]>
>> > wrote:
>> > >> > Fluoers,
>> > >> >
>> > >> > I created a fluo-bytes repository in GitBox[1], so we can try to
>> > create a
>> > >>
>> > >> This is great. I will take a stab at putting together the initial PR
>> > >> for the repo unless someone else was interested.
>> > >>
>> > >>
>> > > It was my intention to put some effort into this this week, but I don't
>> > > mind collaborating. I just don't want to be stuck doing only reviews.
>> :)
>> >
>> > I'll wait for your initial PR to the repo.  Let me know if you want
>> > help with anything before then.
>> >
>> > >
>> > >
>> > >
>> > >> > dependency-free, standalone implementation of the basic Bytes
>> > features we
>> > >> > need, based on Keith's observations in his blog post[2].
>> > >>
>> > >> At some point we need to circle back to the openjdk discuss list and
>> > >> let them know we are working on this.  There were a few people there
>> > >> who expressed interest in a project like this.  Maybe we can do that
>> > >> after we get the basic readme and initial import up. Reading this post
>> > >> made me think of the readme a lot.
>> > >>
>> > >>
>> > > +1; do you have a link to that discuss thread? I'm not familiar with
>> > this,
>> > > and was not a participant.
>> >
>> > http://mail.openjdk.java.net/pipermail/discuss/2016-November/004062.html
>> >
>> > >
>> > >
>> > >> >
>> > >> > Over the next few weeks, I'd like to try to start using it to
>> create a
>> > >> > small library of the following:
>> > >> >
>> > >> > * A ByteSequence interface (analogous to CharSequence)
>> > >> > * BytesBuilder (analogous to StringBuilder)
>> > >> > * an immutable Bytes implementation of ByteSequence (analogous to
>> > String)
>> > >> >
>> > >> > Maybe later, we can add useful InputStream and OutputStream
>> > >> implementations
>> > >> > and other useful tools, but it should always be a small library
>> with a
>> > >> > narrow focus on manipulating byte sequences.
>> > >> >
>> > >> > The idea is that this will be semver, but will very strongly prefer to
>> > >> avoid
>> > >> > ever going to a breaking 2.0 change, instead ensuring it will be
>> > >> backwards
>> > >> > compatible for a *LONG* time, making it safe for use in other
>> > projects'
>> > >> > APIs.
>> > >>
>> > >> When creating the readme, it would be good to try to explain the
>> > >> rationale for avoiding dropping methods.  I attempted that in my blog
>> > >> post, but not sure how well I did at getting the point across.  I
>> > >> think it would be best shown with an example that shows how it can be
>> > >> hard to use two projects where one uses newer methods and another uses
>> > >> older dropped methods.  Couple that with both projects having the
>> > >> library in their API and it's a huge headache for any users of both
>> > >> libraries.
>> > >>
>> > >>
>> > > +1
>> > >
>> > >
>> > >> >
>> > >> > I think this library would be useful not only for Fluo's API, but
>> as a
>> > >> > separate dependency-free library, it could be easily reused by many
>> > other
>> > >>
>> > >> We will also need to explain why dependencies are so important.  If
>> > >> there were dependencies, they would also need to follow very strict API
>> > >> guarantees.  Having Java standard libs as the only dep is good because
>> > >> Java itself is very rigorous about its API.
>> > >>
>> > >> Another thing that will need to be explained well in the readme is the
>> > >> benefit of multiple APIs using the same immutable type: it avoids
>> > >> copies when going between APIs.
>> > >>
>> > >>
>> > > +1; it sounds like you've already got the README half written ;)
>> >
>> > An extremely rough outline is half written.
>> >
>> > >
>> > >
>> > >> > projects, such as Accumulo (and anybody else).
>> > >> >
>> > >> > [1]: https://github.com/apache/fluo-bytes
>> > >> > [2]: https://fluo.apache.org/blog/2016/11/10/immutable-bytes/
>> > >>
>> >
>>
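
For reference, the three types proposed in the quoted thread (a ByteSequence interface, an immutable Bytes, and a BytesBuilder) could look roughly like the sketch below. This is only an illustration that assumes hypothetical method names (of, byteAt, toBytes, and so on). It is not the fluo-bytes API, which had not been written at the time of this thread, and it depends only on the Java standard library.

```java
import java.util.Arrays;

// Hypothetical sketch only: every name below is an assumption,
// not the actual fluo-bytes API.

// Read-only view of a sequence of bytes (analogous to CharSequence).
interface ByteSequence {
  int length();

  byte byteAt(int index);

  ByteSequence subSequence(int start, int end);
}

// Immutable implementation of ByteSequence (analogous to String).
final class Bytes implements ByteSequence {
  private final byte[] data;

  private Bytes(byte[] data) {
    this.data = data;
  }

  // Copies the input so later changes to the caller's array are not visible.
  static Bytes of(byte[] src) {
    return of(src, 0, src.length);
  }

  static Bytes of(byte[] src, int offset, int length) {
    return new Bytes(Arrays.copyOfRange(src, offset, offset + length));
  }

  @Override
  public int length() {
    return data.length;
  }

  @Override
  public byte byteAt(int index) {
    return data[index];
  }

  @Override
  public Bytes subSequence(int start, int end) {
    return new Bytes(Arrays.copyOfRange(data, start, end));
  }

  // Returns a copy so the internal array is never exposed.
  byte[] toArray() {
    return Arrays.copyOf(data, data.length);
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof Bytes && Arrays.equals(data, ((Bytes) o).data);
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(data);
  }
}

// Mutable accumulator that produces Bytes (analogous to StringBuilder).
final class BytesBuilder {
  private byte[] buf = new byte[16];
  private int len = 0;

  private void ensureCapacity(int needed) {
    if (needed > buf.length) {
      buf = Arrays.copyOf(buf, Math.max(needed, buf.length * 2));
    }
  }

  BytesBuilder append(byte b) {
    ensureCapacity(len + 1);
    buf[len++] = b;
    return this;
  }

  BytesBuilder append(ByteSequence seq) {
    ensureCapacity(len + seq.length());
    for (int i = 0; i < seq.length(); i++) {
      buf[len++] = seq.byteAt(i);
    }
    return this;
  }

  // Snapshot of the current contents; the builder can keep being used.
  Bytes toBytes() {
    return Bytes.of(buf, 0, len);
  }
}
```

Because a Bytes instance in this sketch can never change after construction, two APIs that both accept Bytes could hand the same instance back and forth without defensive copies, which is the benefit mentioned in the quoted text.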
