Re: [DISCUSS] using protobuf than thrift

2017-01-05 Thread Khurrum Nasim
one question - bk is using protobuf 2.x while gRPC is using 3.x. IMO, they
are not backward compatible. Are you also considering moving bk's protobuf
to 3.x?

On Thu, Jan 5, 2017 at 10:29 PM, Gerrit Sundaram 
wrote:

> Hello all,
>
> for the comment in
> https://github.com/apache/incubator-distributedlog/pull/99, I am starting
> this email thread for discussing using protobuf to store metadata for ease
> extension.
>
> I have a few reasons for using protobuf rather than using thrift:
>
> - bookkeeper is using protobuf for storing metadata. so there is no extra
> dependency.   and it will make things consistent.
> - the thrift version that DL is using now is 0.5.0-1, which is an
> out-of-date thrift version and seems to be a special version that Twitter
> customized for finagle. it makes me impossible to build a c++ client to
> access DL.
> - using protobuf, I can easily write a gRPC request handler for current
> proxy service to support c++.
>
> Any thoughts?
>
> - Sijie
>


Re: vote process for proposals?

2017-01-05 Thread Xi Liu
I think there is not a lot of activity on proposals. A 'lazy approval'
might be just good enough - a proposal with lazy approval is implicitly
allowed/accepted unless a -1 vote is received. That's probably the best
for now. What do you think?

On Wed, Jan 4, 2017 at 12:50 AM, Sijie Guo  wrote:

> Ping?
>
> Xi, Jon, any updates about this? Do any of you want to drive this?
>
> - Sijie
>
>
>
> On Thu, Dec 15, 2016 at 11:18 AM, Sijie Guo 
> wrote:
>
> > Xi, Jon, are any of you interested in making a draft about the
> > proposal workflow?
> >
> > On Wed, Dec 14, 2016 at 6:14 PM, Jon Derrick <
> jonathan.derri...@gmail.com>
> > wrote:
> >
> > > I think it really worth having a voting proposal, as sometime I might
> > lose
> > > track of if a proposal is accepted or not and whether it is under
> > > development.
> > >
> > > Beam's process looks promising. You can try to start with that.
> > >
> > > Another suggestion is it would be awesome if the DL jira queue can have
> > new
> > > type, called 'Proposal'. Then we can enforce the proposal workflow in
> the
> > > jira.
> > >
> >
> > I think it is possible to ask INFRA team to create a new jira
> type/workflow
> > for us, if we can come up with more details. Can you tell us more about
> > your thoughts?
> >
> > - Sijie
> >
> >
> > >
> > > On Tue, Dec 13, 2016 at 1:10 AM, Xi Liu  wrote:
> > >
> > > > Thank you Sijie. I feel it is good to have a voting process, so that
> it
> > > > would be good to track if a proposal is accepted for developing or
> > > > discarded due to any reasons. I will start with my proposal and see
> how
> > > it
> > > > is going with the community.
> > > >
> > > > - Xi
> > > >
> > > > On Thu, Dec 8, 2016 at 9:11 PM, Sijie Guo  wrote:
> > > >
> > > > > > Xi, thank you for raising this up. I don't think we have a formal
> > > > > > process for tracking proposals. I think we can learn about proposals
> > > > > > from other apache projects. For example, beam has very nice
> > > > > > documentation in its contribution guide
> > > > > > (http://beam.incubator.apache.org/contribute/contribution-guide/).
> > > > > > We probably can adopt it.
> > > > > >
> > > > > > I don't feel strongly about the voting process. If it makes it easier
> > > > > > to reach a conclusion on the proposal discussion, let's vote on any
> > > > > > discussed proposal.
> > > > >
> > > > > - Sijie
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Dec 8, 2016 at 9:10 AM, Xi Liu 
> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > It is great that we have a process to track/discuss proposals.
> but
> > > the
> > > > > > process is still a bit unclear to me. do we need a vote phase to
> > > adopt
> > > > > the
> > > > > > proposals? and shall we document the process in wiki page?
> > > > > >
> > > > > > my basic understand about the process is:
> > > > > >
> > > > > > - create a proposal wiki page to describe the proposal
> > > > > > - start the '[discussion]' email thread for the proposal
> > > > > > - conversation will happen in the '[discussion]' email thread and
> > the
> > > > > wiki
> > > > > > page will be refined
> > > > > >
> > > > > > I feel there will be a phase to decide whether this proposal will
> > be
> > > > > > accepted or discarded and update the state of the proposals.
> shall
> > I
> > > > vote
> > > > > > DP-2?
> > > > > >
> > > > > > - Xi
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > - jderrick
> > >
> >
>


Re: [Review] The first release of Apache DistributedLog

2017-01-05 Thread Xi Liu
It would be great to include any performance numbers.

On Thu, Jan 5, 2017 at 5:51 PM, Sijie Guo  wrote:

> Cool to see you here, Enrico. And thank you for your suggestion.
>
> I will try to write a separate one for DL and BK. Try to keep this one
> focus on a short release post.
>
> - Sijie
>
> On Thu, Jan 5, 2017 at 1:43 PM, Enrico Olivelli 
> wrote:
>
> > Hi Sijie,
> > I am following this release and the great work DL comunity is doing.
> > Maybe it would be worth to write some paragraph about the difference from
> > BookKeeper and/or the relation with it.
> >
> >
> > Enrico
> >
> > On Thu, 5 Jan 2017, 19:13 Asko Kauppi wrote:
> >
> > > Hi Sijie,
> > >
> > > most readers will likely not know about DistributedLog. A short
> > comparison
> > > - or a link to one - e.g. with Kafka might help set the stage for them.
> > > i.e. why does it exist?
> > >
> > > This is even more important since Uber recently also publicized their
> > > persistent message bus solution. If these start dropping down, there
> > needs
> > > to be more (technical) reason than just another vendor opening their
> > chest.
> > > :)
> > >
> > > Other than that, the structure seemed nice but it can be slightly
> > shorter.
> > >
> > > Just my 2c
> > > - asko
> > >
> > >
> > > On 4 January 2017 at 10:38, Sijie Guo  wrote:
> > >
> > > > I drafted a blog post for announcing the first apache release. Here
> is
> > > the
> > > > draft. Please help review it :D
> > > >
> > > > > https://docs.google.com/document/d/1IXVmP2cHkf4ydeUHUJN9p5ZWTpA1auwBhfqMnYBu4A0/edit
> > > >
> > > > - Sijie
> > > >
> > >
> > --
> >
> >
> > -- Enrico Olivelli
> >
>


Re: [Discuss] Transaction Support

2017-01-05 Thread Xi Liu
Asko and Sijie,

Thank you so much for your feedback.

We are not aiming to build a general XA transaction coordinator. The
feature we want is to be able to write data to multiple log streams
atomically.

I totally agree with you about building minimal logic. We also don't want
to force this feature on all users of DL. Building the TC as a
separate service sounds clear to me. We will follow your suggestion.

I am also replying to your and Leigh's comments on the doc. Hopefully we
can come to an agreement so that our changes can be accepted.

- Xi

On Wed, Jan 4, 2017 at 1:14 AM, Asko Kauppi  wrote:

> > Beside that, I have one general question - What is the major goal for
> this
> > feature? Are you targeting on building a general XA transaction
> coordinator
> > or just for supporting things like `copy-modify-write' style workflow?
>
> The use case I would have for transactions - at some level of the stack -
> is supporting dynamic configurations.
>
> If a config changes in e.g. three lines, some of the changes may logically
> belong together. E.g. changing both “host” and “port” (if separate
> entries). One shouldn’t be able to read a state, even temporarily, that has
> new host but old port.
>
> I can do this in the application level - it does not need to be part of
> the DL protocol.
>
>
> Asko Kauppi
> Zalando Tech Helsinki
>
> > On 4 Jan 2017, at 9.18, Sijie Guo  wrote:
> >
> > Sorry for late response. I think Leigh and you already had some very
> > valuable discussions in the doc. I will try to add some of my questions
> to
> > the discussion.
> >
> > Beside that, I had a discussion with Leigh today about this. first of
> all,
> > I think it is very good to add transaction support in distributedlog. It
> is
> > one of the primitives that would help building distributed service. But
> we
> > have a concern about making this system become complicated and introduce
> > operational overhead when it runs in the large scale system on
> production.
> > There are two major suggestions that I have for this feature -
> >
> > Build the 'minimum' logic in core - I think the minimum logic that need
> to
> > be added to the core is -  the special control records (begin, commit and
> > abort) and make the reader be able to detect those special control
> records
> > and know what do they mean and how to interrupt with them. Since they are
> > special control records, there is not overhead to other readers that
> > doesn't require this feature.
> >
> > Build the transaction coordinator as a separated proxy service  - I think
> > the major concern that we have is putting more complexities into the
> 'write
> > proxy' service. We architected distributedlog in a more microservice-like
> > way - we have the core as the stream store, the proxy for serving write
> and
> > read traffic. It would be good that the transaction feature can be done
> in
> > a similar way. So the architecture would be like this -
> >
> > *[ write service ] [ read service ] [ transaction coordinator ]*
> > *[                        stream store                        ]*
> >
> > if people doesn't need the transaction feature, they can turn if off
> > completely without any operational overhead.
> >
> > Beside that, I have one general question - What is the major goal for
> this
> > feature? Are you targeting on building a general XA transaction
> coordinator
> > or just for supporting things like `copy-modify-write' style workflow?
> >
> >
> > Thanks,
> > Sijie
> >
> >
> >
> >
> >
> > On Wed, Dec 28, 2016 at 1:12 PM, Xi Liu  wrote:
> >
> >> Ping?
> >>
> >> On Mon, Dec 19, 2016 at 8:28 AM, Xi Liu  wrote:
> >>
> >>> Sijie,
> >>>
> >>> No. I thought it might be easier for people to comment on a google doc
> to
> >>> gather the initial feedback. I will put the content back to wiki page
> >> once
> >>> addressing the comments. Does that sound good to you?
> >>>
> >>> And thank you in advance.
> >>>
> >>> - Xi
> >>>
> >>>
> >>>
> >>> On Sun, Dec 18, 2016 at 8:48 AM, Sijie Guo  wrote:
> >>>
> >>>> Hi Xi,
> >>>>
> >>>> sorry for late response. I will review it soon.
> >>>>
> >>>> regarding this, a separate question "are we going to use google doc
> >>>> instead of email thread for any discussion"? I am a bit worried that
> >>>> the discussion will become lost after moving to google doc. No idea
> >>>> on how other apache projects are doing.
> >>>>
> >>>> - Sijie
> >>>>
> >>>> On Wed, Dec 14, 2016 at 11:41 PM, Xi Liu  wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I finalized the first version of the design. This time I used a
> >>>>> google doc so that it is easier for commenting and add a link the
> >>>>> wiki page. I will update this to the wiki page once we come to the
> >>>>> finalized design.
> >>>>>
> >>>>> https://docs.google.com/document/d/14Ns05M8Z5a6DF6fHmWQwISyD5jjeKbSIGgSzXuTI5BA/edit
> >>>>>

Re: [DISCUSS] using protobuf than thrift

2017-01-05 Thread Gerrit Sundaram
Sijie, sorry. I typed your name on the wrong line. I was planning to mention
you, since you raised the comment in the pull request.

On Thu, Jan 5, 2017 at 10:29 PM, Gerrit Sundaram 
wrote:

> Hello all,
>
> for the comment in https://github.com/apache/
> incubator-distributedlog/pull/99, I am starting this email thread for
> discussing using protobuf to store metadata for ease extension.
>
> I have a few reasons for using protobuf rather than using thrift:
>
> - bookkeeper is using protobuf for storing metadata. so there is no extra
> dependency.   and it will make things consistent.
> - the thrift version that DL is using now is 0.5.0-1, which is an
> out-of-date thrift version and seems to be a special version that Twitter
> customized for finagle. it makes me impossible to build a c++ client to
> access DL.
> - using protobuf, I can easily write a gRPC request handler for current
> proxy service to support c++.
>
> Any thoughts?
>
> - Sijie
>


[DISCUSS] using protobuf than thrift

2017-01-05 Thread Gerrit Sundaram
Hello all,

for the comment in
https://github.com/apache/incubator-distributedlog/pull/99, I am starting
this email thread to discuss using protobuf to store metadata for ease of
extension.

I have a few reasons for using protobuf rather than using thrift:

- bookkeeper is using protobuf for storing metadata, so there is no extra
dependency, and it will make things consistent.
- the thrift version that DL is using now is 0.5.0-1, which is an
out-of-date thrift version and seems to be a special version that Twitter
customized for finagle. This makes it impossible for me to build a c++
client to access DL.
- using protobuf, I can easily write a gRPC request handler for the current
proxy service to support c++.

Any thoughts?

- Sijie


[GitHub] incubator-distributedlog issue #99: DL-172: Provide a IDL definition for str...

2017-01-05 Thread gerritsundaram
Github user gerritsundaram commented on the issue:

https://github.com/apache/incubator-distributedlog/pull/99
  
@sijie sorry, my bad. I excluded the generated files from findbugs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [Review] The first release of Apache DistributedLog

2017-01-05 Thread Sijie Guo
Cool to see you here, Enrico. And thank you for your suggestion.

I will try to write a separate one for DL and BK, and keep this one
focused on a short release post.

- Sijie

On Thu, Jan 5, 2017 at 1:43 PM, Enrico Olivelli  wrote:

> Hi Sijie,
> I am following this release and the great work DL comunity is doing.
> Maybe it would be worth to write some paragraph about the difference from
> BookKeeper and/or the relation with it.
>
>
> Enrico
>
> On Thu, 5 Jan 2017, 19:13 Asko Kauppi wrote:
>
> > Hi Sijie,
> >
> > most readers will likely not know about DistributedLog. A short
> comparison
> > - or a link to one - e.g. with Kafka might help set the stage for them.
> > i.e. why does it exist?
> >
> > This is even more important since Uber recently also publicized their
> > persistent message bus solution. If these start dropping down, there
> needs
> > to be more (technical) reason than just another vendor opening their
> chest.
> > :)
> >
> > Other than that, the structure seemed nice but it can be slightly
> shorter.
> >
> > Just my 2c
> > - asko
> >
> >
> > On 4 January 2017 at 10:38, Sijie Guo  wrote:
> >
> > > I drafted a blog post for announcing the first apache release. Here is
> > the
> > > draft. Please help review it :D
> > >
> > > https://docs.google.com/document/d/1IXVmP2cHkf4ydeUHUJN9p5ZWTpA1auwBhfqMnYBu4A0/edit
> > >
> > > - Sijie
> > >
> >
> --
>
>
> -- Enrico Olivelli
>


Re: [Review] The first release of Apache DistributedLog

2017-01-05 Thread Sijie Guo
Thank you for your suggestion, Asko.

I will try to make the announcement post shorter and keep it focused on the
release. I will try to prepare another post for education.

- Sijie


On Thu, Jan 5, 2017 at 10:13 AM, Asko Kauppi  wrote:

> Hi Sijie,
>
> most readers will likely not know about DistributedLog. A short comparison
> - or a link to one - e.g. with Kafka might help set the stage for them.
> i.e. why does it exist?
>
> This is even more important since Uber recently also publicized their
> persistent message bus solution. If these start dropping down, there needs
> to be more (technical) reason than just another vendor opening their chest.
> :)
>
> Other than that, the structure seemed nice but it can be slightly shorter.
>
> Just my 2c
> - asko
>
>
> On 4 January 2017 at 10:38, Sijie Guo  wrote:
>
> > I drafted a blog post for announcing the first apache release. Here is
> the
> > draft. Please help review it :D
> >
> > https://docs.google.com/document/d/1IXVmP2cHkf4ydeUHUJN9p5ZWTpA1auwBhfqMnYBu4A0/edit
> >
> > - Sijie
> >
>


[GitHub] incubator-distributedlog pull request #102: DL-176: Rename the DL artifact f...

2017-01-05 Thread sijie
GitHub user sijie opened a pull request:

https://github.com/apache/incubator-distributedlog/pull/102

DL-176: Rename the DL artifact from com.twitter to org.apache.distributedlog



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sijie/incubator-distributedlog sijie/fix_pom_file_layout

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-distributedlog/pull/102.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #102


commit bcbeaf1bb701256f59206962250b484706398e2d
Author: Sijie Guo 
Date:   2017-01-06T01:43:06Z

DL-176: Rename the DL artifact from com.twitter to org.apache.distributedlog






[GitHub] incubator-distributedlog pull request #101: DL-173: changed FileUtiles.delet...

2017-01-05 Thread adamtracymartin
Github user adamtracymartin closed the pull request at:

https://github.com/apache/incubator-distributedlog/pull/101




Re: [Review] The first release of Apache DistributedLog

2017-01-05 Thread Enrico Olivelli
Hi Sijie,
I am following this release and the great work the DL community is doing.
Maybe it would be worth writing a paragraph about the differences from
BookKeeper and/or the relation to it.


Enrico

On Thu, 5 Jan 2017, 19:13 Asko Kauppi wrote:

> Hi Sijie,
>
> most readers will likely not know about DistributedLog. A short comparison
> - or a link to one - e.g. with Kafka might help set the stage for them.
> i.e. why does it exist?
>
> This is even more important since Uber recently also publicized their
> persistent message bus solution. If these start dropping down, there needs
> to be more (technical) reason than just another vendor opening their chest.
> :)
>
> Other than that, the structure seemed nice but it can be slightly shorter.
>
> Just my 2c
> - asko
>
>
> On 4 January 2017 at 10:38, Sijie Guo  wrote:
>
> > I drafted a blog post for announcing the first apache release. Here is
> the
> > draft. Please help review it :D
> >
> > https://docs.google.com/document/d/1IXVmP2cHkf4ydeUHUJN9p5ZWTpA1auwBhfqMnYBu4A0/edit
> >
> > - Sijie
> >
>
-- 


-- Enrico Olivelli


Re: [Discuss] Transaction Support

2017-01-05 Thread Sijie Guo
Xi, I added more comments. We are looking forward to your reply and seeing
this happen.

- Sijie

On Tue, Jan 3, 2017 at 11:18 PM, Sijie Guo  wrote:

> Sorry for late response. I think Leigh and you already had some very
> valuable discussions in the doc. I will try to add some of my questions to
> the discussion.
>
> Beside that, I had a discussion with Leigh today about this. First of all,
> I think it is very good to add transaction support in distributedlog. It is
> one of the primitives that would help in building distributed services. But
> we are concerned about making the system complicated and introducing
> operational overhead when it runs at large scale in production.
> There are two major suggestions that I have for this feature -
>
> Build the 'minimum' logic in core - I think the minimum logic that needs to
> be added to the core is the special control records (begin, commit and
> abort), and making the reader able to detect those special control records
> and know what they mean and how to handle them. Since they are
> special control records, there is no overhead for other readers that
> don't require this feature.
>
> Build the transaction coordinator as a separate proxy service - I think
> the major concern that we have is putting more complexity into the 'write
> proxy' service. We architected distributedlog in a more microservice-like
> way - we have the core as the stream store, and the proxies for serving
> write and read traffic. It would be good if the transaction feature could
> be done in a similar way. So the architecture would be like this -
>
> *[ write service ] [ read service ] [ transaction coordinator ]*
> *[                        stream store                        ]*
>
> If people don't need the transaction feature, they can turn it off
> completely without any operational overhead.
>
> Beside that, I have one general question - what is the major goal for this
> feature? Are you aiming to build a general XA transaction coordinator
> or just to support things like a `copy-modify-write' style workflow?
>
>
> Thanks,
> Sijie
>
>
>
>
>
> On Wed, Dec 28, 2016 at 1:12 PM, Xi Liu  wrote:
>
>> Ping?
>>
>> On Mon, Dec 19, 2016 at 8:28 AM, Xi Liu  wrote:
>>
>> > Sijie,
>> >
>> > No. I thought it might be easier for people to comment on a google doc
>> to
>> > gather the initial feedback. I will put the content back to wiki page
>> once
>> > addressing the comments. Does that sound good to you?
>> >
>> > And thank you in advance.
>> >
>> > - Xi
>> >
>> >
>> >
>> > On Sun, Dec 18, 2016 at 8:48 AM, Sijie Guo  wrote:
>> >
>> >> Hi Xi,
>> >>
>> >> sorry for late response. I will review it soon.
>> >>
>> >> regarding this, a separate question "are we going to use google doc
>> >> instead
>> >> of email thread for any discussion"? I am a bit worried that the
>> >> discussion
>> >> will become lost after moving to google doc. No idea on how other
>> apache
>> >> projects are doing.
>> >>
>> >> - Sijie
>> >>
>> >> On Wed, Dec 14, 2016 at 11:41 PM, Xi Liu  wrote:
>> >>
>> >> > Hi all,
>> >> >
>> >> > I finalized the first version of the design. This time I used a
>> google
>> >> doc
>> >> > so that it is easier for commenting and add a link the wiki page. I
>> will
>> >> > update this to the wiki page once we come to the finalized design.
>> >> >
>> >> > https://docs.google.com/document/d/14Ns05M8Z5a6DF6fHmWQwISyD5jjeKbSIGgSzXuTI5BA/edit
>> >> >
>> >> > Let me know if you have any questions. Appreciate your reviews!
>> >> >
>> >> > - Xi
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Fri, Oct 28, 2016 at 7:58 AM, Leigh Stewart wrote:
>> >> >
>> >> > > Interesting proposal. A couple quick notes while you continue to
>> flesh
>> >> > this
>> >> > > out.
>> >> > >
>> >> > > a. just to be sure - does this eliminate the need to save seqno
>> with
>> >> > > checkpoint?
>> >> > >
>> >> > > b. i.e. another way to describe this kind of improvement is
>> "support
>> >> > > records (atomic writes) larger than 1MB", iiuc. the advantage
>> being it
>> >> > > avoids the baggage of transactions. disadvantages include inability
>> >> to do
>> >> > > cross stream transactions, and flexibility (interleaving, etc) (are
>> >> there
>> >> > > others?).
>> >> > >
>> >> > > c. proxy use case is for supporting multiple writers - have you
>> >> thought
>> >> > > about how this would work with multiple writers?
>> >> > >
>> >> > > Thanks!
>> >> > >
>> >> > >
>> >> > > On Tue, Oct 18, 2016 at 6:45 PM, Sijie Guo wrote:
>> >> > >
>> >> > > > Sound good to me. look forward to the detailed proposal.
>> >> > > >
>> >> > > > (I don't mind the format if it makes things easier to you)
>> >> > > >
>> >> > > > Sijie
>> >> > > >
>> >> > > > On Friday, October 14, 2016, Xi Liu  wrote:
>> 

Re: [Discuss] Transaction Support

2017-01-05 Thread Sijie Guo
On Wed, Jan 4, 2017 at 1:14 AM, Asko Kauppi  wrote:

> > Beside that, I have one general question - What is the major goal for
> this
> > feature? Are you targeting on building a general XA transaction
> coordinator
> > or just for supporting things like `copy-modify-write' style workflow?
>
> The use case I would have for transactions - at some level of the stack -
> is supporting dynamic configurations.
>
> If a config changes in e.g. three lines, some of the changes may logically
> belong together. E.g. changing both “host” and “port” (if separate
> entries). One shouldn’t be able to read a state, even temporarily, that has
> new host but old port.
>
> I can do this in the application level - it does not need to be part of
> the DL protocol.
>

Yeah, I can see that a 'transaction' as a large atomic write in DL is very
useful. Currently DL limits the record size to 1MB. If people want to write
a record larger than 1MB, the application has to break it into multiple
records, which can potentially produce a `partial` write.

Your use case falls into this category.

I think the minimal support is a large atomic write. That is good enough for
most of the log use cases.

Having a separate TC (transaction coordinator) is cool, but it can be an
opt-in solution.
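
To make the "large atomic write" idea concrete, here is a minimal sketch in Python of how begin/commit/abort control records could let a reader hide partial writes. It is only an illustration under stated assumptions: the `Record` type, the `txn_id` field, and the function names are hypothetical and are not part of the DL API or of the design doc.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List

# Hypothetical record kinds; the thread only names begin, commit and abort.
BEGIN, DATA, COMMIT, ABORT = "begin", "data", "commit", "abort"


@dataclass
class Record:
    kind: str
    txn_id: int = 0      # assumed convention: 0 means "not part of a transaction"
    payload: bytes = b""


def split_atomic_write(txn_id: int, payload: bytes, max_size: int) -> List[Record]:
    """Break one large payload into chunk records framed by begin/commit."""
    chunks = [payload[i:i + max_size] for i in range(0, len(payload), max_size)]
    return ([Record(BEGIN, txn_id)]
            + [Record(DATA, txn_id, chunk) for chunk in chunks]
            + [Record(COMMIT, txn_id)])


def read_committed(log: Iterable[Record]) -> Iterator[bytes]:
    """Yield payloads; chunks inside a transaction become visible only on commit."""
    pending: Dict[int, List[bytes]] = {}  # txn_id -> buffered chunks
    for rec in log:
        if rec.kind == BEGIN:
            pending[rec.txn_id] = []
        elif rec.kind == DATA and rec.txn_id in pending:
            pending[rec.txn_id].append(rec.payload)
        elif rec.kind == COMMIT:
            chunks = pending.pop(rec.txn_id, None)
            if chunks is not None:
                yield b"".join(chunks)
        elif rec.kind == ABORT:
            pending.pop(rec.txn_id, None)  # drop everything buffered so far
        elif rec.kind == DATA:
            yield rec.payload  # plain record outside any transaction
```

In this sketch a writer that dies after "begin" but before "commit" leaves chunks that the reader never surfaces, so a host/port style config change is seen either whole or not at all. Plain records outside any transaction are passed through unchanged.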



>
>
> Asko Kauppi
> Zalando Tech Helsinki
>
> > On 4 Jan 2017, at 9.18, Sijie Guo  wrote:
> >
> > Sorry for late response. I think Leigh and you already had some very
> > valuable discussions in the doc. I will try to add some of my questions
> to
> > the discussion.
> >
> > Beside that, I had a discussion with Leigh today about this. first of
> all,
> > I think it is very good to add transaction support in distributedlog. It
> is
> > one of the primitives that would help building distributed service. But
> we
> > have a concern about making this system become complicated and introduce
> > operational overhead when it runs in the large scale system on
> production.
> > There are two major suggestions that I have for this feature -
> >
> > Build the 'minimum' logic in core - I think the minimum logic that need
> to
> > be added to the core is -  the special control records (begin, commit and
> > abort) and make the reader be able to detect those special control
> records
> > and know what do they mean and how to interrupt with them. Since they are
> > special control records, there is not overhead to other readers that
> > doesn't require this feature.
> >
> > Build the transaction coordinator as a separated proxy service  - I think
> > the major concern that we have is putting more complexities into the
> 'write
> > proxy' service. We architected distributedlog in a more microservice-like
> > way - we have the core as the stream store, the proxy for serving write
> and
> > read traffic. It would be good that the transaction feature can be done
> in
> > a similar way. So the architecture would be like this -
> >
> > *[ write service ] [ read service ] [ transaction coordinator ]*
> > *[                        stream store                        ]*
> >
> > if people doesn't need the transaction feature, they can turn if off
> > completely without any operational overhead.
> >
> > Beside that, I have one general question - What is the major goal for
> this
> > feature? Are you targeting on building a general XA transaction
> coordinator
> > or just for supporting things like `copy-modify-write' style workflow?
> >
> >
> > Thanks,
> > Sijie
> >
> >
> >
> >
> >
> > On Wed, Dec 28, 2016 at 1:12 PM, Xi Liu  wrote:
> >
> >> Ping?
> >>
> >> On Mon, Dec 19, 2016 at 8:28 AM, Xi Liu  wrote:
> >>
> >>> Sijie,
> >>>
> >>> No. I thought it might be easier for people to comment on a google doc
> to
> >>> gather the initial feedback. I will put the content back to wiki page
> >> once
> >>> addressing the comments. Does that sound good to you?
> >>>
> >>> And thank you in advance.
> >>>
> >>> - Xi
> >>>
> >>>
> >>>
> >>> On Sun, Dec 18, 2016 at 8:48 AM, Sijie Guo  wrote:
> >>>
> >>>> Hi Xi,
> >>>>
> >>>> sorry for late response. I will review it soon.
> >>>>
> >>>> regarding this, a separate question "are we going to use google doc
> >>>> instead of email thread for any discussion"? I am a bit worried that
> >>>> the discussion will become lost after moving to google doc. No idea
> >>>> on how other apache projects are doing.
> >>>>
> >>>> - Sijie
> >>>>
> >>>> On Wed, Dec 14, 2016 at 11:41 PM, Xi Liu  wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I finalized the first version of the design. This time I used a
> >>>>> google doc so that it is easier for commenting and add a link the
> >>>>> wiki page. I will update this to the wiki page once we come to the
> >>>>> finalized design.
> >>>>>
> >>>>> https://docs.google.com/document/d/14Ns05M8Z5a6DF6fHmWQwISyD5jjeKbSIGgSzXuTI5BA/edit
> >>>>>
> >>>>> Let me know if you have any questions. Appreciate your

Re: [Review] The first release of Apache DistributedLog

2017-01-05 Thread Asko Kauppi
Hi Sijie,

most readers will likely not know about DistributedLog. A short comparison
- or a link to one - e.g. with Kafka might help set the stage for them.
i.e. why does it exist?

This is even more important since Uber recently also publicized their
persistent message bus solution. If these start dropping down, there needs
to be more (technical) reason than just another vendor opening their chest.
:)

Other than that, the structure seemed nice but it can be slightly shorter.

Just my 2c
- asko


On 4 January 2017 at 10:38, Sijie Guo  wrote:

> I drafted a blog post for announcing the first apache release. Here is the
> draft. Please help review it :D
>
> https://docs.google.com/document/d/1IXVmP2cHkf4ydeUHUJN9p5ZWTpA1auwBhfqMnYBu4A0/edit
>
> - Sijie
>


[GitHub] incubator-distributedlog pull request #98: DL-171 : adding a short sleep to ...

2017-01-05 Thread xieliang
GitHub user xieliang opened a pull request:

https://github.com/apache/incubator-distributedlog/pull/98

DL-171 : adding a short sleep to let the WriteCompleteListener have time to 
run before the final position be requested

once the "writer.write" is done, if "writer.position()" is invoked earlier
than the WriteCompleteListener onSuccess callback, then due to the
"synchronized", the position result will be 0, not the expected 33. we can
just add a short sleep to avoid this test issue.
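
For illustration only, the race described above can be reproduced with a toy writer in Python. The sketch below is not the DistributedLog API and not what this pull request does (the PR adds a sleep). It shows a different technique: the test waits on a completion signal instead of sleeping. The `Writer` class and its methods are hypothetical stand-ins.

```python
import threading


class Writer:
    """Toy stand-in for an append-only writer; hypothetical, not the DL API."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._position = 0

    def write(self, data: bytes) -> threading.Event:
        """Start an asynchronous write and return an event set on completion."""
        done = threading.Event()

        def on_success() -> None:
            # Plays the role of the completion callback that advances the position.
            with self._lock:
                self._position += len(data)
            done.set()

        threading.Thread(target=on_success).start()
        return done

    def position(self) -> int:
        # May still return 0 if called before on_success has run.
        with self._lock:
            return self._position


writer = Writer()
done = writer.write(b"x" * 33)
# Wait for the callback rather than sleeping for a guessed duration.
assert done.wait(timeout=5)
print(writer.position())
```

Waiting on the event removes the dependence on timing, whereas a fixed sleep can still be too short on a slow machine.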

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xieliang/incubator-distributedlog DL-171-Fix-TestAppendOnlyStreamWriter

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-distributedlog/pull/98.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #98


commit 9de4d177d2c4d41aaaeaa69012be01fa4e0104ad
Author: xieliang 
Date:   2017-01-05T08:56:37Z

adding a short sleep to let the WriteCompleteListener have time to run 
before the final position be requested



