[GitHub] sijie commented on issue #774: Open should run recovery by default

2017-11-28 Thread GitBox
sijie commented on issue #774: Open should run recovery by default
URL: https://github.com/apache/bookkeeper/pull/774#issuecomment-347651531
 
 
   @ivankelly : I think @jiazhai also put down his opinion as well (just to 
make sure everyone that express the opinion in this thread is not missed)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #774: Open should run recovery by default

2017-11-27 Thread GitBox
sijie commented on issue #774: Open should run recovery by default
URL: https://github.com/apache/bookkeeper/pull/774#issuecomment-347371758
 
 
   "Personally, I consider that I know what I'm doing in this area, and I got 
it wrong when updating the tutorial, and I only spotted the mistake by pure 
chance. As for reading the docs, I generally only do that when something is 
non-obvious. createOpenLedgerOp looks easy to use and its usage seems obvious, 
so why would people carefully read the docs for it?"
   
   @ivankelly "why would people carefully read the docs for it?" - Because you 
built the original API. In your mind, "open a ledger" is fencing and close the 
ledger. It doesn't mean people will have same assumption that "open a ledger" 
will fence and close the ledger. In contrast, if you ask most of the people who 
doesn't work with bookkeeper before, what are their expectations of a `open for 
read` behavior. I believe most of people will assume `open for read` generally 
has no side effects to the ledger or the writer. A default behavior for `open` 
should be as general as other storage system, no side effects (you can open as 
many time as you can without impacting others), easy to reason about without 
reading any documentation. `open-to-close` behavior is a special case for 
general `open`, it is used when a leader (writer) failed, it wants to fence the 
previous ledger and open a new one. because `open-to-close` is a special case, 
it should be called out explicitly.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #774: Open should run recovery by default

2017-11-27 Thread GitBox
sijie commented on issue #774: Open should run recovery by default
URL: https://github.com/apache/bookkeeper/pull/774#issuecomment-347368506
 
 
   @ivankelly I think this problem here is more about an assumption problem. 
All the points you made here assume one thing - bookkeeper can only used in the 
use case you described: people open a ledger is to fence and close the writer.  
Also you have this assumption in your mind is more because you added this 
feature before. This assumption was kept in your mind for so long. If we are 
growing bookkeeper beyond this use case, we should look beyond this assumption 
and learn what is the most common behavior for opening and what is more neutral 
behavior when people use bookkeeper.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #774: Open should run recovery by default

2017-11-26 Thread GitBox
sijie commented on issue #774: Open should run recovery by default
URL: https://github.com/apache/bookkeeper/pull/774#issuecomment-347057001
 
 
   @jvrao @jiazhai  and other people are welcome to chime in.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #774: Open should run recovery by default

2017-11-26 Thread GitBox
sijie commented on issue #774: Open should run recovery by default
URL: https://github.com/apache/bookkeeper/pull/774#issuecomment-347056937
 
 
   the summary of my opinion on this : 
   
   - from a storage system perspective, open should not have any implicit 
impacts on the current writer. if open wants to have a side effort (e.g. fence 
and recovery), it should be called out explicitly either by 
`withRecovery(true)` or `withOpenFlags(O_FENCE)`.
   - from user education/adoption awareness, a `multiple readers` semantic has 
more use cases that a `single reader` semantic. from this sense, we should 
encourage tailing and have recovery default to `false`.
   - the original api was designed for logging, `openLedgerNoRecovery` was 
added after `openLedger`. if we want to grow beyond logging into a storage 
system, we should have openLedger with no recovery as default. It is what most 
of the people would expect from a storage system.
   - regarding `safer` point, if a user already used bookkeeper and he/she knew 
bookkeeper, knew the difference between openLedger and openLedgerNoRecovery, 
when he migrates to the new API, they will find a way mapping openLedger and 
openLedgerNoRecovery into the new builder API. Javadoc, documentation, and 
release notes will provide the guide for existing users. It is not a problem 
for existing users. We should develop the storage mindshare for people who is 
new to bookkeeper, where *open ledger doesn't implicit close current writer* is 
making more sense as a storage.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #774: Open should run recovery by default

2017-11-24 Thread GitBox
sijie commented on issue #774: Open should run recovery by default
URL: https://github.com/apache/bookkeeper/pull/774#issuecomment-346910714
 
 
   Regarding, TOAB. LAC is the key to achieve TOAB, and make bookkeeper 
meaningful in multiple reader use cases. It is not a sidecar thing. Without 
LAC, you only can read entries after closing a ledger. The whole bookkeeper can 
only be used in single writer and single reader use cases. We have different 
views on LAC, is just because of the way how we have been used bookkeeper. In 
HDFS name mode, hedwig, pulsar, even pravega, bookkeeper is only used in single 
writer and single reader use cases, in those cases consistency on writer is 
important, fencing and recovery is the key. In distributed log, we use bk for 
streaming, multiple reader use cases, LAC is the key. Both of them are needed 
at some sense and key to the success of TOAB. There is no sense for the 
discussion to continue, I would stop here. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #774: Open should run recovery by default

2017-11-24 Thread GitBox
sijie commented on issue #774: Open should run recovery by default
URL: https://github.com/apache/bookkeeper/pull/774#issuecomment-346908459
 
 
   I don't think we should go down with split. It doesn't make any sense. The 
whole idea of a fluent style API is to avoid such split. Coming from storage 
background, it is more natural to have there in mind: create to get a handle to 
write, open to get a handle to read. Recovery or no recovery is the flag 
provided to open, indicating what will be the open behavior. Ideally this 
should be done with read flags, recovery is the flag of O_FENCE or O_RECOVERY. 
It should be as easy and natural as how people use a filesystem: open a file to 
read, open doesn't have any impacts to writers unless a O_FENCE flag is 
explicitly provided. from a storage system perspective, I would prefer a no 
recovery behavior as default. People shouldn't think too much about open 
implicitly fence and close a ledger. But again, I wouldn't block this PR, if 
other people want to keep the default behavior. Lastly, if we want BookKeeper 
to grow beyond a logging system to become a storage, we need to think and 
design a API that people have common sense on a storage system.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #774: Open should run recovery by default

2017-11-24 Thread GitBox
sijie commented on issue #774: Open should run recovery by default
URL: https://github.com/apache/bookkeeper/pull/774#issuecomment-346891627
 
 
   @ivankelly 
   
   as I said in my previous comment, I am -0 for this. that means I wouldn't 
block the change making recovery as default. I don't like splitting the builder 
into two.
   
   the different views on LAC exactly reflects the views on how bookkeeper 
would look like and can be used for. LAC is not an optimization (it might be an 
optimization when you guys initially developed it for hdfs namenode). However, 
LAC is the most beautifully thing when you comes from streaming and tailing 
world. And it is the key differentiator how bookkeeper is different from a 
traditional storage system like filesystem, blobstore.  LAC is the mechanism 
for propagating how your length of a stream grow and provides the consistency 
boundary for simple repeatable read consistency. while fencing and recovery 
can't. fencing and recovery are the mechanism for preventing split brain, which 
is more on writer side.
   
   Many other storage systems have fencing and recovery mechanism. It is just a 
way to prevent split brain. They might do it in many different ways. However in 
a bytes-orient storage system, like blob store, filesystem, you can't have LAC. 
LAC is unique to bookkeeper, because it is record-orient storage system. LAC 
makes bookkeeper so unique for streaming and tailing. That is something I would 
like us to emphasize and educate people to use bookkeeper.
   
   The reason why do you think fencing and recovery is cool in bookkeeper, is 
exactly because the way how you use bookkeeper. Most of the use cases like hdfs 
namenode, hedwig, pulsar are only use bookkeeper for writes. You need fencing 
and recovery to prevent split brain when ownership or leadership change. But 
this category of use cases is so limited.
   
   in distributedlog, at Twitter, we use bookkeeper in a lot of streaming and 
tailing use cases, where LAC is the soul. It guarantees TOAB and simple 
repeatable consistency while you do streaming and tailing. That's where 
bookkeeper is so unique from other storage systems. If we want bookkeeper to 
grow into a large community, we have to educate people in a way that people 
will realize this uniqueness. 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sijie commented on issue #774: Open should run recovery by default

2017-11-23 Thread GitBox
sijie commented on issue #774: Open should run recovery by default
URL: https://github.com/apache/bookkeeper/pull/774#issuecomment-346684200
 
 
   yes we intentionally made no recovery as default. because that is what 
people would typically expect for a "single-writer, multiple-readers" semantic. 
If people want to open and recovery a ledger, `withRecovery(true)` should be 
used for people aware of what exactly happen regarding "recovery".
   
   This is a new API we are adding for bookkeeper. so it doesn't need to carry 
the old behavior. so I am -1 on reverting the default back to true.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services