[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-03-02 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369856263
 
 
   As far as I understood, EntryLogger is the abstraction layer for all of the 
operations on entrylog files. Be it how entries are written to entrylog file 
(how active/rotated entrylog files are organized), read from entrylog files 
(how read files are organized), flush/checkpoint path and getting info about 
the state of entrylog files for garbage collection/compaction. 
   
   Also, I tried to explain in detail why it is the right thing to do 
abstraction in EntryLogger in this comment 
   https://github.com/apache/bookkeeper/issues/570#issuecomment-368597767
   
   Entrylogger is the right place to deal with multiple entrylogs, it is more 
organic, less churn of the code, since ideally other components of the bookie 
doesn't need to be modified for anything to do with entry log files.
   
   But yes, regarding sub-task4, it is valid point to raise the need of 
SortedLedgerStorage if we are going to have entrylog per ledger. I need to have 
some perf numbers to validate the write/read scenarios in the case of entrylog 
per ledger using InterleavedLedgerStorage vs SortedLedgerStorage.
   
   **sub-task4** - introduce parallel (have just one Runnable per ledger) 
EntryMemTable flusher for SortedLedgerStorage which can be used in the case of 
entrylogperledger
   
   > @reddycharan is there any way you can sub-divide your subtask-5?
   
   Yes, I think it should be possible. Probably in the first task I could 
introduce EntryLogManager interface and EntryLogManagerForSingleEntryLog and in 
the final task I can introduce EntryLogManagerForEntryLogPerLedger.
   
   So

   **sub-task1** - Removal of unnecessary synchronization in 
InterleavedLedgerStorage methods. As described in 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAO2yDyZ946fp2S_qR2iL178hPiXgrnFGb%3DpvkyK4ReSYAtNLBw%40mail.gmail.com%3E
   
   **sub-task2** - move the complete logic of flushIntervalInBytes from 
EntryLogger to BufferedChannel
   
   **sub-task3** - make changes to SyncThread/checkpoint logic, so that incase 
of entrylogperledger, checkpoint happens for every flushInterval but not when 
active entrylog is created/rolled over.
   
   **sub-task4** - introduce parallel (have just one Runnable per ledger) 
EntryMemTable flusher for SortedLedgerStorage which can be used in the case of 
entrylogperledger. (evaluate the need of SortedLedgerStorage by doing perf 
comparisons)
   
   **sub-task5** - introduce EntryLogManager interface and 
EntryLogManagerForSingleEntryLog
   
   **sub-task6** - introduce EntryLogManagerForEntryLogPerLedger
   
   Thanks guys for providing the feedback. Will proceed with the plan and start 
creating new pull requests for the sub-tasks. But I'll leave the existing 
pullrequest (https://github.com/apache/bookkeeper/pull/1201), since it might be 
helpful to refer to get the full picture.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-03-02 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369856263
 
 
   As far as I understood, EntryLogger is the abstraction layer for all of the 
operations on entrylog files. Be it how entries are written to entrylog file 
(how active/rotated entrylog files are organized), read from entrylog files 
(how read files are organized), flush/checkpoint path and getting info about 
the state of entrylog files for garbage collection/compaction. 
   
   Also, I tried to explain in detail why it is the right thing to do 
abstraction in EntryLogger in this comment 
   https://github.com/apache/bookkeeper/issues/570#issuecomment-368597767
   
   Entrylogger is the right place to deal with multiple entrylogs, it is more 
organic, less churn of the code, since ideally other components of the bookie 
doesn't need to be modified much.
   
   But yes, regarding sub-task4, it is valid point to raise the need of 
SortedLedgerStorage if we are going to have entrylog per ledger. I need to have 
some perf numbers to validate the write/read scenarios in the case of entrylog 
per ledger using InterleavedLedgerStorage vs SortedLedgerStorage.
   
   **sub-task4** - introduce parallel (have just one Runnable per ledger) 
EntryMemTable flusher for SortedLedgerStorage which can be used in the case of 
entrylogperledger
   
   > @reddycharan is there any way you can sub-divide your subtask-5?
   
   Yes, I think it should be possible. Probably in the first task I could 
introduce EntryLogManager interface and EntryLogManagerForSingleEntryLog and in 
the final task I can introduce EntryLogManagerForEntryLogPerLedger.
   
   So

   **sub-task1** - Removal of unnecessary synchronization in 
InterleavedLedgerStorage methods. As described in 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAO2yDyZ946fp2S_qR2iL178hPiXgrnFGb%3DpvkyK4ReSYAtNLBw%40mail.gmail.com%3E
   
   **sub-task2** - move the complete logic of flushIntervalInBytes from 
EntryLogger to BufferedChannel
   
   **sub-task3** - make changes to SyncThread/checkpoint logic, so that incase 
of entrylogperledger, checkpoint happens for every flushInterval but not when 
active entrylog is created/rolled over.
   
   **sub-task4** - introduce parallel (have just one Runnable per ledger) 
EntryMemTable flusher for SortedLedgerStorage which can be used in the case of 
entrylogperledger. (evaluate the need of SortedLedgerStorage by doing perf 
comparisons)
   
   **sub-task5** - introduce EntryLogManager interface and 
EntryLogManagerForSingleEntryLog
   
   **sub-task6** - introduce EntryLogManagerForEntryLogPerLedger
   
   Thanks guys for providing the feedback. Will proceed with the plan and start 
creating new pull requests for the sub-tasks. But I'll leave the existing 
pullrequest (https://github.com/apache/bookkeeper/pull/1201), since it might be 
helpful to refer to get the full picture.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-03-01 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369537838
 
 
   >Just a question, is some code based on this change running in some 
production system ?
   @jvrao
   
   No, not yet, we are in the process of pushing all of our internal changes to 
the community code and eventually use the community repo version, so that we 
can be close to the community. Since we are in this transition phase we kept 
hold of any big changes to our repo and instead work directly with the 
community repo. 
   
   >Effectively it seems to me that this new kind of storage is vert different 
from InterleavedStorage. I remember we already discussed about this. Is it 
really a benefit to change InterleavedS and SortedS and not to start from 
scratch eventually creating a base class for common code?
   
   Ideally with the design approach i chose, there shouldn't be any changes to 
InterleavedLedgerStorage class and SortedLedgerStorage class. Even now the 
changes made to these two classes are minimal. These changes are required for 
the following reasons - 
   
   1) Removal of unnecessary synchronization in InterleavedLedgerStorage 
methods. As described in 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAO2yDyZ946fp2S_qR2iL178hPiXgrnFGb%3DpvkyK4ReSYAtNLBw%40mail.gmail.com%3E
   2) minor changes to checkpoint logic, since in entryLogPerLedger it should 
not checkpoint when a new entrylog is created but instead it should checkpoint 
for every "flushInterval" period and also during this time it should flush both 
rotatedLogs and current active logs.
   3) couple of minor changes in SortedLedgerStorage regarding how entryLogger 
methods are called and creation of flushexecutor and passing it to 
EntryMemTable in the case of entryLogPerLedger.
   
   The crux of the implementation change is in EntryLogger class and for the 
reasons explained above abstraction is done in EntryLogger class.
   
   >I think a more confortable way to get this in is to split the patch into 
smaller changes. @reddycharan
   
   Ok, I've spent multiple weeks on this work item. If it is going to make 
things easier on reviewers I can consider splitting this PR into multiple PRs 
(hoping it wont take heavy toll on me and complicate things further). Probably 
I can split like following
   
   **sub-task1** - Removal of unnecessary synchronization in 
InterleavedLedgerStorage methods. As described in 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAO2yDyZ946fp2S_qR2iL178hPiXgrnFGb%3DpvkyK4ReSYAtNLBw%40mail.gmail.com%3E
   
   **sub-task2** - move the complete logic of flushIntervalInBytes from 
EntryLogger to BufferedChannel
   
   **sub-task3** - make changes to SyncThread/checkpoint logic, so that incase 
of entrylogperledger, checkpoint happens for every flushInterval but not when 
active entrylog is created/rolled over.
   
   **sub-task4** - introduce parallel (have just one Runnable per ledger) 
EntryMemTable flusher for SortedLedgerStorage which can be used in the case of 
entrylogperledger
   
   **sub-task5** - Introduce EntryLogManager abstraction in EntryLogger and let 
it deal with the singleentrylog/entrylogperledger functionality and core 
business logic.
   
   As you can imagine, sub-task5 is going to be the crux of this work item and 
majority of my code change and the design choice of abstraction at EntryLogger 
is going to remain.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-03-01 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369537838
 
 
   >Just a question, is some code based on this change running in some 
production system ?
   @jvrao
   
   No, not yet, we are in the process of pushing all of our internal changes to 
the community code and eventually use the community repo version, so that we 
can be close to the community. Since we are in this transition phase we kept 
hold of any big changes to our repo and instead work directly with the 
community repo. 
   
   >Effectively it seems to me that this new kind of storage is vert different 
from InterleavedStorage. I remember we already discussed about this. Is it 
really a benefit to change InterleavedS and SortedS and not to start from 
scratch eventually creating a base class for common code?
   
   Ideally with the design approach i chose, there shouldn't be any changes to 
InterleavedLedgerStorage class and SortedLedgerStorage class. Even now the 
changes made to these two classes are minimal. These changes are required for 
the following reasons - 
   
   1) Removal of unnecessary synchronization in InterleavedLedgerStorage 
methods. As described in 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAO2yDyZ946fp2S_qR2iL178hPiXgrnFGb%3DpvkyK4ReSYAtNLBw%40mail.gmail.com%3E
   2) minor changes to checkpoint logic, since in entryLogPerLedger it should 
not checkpoint when a new entrylog is created but instead it should checkpoint 
for every "flushInterval" period and also during this time it should flush both 
rotatedLogs and current active logs.
   3) couple of minor changes in SortedLedgerStorage regarding how entryLogger 
methods are called and creation of flushexecutor and passing it to 
EntryMemTable in the case of entryLogPerLedger.
   
   The crux of the implementation change is in EntryLogger class and for the 
reasons explained above abstraction is done in EntryLogger class.
   
   >I think a more confortable way to get this in is to split the patch into 
smaller changes. @reddycharan
   
   Ok, I've spent multiple weeks on this work item. If it is going to make 
things easier on reviewers I can consider splitting this PR into multiple PRs 
(hoping it wont take heavy toll on me and complicate things further). Probably 
I can split like following
   
   **sub-task1** - Removal of unnecessary synchronization in 
InterleavedLedgerStorage methods. As described in 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAO2yDyZ946fp2S_qR2iL178hPiXgrnFGb%3DpvkyK4ReSYAtNLBw%40mail.gmail.com%3E
   
   **sub-task2** - moved the complete logic of flushIntervalInBytes from 
EntryLogger to BufferedChannel
   
   **sub-task3** - make changes to SyncThread, so that incase of 
entrylogperledger, checkpoint happens for every flushInterval but not when 
active entrylog is created/rolled over.
   
   **sub-task4** - introduce parallel (have just one Runnable per ledger) 
EntryMemTable flusher for SortedLedgerStorage which can be used in the case of 
entrylogperledger
   
   **sub-task5** - Introduce EntryLogManager abstraction in EntryLogger and let 
it deal with the singleentrylog/entrylogperledger functionality and core 
business logic.
   
   As you can imagine, sub-task5 is going to be the crux of this work item and 
majority of my code change and the design choice of abstraction at EntryLogger 
is going to remain.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-03-01 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369537838
 
 
   >Just a question, is some code based on this change running in some 
production system ?
   @jvrao
   
   No, not yet, we are in the process of pushing all of our internal changes to 
the community code and eventually use the community repo version, so that we 
can be close to the community. Since we are in this transition phase we kept 
hold of any big changes to our repo and instead work directly with the 
community repo. 
   
   >Effectively it seems to me that this new kind of storage is vert different 
from InterleavedStorage. I remember we already discussed about this. Is it 
really a benefit to change InterleavedS and SortedS and not to start from 
scratch eventually creating a base class for common code?
   
   Ideally with the design approach i chose, there shouldn't be any changes to 
InterleavedLedgerStorage class and SortedLedgerStorage class. Even now the 
changes made to these two classes are minimal. These changes are required for 
the following reasons - 
   
   1) Removal of unnecessary synchronization in InterleavedLedgerStorage 
methods. As described in 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAO2yDyZ946fp2S_qR2iL178hPiXgrnFGb%3DpvkyK4ReSYAtNLBw%40mail.gmail.com%3E
   2) minor changes to checkpoint logic, since in entryLogPerLedger it should 
not checkpoint when a new entrylog is created but instead it should checkpoint 
for every "flushInterval" period and also during this time it should flush both 
rotatedLogs and current active logs.
   3) couple of minor changes in SortedLedgerStorage regarding how entryLogger 
methods are called and creation of flushexecutor and passing it to 
EntryMemTable in the case of entryLogPerLedger.
   
   The crux of the implementation change is in EntryLogger class and for the 
reasons explained above abstraction is done in EntryLogger class.
   
   >I think a more confortable way to get this in is to split the patch into 
smaller changes. @reddycharan
   
   Ok, I've spent multiple weeks on this work item. If it is going to make 
things easier on reviewers I can consider splitting this PR into multiple PRs 
(hoping it wont take heavy toll on me and complicate things further). Probably 
I can split like following
   
   **sub-task1** - Removal of unnecessary synchronization in 
InterleavedLedgerStorage methods. As described in 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAO2yDyZ946fp2S_qR2iL178hPiXgrnFGb%3DpvkyK4ReSYAtNLBw%40mail.gmail.com%3E
   
   **sub-task2** - move the complete logic of flushIntervalInBytes from 
EntryLogger to BufferedChannel
   
   **sub-task3** - make changes to SyncThread, so that incase of 
entrylogperledger, checkpoint happens for every flushInterval but not when 
active entrylog is created/rolled over.
   
   **sub-task4** - introduce parallel (have just one Runnable per ledger) 
EntryMemTable flusher for SortedLedgerStorage which can be used in the case of 
entrylogperledger
   
   **sub-task5** - Introduce EntryLogManager abstraction in EntryLogger and let 
it deal with the singleentrylog/entrylogperledger functionality and core 
business logic.
   
   As you can imagine, sub-task5 is going to be the crux of this work item and 
majority of my code change and the design choice of abstraction at EntryLogger 
is going to remain.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-03-01 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369537838
 
 
   >Just a question, is some code based on this change running in some 
production system ?
   @jvrao
   
   No, not yet, we are in the process of pushing all of our internal changes to 
the community code and eventually use the community repo version, so that we 
can be close to the community. Since we are in this transition phase we kept 
hold of any big changes to our repo and instead work directly with the 
community repo. 
   
   >Effectively it seems to me that this new kind of storage is vert different 
from InterleavedStorage. I remember we already discussed about this. Is it 
really a benefit to change InterleavedS and SortedS and not to start from 
scratch eventually creating a base class for common code?
   
   Ideally with the design approach i chose, there shouldn't be any changes to 
InterleavedLedgerStorage class and SortedLedgerStorage class. Even now the 
changes made to these two classes are minimal. These changes are required for 
the following reasons - 
   
   1) Removal of unnecessary synchronization in InterleavedLedgerStorage 
methods. As described in 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAO2yDyZ946fp2S_qR2iL178hPiXgrnFGb%3DpvkyK4ReSYAtNLBw%40mail.gmail.com%3E
   2) minor changes to checkpoint logic, since in entryLogPerLedger it should 
not checkpoint when a new entrylog is created but instead it should checkpoint 
for every "flushInterval" period and also during this time it should flush both 
rotatedLogs and current active logs.
   3) couple of minor changes in SortedLedgerStorage regarding how entryLogger 
methods are called and creation of flushexecutor and passing it to 
EntryMemTable in the case of entryLogPerLedger.
   
   The crux of the implementation change is in EntryLogger class and for the 
reasons explained above abstraction is done in EntryLogger class.
   
   >I think a more confortable way to get this in is to split the patch into 
smaller changes. @reddycharan
   
   Ok, I've spent multiple weeks on this work item. If it is going to make 
things easier on reviewers I can consider splitting this PR into multiple PRs 
(hoping it wont take heavy toll on me and complicate things further). Probably 
I can split like following
   
   sub-task1 - Removal of unnecessary synchronization in 
InterleavedLedgerStorage methods. As described in 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAO2yDyZ946fp2S_qR2iL178hPiXgrnFGb%3DpvkyK4ReSYAtNLBw%40mail.gmail.com%3E
   
   sub-task2 - moved the complete logic of flushIntervalInBytes from 
EntryLogger to BufferedChannel
   
   sub-task3 - make changes to SyncThread, so that incase of entrylogperledger, 
checkpoint happens for every flushInterval but not when active entrylog is 
created/rolled over.
   
   sub-task4 - introduce parallel (have just one Runnable per ledger) 
EntryMemTable flusher for SortedLedgerStorage which can be used in the case of 
entrylogperledger
   
   sub-task5 - Introduce EntryLogManager abstraction in EntryLogger and let it 
deal with the singleentrylog/entrylogperledger functionality and core business 
logic.
   
   As you can imagine, sub-task5 is going to be the crux of this work item and 
majority of my code change and the design choice of abstraction at EntryLogger 
is going to remain.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-02-28 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369416511
 
 
   Hey @ivankelly , 
   
   to answer ?when/where was this discussed part?, I can point you to the 
meeting notes, mail exchanges, slack messages and git links which I shared with 
the community throughout the period. This work item was initially created by 
@jvrao during April 2017 https://issues.apache.org/jira/browse/BOOKKEEPER-1041. 
And by May 2017 I came up with design overview and implementation choices and 
presented to the community on May 18th community meeting. You can find the 
meeting minutes of what I presented on that day - 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-5-18+Meeting+Notes 
and I explained the design overview in the same jira issue. And yes, initially 
the design was to have configured number of entrylogs per ledgerdir. While 
implementing it, I noticed few issues with how checkpoint is done, regarding 
synchronization logic and preallocation of entry logs and raised the concerns 
regarding the same in the community by starting mail threads and having 
conversations in the community bi-weekly meets. In one of such mail 
conversation, @sijie asked same set of questions regarding where the 
abstraction/composition should happen. I explained in detail in the following 
mail 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAAFz1KNJbPnhLGwp27kKKGgOtLiDPR7FHpSngdcN-C-Njxt7eQ%40mail.gmail.com%3E
 . Also in this email I provided git link for the initial implementation of 
code https://github.com/reddycharan/bookkeeper/tree/multipleentrylogs. 
Following this mail conversation @sijie , @jvrao  and I had in length 
conversation regarding where the composition should happen and convinced @sijie 
 that given the way code is structured it is right thing to do 
(abstraction/composition) at lowest level - Entrylogger level. 
   
   But in the follow up, I?ve been questioned about the intricacies of 
multithreaded/synchronization aspects of slot map management in configured 
number of entrylogs per ledgerdir approach and the benefits of having 
configured number of entrylogs per ledgerdir vs explicit entry log per ledger 
in the compaction story (since it is binary decision of whether to keep the 
entry log or not during compaction in the entrylog per ledger case). So I 
changed the design to entrylog per ledger, which changed the logic of choosing 
entry log for the given ledger but the underlying changes of the abstraction of 
entry logger remained the same. This has been informed to the community 
formally in several community meet calls and Sijie in particular (multiple 
times) before proceeding. 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-06-01+Meeting+Notes 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-10-19+Meeting+Notes 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-11-30+Meeting+Notes 
 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-12-14+Meeting+Notes.
 As I requested in 2017/12/14 meeting I shared preview version of my code in 
community slack 
https://apachebookkeeper.slack.com/archives/C6G5104SF/p1513212259000262 and the 
code is shared in my GitHub repo - 
https://github.com/reddycharan/bookkeeper/tree/entrylogperledger . @eolivelli 
and @sijie were helpful to do early CR, provide some initial feedback and gave 
their approval. Finally as I envisioned, discussed and shared, after rebasing 
my changes on the recent community code I created formal pull request - 
https://github.com/apache/bookkeeper/pull/1201 and also updated this issue with 
formal description of entry log per ledger design which I?ve been discussing 
all along.
   
   Agreed, I could have updated this issue description with the new design back 
then itself. But I?m not sure if it would have made difference since this was 
communicated/explained multiple times in the community meets and particularly 
to the interested people. For the people who are coming across this work item 
for the first time, it is needed/required when I created the formal pull 
request and hence I made sure I updated issue with full description (new 
design) before sending the Pull Request. I can say community has changed a lot 
in the amount of activity, processes, systems used and the list of participants 
in the last year (2017). This Work Item has spanned during this transition 
phase (it took quite longer time, this is because of change in priorities on 
our side, me dealing with multiple repos - internals salesforce one and 
community, deluge of commits in the second half of 2017 and the need to rebase 
my commits on top of them every single time I wanted to push it forward) and I 
see why you might have different opinion, considering you weren?t present in 
any of the above mentioned conversations/communications. But I see you are the 
one who migrated jira work item to git issues, but I?m not sur

[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-02-28 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369416511
 
 
   Hey @ivankelly , 
   
   to answer ?when/where was this discussed part?, I can point you to the 
meeting notes, mail exchanges, slack messages and git links which I shared with 
the community throughout the period. This work item was initially created by 
@jvrao during April 2017 https://issues.apache.org/jira/browse/BOOKKEEPER-1041. 
And by May 2017 I came up with design overview and implementation choices and 
presented to the community on May 18th community meeting. You can find the 
meeting minutes of what I presented on that day - 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-5-18+Meeting+Notes 
and I explained the design overview in the same jira issue. And yes, initially 
the design was to have configured number of entrylogs per ledgerdir. While 
implementing it, I noticed few issues with how checkpoint is done, regarding 
synchronization logic and preallocation of entry logs and raised the concerns 
regarding the same in the community by starting mail threads and having 
conversations in the community bi-weekly meets. In one of such mail 
conversation, @sijie asked same set of questions regarding where the 
abstraction/composition should happen. I explained in detail in the following 
mail 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAAFz1KNJbPnhLGwp27kKKGgOtLiDPR7FHpSngdcN-C-Njxt7eQ%40mail.gmail.com%3E
 . Also in this email I provided git link for the initial implementation of 
code https://github.com/reddycharan/bookkeeper/tree/multipleentrylogs. 
Following this mail conversation @sijie , @jvrao  and I had in length 
conversation regarding where the composition should happen and convinced @sijie 
 that given the way code is structured it is right thing to do 
(abstraction/composition) at lowest level - Entrylogger level. 
   
   But in the follow up, I?ve been questioned about the intricacies of 
multithreaded/synchronization aspects of slot map management in configured 
number of entrylogs per ledgerdir approach and the benefits of explicit entry 
log per ledger in the compaction story (binary decision of whether to keep the 
entry log or not during compaction). So I changed the design to entrylog per 
ledger, which changed the logic of choosing entry log for the given ledger but 
the underlying changes of the abstraction of entry logger remained the same. 
This has been informed to the community formally in several community meet 
calls and Sijie in particular (multiple times) before proceeding. 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-06-01+Meeting+Notes 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-10-19+Meeting+Notes 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-11-30+Meeting+Notes 
 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-12-14+Meeting+Notes.
 As I requested in 2017/12/14 meeting I shared preview version of my code in 
community slack 
https://apachebookkeeper.slack.com/archives/C6G5104SF/p1513212259000262 and the 
code is shared in my GitHub repo - 
https://github.com/reddycharan/bookkeeper/tree/entrylogperledger . @eolivelli 
and @sijie were helpful to do early CR, provide some initial feedback and gave 
their approval. Finally as I envisioned, discussed and shared, after rebasing 
my changes on the recent community code I created formal pull request - 
https://github.com/apache/bookkeeper/pull/1201 and also updated this issue with 
formal description of entry log per ledger design which I?ve been discussing 
all along.
   
   Agreed, I could have updated this issue description with the new design back 
then itself. But I?m not sure if it would have made difference since this was 
communicated/explained multiple times in the community meets and particularly 
to the interested people. For the people who are coming across this work item 
for the first time, it is needed/required when I created the formal pull 
request and hence I made sure I updated issue with full description (new 
design) before sending the Pull Request. I can say community has changed a lot 
in the amount of activity, processes, systems used and the list of participants 
in the last year (2017). This Work Item has spanned during this transition 
phase (it took quite longer time, this is because of change in priorities on 
our side, me dealing with multiple repos - internals salesforce one and 
community, deluge of commits in the second half of 2017 and the need to rebase 
my commits on top of them every single time I wanted to push it forward) and I 
see why you might have different opinion, considering you weren?t present in 
any of the above mentioned conversations/communications. But I see you are the 
one who migrated jira work item to git issues, but I?m not sure if it was 
automated task or well curated manual task. 
   
   Hope this answers ?when/where was th

[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-02-28 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369416511
 
 
   Hey @ivankelly , 
   
   to answer ?when/where was this discussed part?, I can point you to the 
meeting notes, mail exchanges, slack messages and git links which I shared with 
the community throughout the period. This work item was initially created by 
@jvrao during April 2017 https://issues.apache.org/jira/browse/BOOKKEEPER-1041. 
And by May 2017 I came up with design overview and implementation choices and 
presented to the community on May 18th community meeting. You can find the 
meeting minutes of what I presented on that day - 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-5-18+Meeting+Notes 
and I explained the design overview in the same jira issue. And yes, initially 
the design was to have configured number of entrylogs per ledgerdir. While 
implementing it, I noticed few issues with how checkpoint is done, regarding 
synchronization logic and preallocation of entry logs and raised the concerns 
regarding the same in the community by starting mail threads and having 
conversations in the community bi-weekly meets. In one of such mail 
conversation, @sijie asked same set of questions regarding where the 
abstraction/composition should happen. I explained in detail in the following 
mail 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAAFz1KNJbPnhLGwp27kKKGgOtLiDPR7FHpSngdcN-C-Njxt7eQ%40mail.gmail.com%3E
 . Also in this email I provided git link for the initial implementation of 
code https://github.com/reddycharan/bookkeeper/tree/multipleentrylogs. 
Following this mail conversation @sijie , @jvrao  and I had in length 
conversation regarding where the composition should happen and convinced @sijie 
 that given the way code is structured it is right thing to do 
(abstraction/composition) at lowest level - Entrylogger level. 
   
   But in the follow up, I?ve been questioned about the intricacies of 
multithreaded/synchronization aspects of slot map management in configured 
number of entrylogs per ledgerdir approach and the benefits of explicit entry 
log per ledger in the compaction story (binary decision of whether to keep the 
entry log or not during compaction). So I changed the design to entrylog per 
ledger, which changed the logic of choosing entry log for the given ledger but 
the underlying changes of the abstraction of entry logger remained the same. 
This has been informed to the community formally in several community meet 
calls and Sijie in particular (multiple times) before proceeding. 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-06-01+Meeting+Notes 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-10-19+Meeting+Notes 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-11-30+Meeting+Notes 
 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-12-14+Meeting+Notes.
 As I requested in 2017/12/14 meeting I shared preview version of my code in 
community slack 
https://apachebookkeeper.slack.com/archives/C6G5104SF/p1513212259000262 and the 
code is shared in my GitHub repo - 
https://github.com/reddycharan/bookkeeper/tree/entrylogperledger . @eolivelli 
and @sijie were helpful to do early CR, provide some initial feedback and gave 
their approval. Finally as I envisioned, discussed and shared, after rebasing 
my changes on the recent community code I created formal pull request - 
https://github.com/apache/bookkeeper/pull/1201 and also updated this issue with 
formal description of entry log per ledger design which I?ve been discussing 
all along.
   
   Agreed, I could have updated this issue description with the new design back 
then itself. But I?m not sure if it would have made difference since this was 
communicated/explained multiple times in the community meets and particularly 
to the interested people. For the people who are coming across this work item 
for the first time, it is needed/required when I created the formal pull 
request and hence I made sure I updated issue with full description (new 
design) before sending the Pull Request. I can say community has changed a lot 
in the amount of activity, processes, systems used and the list of participants 
in the last year (2017). This Work Item has spanned during this transition 
phase and I see why you might have different opinion, considering you weren?t 
present in any of the above mentioned conversations/communications. But I see 
you are the one who migrated jira work item to git issues, but I?m not sure if 
it was automated task or well curated manual task. 
   
   Hope this answers ?when/where was this discussed part?.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infra

[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-02-28 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369416511
 
 
   Hey @ivankelly , 
   
   to answer ?when/where was this discussed part?, I can point you to the 
meeting notes, mail exchanges, slack messages and git links which I shared with 
the community throughout the period. This work item was initially created by 
@jvrao during April 2017 https://issues.apache.org/jira/browse/BOOKKEEPER-1041. 
And by May 2017 I came up with design overview and implementation choices and 
presented to the community on May 18th community meeting. You can find the 
meeting minutes of what I presented on that day - 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-5-18+Meeting+Notes 
and I explained the design overview in the same jira issue. And yes, initially 
the design was to have configured number of entrylogs per ledgerdir. While 
implementing it, I noticed few issues with how checkpoint is done, regarding 
synchronization logic and preallocation of entry logs and raised the concerns 
regarding the same in the community by starting mail threads and having 
conversations in the community bi-weekly meets. In one of such mail 
conversation, @sijie asked same set of questions regarding where the 
abstraction/composition should happen. I explained in detail in the following 
mail 
http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAAFz1KNJbPnhLGwp27kKKGgOtLiDPR7FHpSngdcN-C-Njxt7eQ%40mail.gmail.com%3E
 . Also in this email I provided git link for the initial implementation of 
code https://github.com/reddycharan/bookkeeper/tree/multipleentrylogs. 
Following this mail conversation @sijie , @jvrao  and I had in length 
conversation regarding where the composition should happen and convinced @sijie 
 that given the way code is structured it is right thing to do 
(abstraction/composition) at lowest level - Entrylogger level. 
   
   But in the follow up, I?ve been questioned about the intricacies of 
multithreaded/synchronization aspects of slot map management in configured 
number of entrylogs per ledgerdir approach and the benefits of explicit entry 
log per ledger in the compaction story (binary decision of whether to keep the 
entry log or not during compaction). So I changed the design to entrylog per 
ledger, which changed the logic of choosing entry log for the given ledger but 
the underlying changes of the abstraction of entry logger remained the same. 
This has been informed to the community formally in several community meet 
calls and Sijie in particular (multiple times) before proceeding. 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-06-01+Meeting+Notes 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-10-19+Meeting+Notes 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-11-30+Meeting+Notes 
 
https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-12-14+Meeting+Notes.
 As I requested in 2017/12/14 meeting I shared preview version of my code in 
community slack 
https://apachebookkeeper.slack.com/archives/C6G5104SF/p1513212259000262 and the 
code is shared in my GitHub repo - 
https://github.com/reddycharan/bookkeeper/tree/entrylogperledger . @eolivelli 
and @sijie were helpful to do early CR, provide some initial feedback and gave 
their approval. Finally as I envisioned, discussed and shared, after rebasing 
my changes on the recent community code I created formal pull request - 
https://github.com/apache/bookkeeper/pull/1201 and also updated this issue with 
formal description of entry log per ledger design which I?ve been discussing 
all along.
   
   Agreed, I could have updated this issue description with the new design back 
then itself. But I?m not sure if it would have made difference since this was 
communicated/explained multiple times in the community meets and particularly 
to the interested people multiple times. For the people who are coming across 
this work item for the first time, it is needed/required when I created the 
formal pull request and hence I made sure I updated issue with full description 
(new design) before sending the Pull Request. I can say community has changed a 
lot in the amount of activity, processes, systems used and the list of 
participants in the last year (2017). This Work Item has spanned during this 
transition phase and I see why you might have different opinion, considering 
you weren?t present in any of the above mentioned conversations/communications. 
But I see you are the one who migrated jira work item to git issues, but I?m 
not sure if it was automated task or well curated manual task. 
   
   Hope this answers ?when/where was this discussed part?.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, pleas

[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-02-26 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-368597767
 
 
   Hey Ivan.. Sijie, JV and our internal team had long conversation about the 
same and I responded to community aswell in one of the mail
   
   Initially when we designed to implement this feature by providing config for 
number of entrylogs per ledgerdir and because of that the logic was convoluted 
with slot mapping management and Sijie felt that it is complicated and 
questioned about it. But now with entrylog per ledger, the management logic in 
entrylogger is simpler and straightforward. The same amount of complication 
will be applicable even in the case of composition at LedgerStorage level. When 
we get addEntry at ComposedLedgerStorage there should be logic to which 
LedgerStorage it should goto. So with this fact I can say that composing at 
LedgerStorage level it will be atleast as complicated as composing at 
EntryLogger level.
   
   with the ComposedLedgerStorage approach (Composite Pattern), 
ComposedLedgerStorage managing N InterleavedLedgerStorage/SortedLedgerStorages, 
there is going to be considerable change in the way the resources and state 
information are handled. Each Interleaved/SortedLedgerStorage will have its own 
EntryLogger (to serve our MultipleEntryLogs feature), but rest all needs to be 
shared, mainly 'LedgerCache', 'gcThread' and 'activeLedgers'. So there is going 
to be major churn in the code for moving the operations dealing with shared 
resources and state to ComposedLedgerStorage and leave the rest in 
InterleavedLedgerStorage. Amount of changes required in testcode to deal with 
these changes is even more. We have to do this while providing backward 
compatibility (Single EntryLog). Now this should make one question where should 
the composition happen. As far as I can say, it is supposed to happen at the 
lowest possible level rather than at higher level, which needs extra band-aid 
efforts to deal separately for the common resources/state and multiplexable 
resources.
   
   with composition at EntryLogger level, currently in my implementation 
getEntry/readEntry path, index manager and GC components are unchanged. but 
with composition at LedgerStorage even for readEntry/getEntry the 
multiplexing/redirection needs to happen. This is dealbreaker for me. Things 
get very messy with SortedLedgerSotrage entries in Memtable when Ledgerdir of 
the entryLog becomes full (we have to rotate current entrylog and create new 
one in writableledgerdir) while simulataneous writes and read are happening. 
Also this needs instance of LedgerStorage class for each entryLog in readpath 
and corresponding EntryMemTable if it is SortedLedgerStorage, though Memtable 
is not needed in read case. Most importantly we need to provide this feature 
with backward compatibility (we should be able to read previous data) and also 
provide existing behaviour (single entrylog) with config option.
   
   and mainly LedgerStorage is not the only consumer of EntryLogger, but also 
GarbageCollectorThread. It calls quite a few methods of EntryLogger - 
(entryLogger.addEntry, entryLogger.flush, entryLogger.removeEntryLog, 
entryLogger.scanEntryLog, entryLogger.getLeastUnflushedLogId, 
entryLogger.logExists, entryLogger.getEntryLogMetadata). It is going to be an 
issue with ComposedLedgerStorage, because with that EntryLogger is going to be 
responsible for just one EntryLog. For GarbageCollectorThread to work correctly 
with multiple EntryLogs, composition is required here as well. Which is double 
whammy when it comes down to implementation.
   
   Anyhow the only posiible shortcoming i see with multiplexing at Entrylogger 
level is that we are having single MemTable in the case of SortedLedgerStorage, 
but we mitigated that issue by doing parallel flushes. In terms of complexity 
and code churn if we go with multiplexing at ledgerstorage class level it will 
be several times higher than the current implementation with not much benefits.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-02-23 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-368185454
 
 
   **Design Overview** 
   
   **Entrylog per Ledger**
   
   As name suggests, in this approach in a bookie we are going to have active 
entrylog dedicated to active ledger. But it is not a strict enforcement of 
one-to-one mapping of ledger to entrylog in a bookie. Strict one-to-one 
relationship between ledger and entrylog in a bookie is not possible for 
multiple reasons (ledgerdir might become full, entrylog might reach its 
capacity, segmentation/replication can happen, possibility of bookie crash,..). 
Besides, once  entrylog is rotated  it cannt be reopened. Since while rotating 
entrylog file, EntryLogger appends the ledger map at the end of the entry log 
and updates the entrylog file  with the offset and size of the map. It is like 
sealing the entrylog file. And EntryLogger maintains a pointer called 
'leastUnflushedLogId', which specifies the least  entrylogid which is not yet 
rotated and closed and GC considers all the entrylogs with logid lesser than 
'leastUnflushedLogId' (entrylogids are sequential numbers) are eligible for 
compaction/garbagecollection. In summary once the entrylog is rotated and 
closed we need to maintain immutable semantics on entrylog file.
   
So instead we can provide relaxed constraint where an entrylog is 
dedicated/committed to a ledger but not otherway around. So in most cases there 
would be just one entrylog for a ledger in a bookie, but in situations like 
when entrylog reaches capacity it is rotated and new one is created for that 
ledger, when ledgerdir is full all the entrylogs in that ledgerdir are rotated 
and new ones are created for those ledgers, because of segmentation and 
replication various segments of ledger might end up in different entrylogs and 
because of a bookie crash while replaying the journal new entrylog will be 
created for the leftout entries in the journal., entries of a ledger might end 
up in different entrylogs in a bookie. To summarize briefly about this approach.
   
   
   - is to have server configuration specifying entrylogperledger is enabled
   - for the previous behavior (one active entrylog) that config can be set to 
false
   - when entrylogger receives addEntry call, it needs to know the entry log 
for the current ledger
   -  so EntryLogger needs to maintain state information of mapping of ledgerId 
to entrylogid. If the in-memory map doesn't contain entry for the ledger, then 
EntryLogger will create a new Entrylog and add the mapping of ledgerId to 
EntryLog.
   - for creation of new entrylog, EntryLogger will pick writable ledgerdir 
with least number of active entrylogs
   - if entrylog reaches the capacity, then it will be rotated and new entrylog 
will be assigned to that ledger and mapping will be updated
   - If a ledgerdir becomes full, then all the entrylogs in that ledgerdir, 
should be rotated. New EntryLogs should be created in the available writable 
ledgerdirs for those ledgers and the mapping should be updated. 
   - when ledgerdir becomes writable again that ledgerdir should become 
eligible for creation of new entrylogs
   - Currently Bookie is not informed about the writeclose of the ledger, so 
there needs to be a way to know when to remove the mapping entry from the map 
and rotate the entrylog. One simple way to handle it is to use cache (Guava 
Cache library) with timebased eviction policy (on last access) and as part of 
removal listener we can rotate the corresponding entrylog.
   - Time based eviction policy is simple to provide, but untill entrylog file 
is rotated and flushed, filehandles of entrylogs are kept open and it wont 
become eligible for compaction/garbage collection. So explicit writeclose call 
from client to bookies ensemble of that ledger is needed for better handling of 
entrylogs.
   - Both the time based eviction and removal policy and explicit writeclose 
call are required because not in all cases explicit write close calls to 
bookies are guaranteed, like during ensemble change of ledger, client crash and 
unreliable write close protocol. Advisory Write Close is explained below in 
detail.
   - For this feature I need to make changes to checkpoint logic. Currently 
with BOOKKEEPER-564 change, we are scheduling checkpoint only when current 
entrylog file is rotated. So we dont call 'flushCurrentLog' when we checkpoint. 
But for this feature, since there are going to be multiple active entrylogs, 
scheduling checkpoint when entrylog file is rotated, is not an option. So I 
need to call flushCurrentLogs when checkpoint is made for every 'flushinterval' 
period
   - With entrylogperledger feature we are not changing format of the entrylog 
contents in anyway, so it should be possible to switch entrylogperledger 
configuration back and forth.
   
   **Advisory Write Close implementation details**
   
   - Advisory write close messag

[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

2018-02-23 Thread GitBox
reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-368185000
 
 
   For this work item, there are two possible approaches
   
   1) is to have configurable number of active entrylogs per ledgerdir
   2) active entrylog per ledger
   
   For the first approach I mentioned the design overview earlier. But we 
decided to go with the second approach for the following reasons
   
   There are couple of major benefits by going with entrylogperledger. The 
implementation would become simpler, because it is a complex logic to handle 
and maintain slotmap in configured number of Entrylogs per LedgerDir 
design/implementation and it needs to be thoroughly analyzed/tested from 
multi-thread perspective. But most importantly the garbagecollection component 
becomes much simpler because there is no need of compaction in entrylog per 
ledger approach. So there is huge gain in indirect performance because of 
absence of IO activity for compaction and huge improvement in space reclamation 
during compaction.



This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services