[GitHub] [incubator-pinot] mcvsubbu commented on issue #4914: [POC] By-passing deep-store requirement for Realtime segment completion

2020-04-06 Thread GitBox
mcvsubbu commented on issue #4914: [POC] By-passing deep-store requirement for 
Realtime segment completion
URL: https://github.com/apache/incubator-pinot/pull/4914#issuecomment-610097015
 
 
   @chenboat  I believe @Jackie-Jiang  suggested we don't go with another 
class, but with a table config instead. I somehow prefer another class, it also 
allows people to do their own segment commit and try multiple stores, or 
whatever.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on issue #4914: [POC] By-passing deep-store requirement for Realtime segment completion

2020-04-06 Thread GitBox
mcvsubbu commented on issue #4914: [POC] By-passing deep-store requirement for 
Realtime segment completion
URL: https://github.com/apache/incubator-pinot/pull/4914#issuecomment-610074062
 
 
   The config change in step 5 is because we want the segment split-committer 
to look like this:
   `
   doCommitStart();
   url = uploadSegmentToDeepStore();
   if (url == null) { // upload failed, or no deep store configured
 if (tableConfig.commitEvenIfUploadFailed) {
   segmentUrlForCommitEnd = PEER_URL
 } else {
   throw CommitFailedException
 }
   } else {
 segmentUrilForCommitEnd = url
   }
   doCommitEnd();
   `
   
   The config is needed in order to decide the action when upload returns null 
url.
   
   Another way to do it is to configure a new class in `StreamConfig` section 
and use that class to bypass deep-store. Them you can leave the main 
split-committer as is, and build another class that can be configured in.
   
   Either way, you need a config.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on issue #4914: [POC] By-passing deep-store requirement for Realtime segment completion

2020-04-05 Thread GitBox
mcvsubbu commented on issue #4914: [POC] By-passing deep-store requirement for 
Realtime segment completion
URL: https://github.com/apache/incubator-pinot/pull/4914#issuecomment-609453001
 
 
   Hi @jamesyfshao , we need to get this in without blocking you, so I suggest 
we proceed with small increments that can be easily reviewed by people. Please 
include me in all reviews.  Here are the steps I would follow. These will make 
sure that the feature is NOT enabled until the very end, and each can be tested 
independently -- either manually or via unit/integration tests added in the PR 
itself.
   
   1. PR for server to download a segment (be it realtime or offline). Download 
should happen either with http or https endpoints. Both should be possible, 
depending on the config supplied (at server level, so we can use 
indexingTableConfig for this). You can add a test in the integ test to download 
one existing realtime segment in LLRealtimeClusterInteg test. Use http or https 
downloader already available. Since this is a spec change, please include the 
spec (exact URI you plan to use, etc.) in your design doc, and send out an 
email to devs, and add in general slack channel
   
   2. Introduce an Uploader abstraction, that will upload a segment to either 
controller or deep store. Make sure this class is unit tested, tested to work 
with your deepstore, but is not used anywhere else yet (except for tests of 
course)
   
   3. Change the Split committer to use the uploader abstraction introduced, 
instead of current hard- oded controller-only. This PR maybe can go wth (2) 
depending on how but (2) gets.
   
   4. Introduce a peer downloader (but do not hook it up yet), based on either 
http or https downloader, depending on configuration. Maybe we dont even need 
this, so you can even skip this step. We can directly use http or https 
downloader
   
   This completes phase-1
   
   In phase-2, we will do:
   
   5. Intoduce a table config (inside stream config? or maybe in indexing 
config?) a boolean on whether to abort split commit if upload to deep store 
fails. This config is not used anywhere as yet. Since this is an architectural 
change, be sure to email dev mailing list, and also post in slack. Get 
consensus on where the config should go, and what exactly it means, naming, 
etc. These things can take time, and are not changeable once introduced.

   5. Change the controller to accept a new URL in the protocol (I suggest we 
insert the following when segment upload fails: 
"peer://peer/URI_INTRODUCED_IN_STEP_1". Or, simply the word in all caps "PEER". 
I like the previous option, because it is in URI format, and the controller can 
do one check with the URL decoder for the scheme. When the controller detects 
peer, it should not set the download URL in  the metadata (or, insert a null, 
but I prefer NO url  in metadata). Add unit tests to cover this case. Since the 
server never inserts PEER, we should be good so far.
   
   6. Change the server to include logic to fetch from peer (using http or 
https fetcher as configured) IF the fetch from deep store fails, include 
retries as discussed in phone conference. (i.e. try deep store, if. it fails, 
try peer, if that fails, try the whole sequence again, N times). Try not to 
hook up this logic as yet. Add unit tests
   
   7. Change realtime segment fixer in the controller to fix segments that are 
not there in deep store. Add integration tests.
   
   8.  Hook up logic in 6, and change the split committer to use the table 
config, add an integration test
   
   You are done at this point


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on issue #4914: [POC] By-passing deep-store requirement for Realtime segment completion

2020-02-12 Thread GitBox
mcvsubbu commented on issue #4914: [POC] By-passing deep-store requirement for 
Realtime segment completion
URL: https://github.com/apache/incubator-pinot/pull/4914#issuecomment-585384008
 
 
   Is this POC going to be merged, or are we going to split things as per the 
design doc?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org