[ https://issues.apache.org/jira/browse/PHOENIX-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341590#comment-16341590 ]

ASF GitHub Bot commented on PHOENIX-4130:
-----------------------------------------

Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/290#discussion_r164220749
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/hbase/index/write/IndexWriterUtils.java ---
    @@ -70,13 +70,13 @@
        public static final String HTABLE_KEEP_ALIVE_KEY = "hbase.htable.threads.keepalivetime";
    
        public static final String INDEX_WRITER_RPC_RETRIES_NUMBER = "phoenix.index.writes.rpc.retries.number";
    -   /**
    -    * Based on the logic in HBase's AsyncProcess, a default of 11 retries with a pause of 100ms
    -    * approximates 48 sec total retry time (factoring in backoffs).  The total time should be less
    -    * than HBase's rpc timeout (default of 60 sec) or else the client will retry before receiving
    -    * the response
    -    */
    -   public static final int DEFAULT_INDEX_WRITER_RPC_RETRIES_NUMBER = 11;
    +    /**
    +     * Retry server-server index write rpc only once, and let the client retry the data write
    +     * instead to avoid tying up the handler
    +     */
    +   // note in HBase 2+, numTries = numRetries + 1
    +   // in prior versions, numTries = numRetries
    +   public static final int DEFAULT_INDEX_WRITER_RPC_RETRIES_NUMBER = 1;
    --- End diff ---
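    A minimal sketch, for illustration only, of how a retry count like the one above might be mapped onto the HBase client retry setting. HBaseConfiguration and HConstants are standard HBase APIs; the helper class and method are hypothetical and not part of the patch.
    
        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.HConstants;
    
        // Hypothetical helper: copy the Phoenix index-writer retry setting onto
        // the HBase client retry knob used for the server-to-server index write.
        final class IndexWriterRetrySketch {
            static Configuration withIndexWriterRetries(Configuration base) {
                Configuration conf = HBaseConfiguration.create(base);
                int retries = conf.getInt("phoenix.index.writes.rpc.retries.number", 1);
                // Per the note in the diff: in HBase 2+ the client makes
                // numRetries + 1 total tries, so 1 means one initial attempt plus
                // a single retry; in prior versions it means a single try overall.
                conf.setInt(HConstants.HBASE_CLIENT_RETRIES_NUMBER, retries);
                return conf;
            }
        }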
    
    I was thinking it'd be ok if the onus was on the operator. I think we're in the minority in having a mix of old/new clients with zero downtime. If you think it's worth checking the client version (or checking whether the upgrade has happened yet) and can come up with a reasonable way of doing that, that's fine too. Not sure it's worth the effort, though. Here are some ideas:
    
    * add client version to IndexMaintainer (which is protobuf-ed, so it'd be doable). Then use the "correct" IndexBuilder (or config?) based on the client version pulled out of the IndexMaintainer. If you have separate IndexBuilder instances, then you'd get duplicate thread pools to do the writing too, which isn't ideal. Maybe within the parallel writer you could use the correct config?
    * less dynamic, but check the timestamp of system.catalog to know what version you're at. Somewhat scary to have an RPC in the start path of the coprocessor, though, as if the system.catalog RS can't be reached, then you're kind of screwed (but maybe you already are anyway).
    * least dynamic - put onus on the operator and change to setIfUnset (sketched below).
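    A rough sketch of that last option, assuming the standard Hadoop Configuration API; the class and method names here are made up for illustration.
    
        import org.apache.hadoop.conf.Configuration;
    
        // Hypothetical: apply the new low-retry default only when the operator
        // has not configured the property explicitly, so clusters that still
        // need the old behavior can simply set the property themselves.
        final class IndexWriterDefaultsSketch {
            static void applyIndexWriterDefaults(Configuration conf) {
                conf.setIfUnset("phoenix.index.writes.rpc.retries.number", String.valueOf(1));
            }
        }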


> Avoid server retries for mutable indexes
> ----------------------------------------
>
>                 Key: PHOENIX-4130
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4130
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Lars Hofhansl
>            Assignee: Vincent Poon
>            Priority: Major
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-4130.v1.master.patch, 
> PHOENIX-4130.v2.master.patch, PHOENIX-4130.v3.master.patch
>
>
> Had some discussions with [~jamestaylor], [~samarthjain], and [~vincentpoon], 
> during which I suggested that we can possibly eliminate retry loops happening 
> at the server that cause the handler threads to be stuck potentially for 
> quite a while (at least multiple seconds to ride over common scenarios like 
> splits).
> Instead, we can do the retries at the Phoenix client.
> So:
> # The index updates are not retried on the server. (retries = 0)
> # A failed index update would set the failed index timestamp but leave the 
> index enabled.
> # The handler thread is now done; it throws an appropriate exception back to 
> the client.
> # The Phoenix client can now retry. When those retries fail, the index is 
> disabled (if the policy dictates that) and the exception is thrown back to its 
> caller.
> So no more waiting is needed on the server, handler threads are freed 
> immediately.
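
A self-contained sketch of the client-driven retry loop described above; the interface and method names are invented for illustration and are not Phoenix APIs.

    import java.util.concurrent.Callable;

    // Hypothetical illustration: the server no longer retries the index write,
    // so the Phoenix client drives the retries and only disables the index
    // (per the configured policy) once its own retries are exhausted.
    final class ClientRetrySketch {
        interface IndexFailurePolicy {
            void onRetriesExhausted(Exception cause);
        }

        static void writeWithClientRetries(Callable<Void> dataAndIndexWrite,
                                           IndexFailurePolicy policy,
                                           int maxRetries) throws Exception {
            for (int attempt = 0; ; attempt++) {
                try {
                    dataAndIndexWrite.call();          // server-side index retries = 0
                    return;
                } catch (Exception e) {
                    if (attempt >= maxRetries) {
                        policy.onRetriesExhausted(e);  // e.g. disable the index
                        throw e;                       // surface to the caller
                    }
                    // failed-index timestamp was already set on the server;
                    // fall through and retry from the client
                }
            }
        }
    }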



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
