[ 
https://issues.apache.org/jira/browse/FLUME-962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240255#comment-13240255
 ] 

[email protected] commented on FLUME-962:
-----------------------------------------------------



bq.  On 2012-03-28 06:03:54, Arvind Prabhakar wrote:
bq.  > Thanks for incorporating some of the previous feedback Hari. 
bq.  > 
bq.  > I did notice though that some of my feedback was not incorporated - 
which is OK as long as we discuss and agree to do so on the review. The best 
way to discuss this is by responding to individual comments with your feedback 
so that I know what to expect when you update the diffs. Without any comments 
from you, I expect that all of the feedback has been incorporated which is 
misleading.  Conversely, if on every diff I have to do full review - then that 
poses a scalability challenge for the reviewer as you can see the number of 
diff increments that have gone into this issue alone.
bq.  > 
bq.  > That said, I am putting marker comments on the items that were 
previously raised in the review and were not discussed or addressed. Please let 
me know your thoughts.
bq.  > 
bq.  >

Interestingly, I had actually added a bunch of comments to your review. I am 
not sure why you cannot see them, so I will add them again here. Sorry for the 
inconvenience; I will try not to make the same mistake again. 


bq.  On 2012-03-28 06:03:54, Arvind Prabhakar wrote:
bq.  > flume-ng-sdk/src/main/java/org/apache/flume/api/FailoverRpcClient.java, 
line 180
bq.  > <https://reviews.apache.org/r/4380/diff/21/?file=97183#file97183line180>
bq.  >
bq.  >     Previous request.

The only reason I did not add logging for these is that they are not exactly 
error conditions: the client is still OK, and only one of the hosts has 
failed. I believe adding a logger.info call would be fine.


bq.  On 2012-03-28 06:03:54, Arvind Prabhakar wrote:
bq.  > flume-ng-sdk/src/main/java/org/apache/flume/api/FailoverRpcClient.java, 
lines 287-299
bq.  > <https://reviews.apache.org/r/4380/diff/21/?file=97183#file97183line287>
bq.  >
bq.  >     Previous request.

I had added this in the previous review itself. Here is the reasoning. It is 
also mentioned in the code as a comment.

The logic is essentially what is explained above (Mike's comments). Anyway, 
here it is:

    Suppose we are currently connected to host number i and host i fails. We 
then check all the hosts from i+1 to hosts.size()-1. If all of those fail, we 
check from host 0 to host i. If none of these is available, we throw an 
exception. This assumes that any host that fails might come back alive at some 
point, so we must go back and check every host to see if we can connect.
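
To make the probe order concrete, here is a minimal standalone sketch of the 
rotation described above. It is not the code from the patch: the class name 
FailoverOrderSketch, the nextHost method and the canConnect predicate are all 
hypothetical, and IllegalStateException only stands in for whatever exception 
the real client throws when no host is reachable.

```java
import java.util.function.IntPredicate;

/** Illustrative sketch only; not the actual FailoverRpcClient implementation. */
public final class FailoverOrderSketch {

    private FailoverOrderSketch() {}

    /**
     * Picks the next host after host {@code failed} has failed.
     * Probe order: failed+1 .. hostCount-1, then 0 .. failed, so the failed
     * host itself is retried last in case it has come back.
     *
     * @param failed     index of the host that just failed
     * @param hostCount  total number of configured hosts (must be positive)
     * @param canConnect hypothetical probe: true if a connection to that index succeeds
     * @return index of the first host that accepted a connection
     * @throws IllegalStateException if every host was probed and none accepted
     */
    public static int nextHost(int failed, int hostCount, IntPredicate canConnect) {
        for (int step = 1; step <= hostCount; step++) {
            // Modulo wraps the scan from the last host back to host 0.
            int candidate = (failed + step) % hostCount;
            if (canConnect.test(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("No host available after probing all " + hostCount);
    }
}
```

With four hosts and host 1 failing, this probes hosts 2, 3, 0 and finally 1 
again, and stops at the first one that accepts a connection.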


- Hari


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4380/#review6473
-----------------------------------------------------------


On 2012-03-28 05:22:34, Hari Shreedharan wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4380/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-03-28 05:22:34)
bq.  
bq.  
bq.  Review request for Flume.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Submitting an initial cut of FailoverRpcClient that uses the 
NettyRpcClient under the hood. In this version, host selection is not exactly 
the best; please make suggestions on how to improve it. As discussed in the 
JIRA, the first version will not have a backoff mechanism (for example, one 
that avoids retrying a failed host for a fixed time). I will add unit tests 
soon.
bq.  
bq.  Note that the actual "connect" call to a host is hidden from the 
FailoverClient (by the Netty client or any other implementation we may choose 
to use later). Because this connect call is hidden, a failure to create a 
client (the build function throwing an exception) is not considered a failure. 
Only a failure to append is considered a failure and counted towards the 
maximum number of tries. The same holds for any implementation of the 
RpcClient interface. If we want a connect failure to be counted as well, we 
should move the connect call into the append function and keep track of the 
connection state internally, rather than expect code that depends on an 
RpcClient implementation (including other clients built on pre-existing 
clients) to know that a build call also creates a connection. This is exactly 
like a socket implementation: creating a new socket does not initialize a 
connection; that is done explicitly.
bq.  
bq.  
bq.  This addresses bug FLUME-962.
bq.      https://issues.apache.org/jira/browse/FLUME-962
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    flume-ng-sdk/src/main/java/org/apache/flume/api/AbstractRpcClient.java 
PRE-CREATION 
bq.    flume-ng-sdk/src/main/java/org/apache/flume/api/FailoverRpcClient.java 
PRE-CREATION 
bq.    flume-ng-sdk/src/main/java/org/apache/flume/api/NettyAvroRpcClient.java 
965b2ff 
bq.    flume-ng-sdk/src/main/java/org/apache/flume/api/RpcClient.java a601213 
bq.    flume-ng-sdk/src/main/java/org/apache/flume/api/RpcClientFactory.java 
351b5b1 
bq.    flume-ng-sdk/src/test/java/org/apache/flume/api/RpcTestUtils.java 
93bfee9 
bq.    
flume-ng-sdk/src/test/java/org/apache/flume/api/TestFailoverRpcClient.java 
PRE-CREATION 
bq.    
flume-ng-sdk/src/test/java/org/apache/flume/api/TestNettyAvroRpcClient.java 
a33e9c8 
bq.    
flume-ng-sdk/src/test/java/org/apache/flume/api/TestRpcClientFactory.java 
0c94231 
bq.  
bq.  Diff: https://reviews.apache.org/r/4380/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Unit tests added for the new functionality
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Hari
bq.  
bq.


                
> Failover capability for Client SDK
> ----------------------------------
>
>                 Key: FLUME-962
>                 URL: https://issues.apache.org/jira/browse/FLUME-962
>             Project: Flume
>          Issue Type: Sub-task
>    Affects Versions: v1.0.0
>            Reporter: Kathleen Ting
>            Assignee: Hari Shreedharan
>             Fix For: v1.2.0
>
>         Attachments: FLUME-962-2.patch, FLUME-962-2.patch, FLUME-962-3.patch, 
> FLUME-962-3.patch, FLUME-962-4.patch, FLUME-962-5.patch, FLUME-962-6.patch, 
> FLUME-962-rebased-1.patch, FLUME-962-rebased-4.patch
>
>
> Need a client SDK for Flume that will allow clients to be able to failover 
> from one source to another in case the first agent is not available. This 
> will help in keeping client implementations developed outside of the project 
> decoupled from internal details of HA implementation within Flume.

