liu jing created KUDU-3358:
------------------------------

             Summary: kudu tables fail to insert and scan when k8s network 
changes
                 Key: KUDU-3358
                 URL: https://issues.apache.org/jira/browse/KUDU-3358
             Project: Kudu
          Issue Type: Bug
    Affects Versions: 1.10.0
            Reporter: liu jing


h3. Description

When Kubernetes (k8s) manages the network for a Kudu cluster, any event that 
changes the Kudu pods' IP addresses, such as a k8s restart, causes inserts and 
scans on Kudu tables to fail.
h3. Steps to reproduce

The problem can be reproduced with Impala as the client.

1. First, the original k8s service network looks like figure1:

 
{panel:title=figure1}
service-kudu-test01-entry      ClusterIP   10.98.78.224     <none>   8051/TCP,8050/TCP,7051/TCP,7050/TCP   2d22h
service-kudu-test01-master-0   ClusterIP   10.109.78.49     <none>   7051/TCP,8051/TCP,20051/TCP           2d22h
service-kudu-test01-master-1   ClusterIP   10.98.28.69      <none>   7051/TCP,8051/TCP,20051/TCP           2d22h
service-kudu-test01-master-2   ClusterIP   10.105.180.113   <none>   7051/TCP,8051/TCP,20051/TCP           2d22h
{color:#FF0000}service-kudu-test01-tserver-0  ClusterIP   10.106.224.20    <none>   7050/TCP,8050/TCP,20050/TCP           2d22h{color}
{color:#FF0000}service-kudu-test01-tserver-1  ClusterIP   10.110.69.131    <none>   7050/TCP,8050/TCP,20050/TCP           2d22h{color}
service-kudu-test01-tserver-2  ClusterIP   10.108.30.59     <none>   7050/TCP,8050/TCP,20050/TCP           2d22h
{panel}
 

 

2. Second, use Impala to create a table named *testTable*.

3. Then, restart the pod services with the following commands:

 
{code:bash}
# Delete the resources defined in the manifest, then recreate them.
kubectl delete --force -f ${dirname}/xx.yaml
kubectl apply --force -f ${dirname}/xx.yaml
{code}
This assigns new ClusterIPs to the Kudu services, as shown in figure2:

 

 

 
{panel:title=figure2}
service-kudu-test01-entry      ClusterIP   10.108.85.55     <none>   8051/TCP,8050/TCP,7051/TCP,7050/TCP   2m22s
service-kudu-test01-master-0   ClusterIP   10.96.245.192    <none>   7051/TCP,8051/TCP,20051/TCP           2m22s
service-kudu-test01-master-1   ClusterIP   10.105.96.68     <none>   7051/TCP,8051/TCP,20051/TCP           2m22s
service-kudu-test01-master-2   ClusterIP   10.103.221.65    <none>   7051/TCP,8051/TCP,20051/TCP           2m22s
{color:#FF0000}service-kudu-test01-tserver-0  ClusterIP   10.101.128.27    <none>   7050/TCP,8050/TCP,20050/TCP           2m22s{color}
{color:#FF0000}service-kudu-test01-tserver-1  ClusterIP   10.111.9.225     <none>   7050/TCP,8050/TCP,20050/TCP           2m22s{color}
service-kudu-test01-tserver-2  ClusterIP   10.104.26.31     <none>   7050/TCP,8050/TCP,20050/TCP           2m22s
{panel}
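
The change between figure1 and figure2 can be made explicit by diffing the two listings. The following is a minimal Python sketch and not part of the original reproduction; the helper names {{parse_services}} and {{changed_ips}} are hypothetical, and it assumes each row has the six whitespace-separated columns shown in the figures. Only the tserver rows are copied here.

```python
def parse_services(listing):
    """Map service name -> ClusterIP for rows shaped like the figures."""
    services = {}
    for line in listing.strip().splitlines():
        fields = line.split()
        # Columns: name, type, cluster IP, external IP, ports, age.
        if len(fields) == 6 and fields[1] == "ClusterIP":
            services[fields[0]] = fields[2]
    return services


def changed_ips(before, after):
    """Return {name: (old_ip, new_ip)} for services whose ClusterIP differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# Tablet server rows copied from figure1 and figure2.
FIGURE1 = """
service-kudu-test01-tserver-0 ClusterIP 10.106.224.20 <none> 7050/TCP,8050/TCP,20050/TCP 2d22h
service-kudu-test01-tserver-1 ClusterIP 10.110.69.131 <none> 7050/TCP,8050/TCP,20050/TCP 2d22h
service-kudu-test01-tserver-2 ClusterIP 10.108.30.59 <none> 7050/TCP,8050/TCP,20050/TCP 2d22h
"""
FIGURE2 = """
service-kudu-test01-tserver-0 ClusterIP 10.101.128.27 <none> 7050/TCP,8050/TCP,20050/TCP 2m22s
service-kudu-test01-tserver-1 ClusterIP 10.111.9.225 <none> 7050/TCP,8050/TCP,20050/TCP 2m22s
service-kudu-test01-tserver-2 ClusterIP 10.104.26.31 <none> 7050/TCP,8050/TCP,20050/TCP 2m22s
"""

diff = changed_ips(parse_services(FIGURE1), parse_services(FIGURE2))
for name, (old_ip, new_ip) in sorted(diff.items()):
    print(f"{name}: {old_ip} -> {new_ip}")
```

Every tserver service in the two figures ends up with a different ClusterIP, which is the precondition for the failure in the next step.
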
4. Then, use Impala to scan the table {*}testTable{*}:

 

 
{code:sql}
select * from testTable
{code}
The Impala client then returns an error:

 

 
{code:java}
[service-impala-test01-server-0:21000] default> select * from testTable;
Query: select * from testTable
Query submitted at: 2022-03-07 15:13:04 (Coordinator: 
http://service-impala-test01-server-0:25000)
Query progress can be monitored at: 
http://service-impala-test01-server-0:25000/query_plan?query_id=c84e8a34795ca311:953d6fd800000000
ERROR: Unable to open scanner for node with id '0' for Kudu table 
'impala::default.testTable': Timed out: exceeded configured scan timeout of 
180.000s: after 3 scan attempts: Client connection negotiation failed: client 
connection to 10.110.69.131:7050: Timeout exceeded waiting to connect: Network 
error: Client connection negotiation failed: client connection to 
10.106.224.20:7050: connect: Connection refused (error 111) {code}
From this error log, we can see that the Kudu master returns old tserver IPs to 
the Impala client: 10.110.69.131 and 10.106.224.20 are the addresses of 
tserver-1 and tserver-0 in *figure1*. These IPs are no longer reachable, so 
the Impala scan fails.
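
The check against figure1 can also be done mechanically. The sketch below is illustrative only: it pulls the {{ip:port}} pairs out of the tail of the error message above and looks them up in the tserver addresses copied from figure1 and figure2. The function names {{addresses_in_error}} and {{classify}} are hypothetical.

```python
import re

# Tablet server ClusterIPs copied from figure1 (before) and figure2 (after).
OLD_TSERVER_IPS = {
    "10.106.224.20": "tserver-0",
    "10.110.69.131": "tserver-1",
    "10.108.30.59": "tserver-2",
}
NEW_TSERVER_IPS = {
    "10.101.128.27": "tserver-0",
    "10.111.9.225": "tserver-1",
    "10.104.26.31": "tserver-2",
}

# Tail of the Impala error message, joined into one string.
ERROR = (
    "Client connection negotiation failed: client connection to "
    "10.110.69.131:7050: Timeout exceeded waiting to connect: Network error: "
    "Client connection negotiation failed: client connection to "
    "10.106.224.20:7050: connect: Connection refused (error 111)"
)


def addresses_in_error(message):
    """Return the distinct (ip, port) pairs that follow 'connection to'."""
    found = re.findall(r"connection to (\d{1,3}(?:\.\d{1,3}){3}):(\d+)", message)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys((ip, int(port)) for ip, port in found))


def classify(ip):
    """Say whether an IP belongs to the old or the new tserver network."""
    if ip in NEW_TSERVER_IPS:
        return f"current address of {NEW_TSERVER_IPS[ip]}"
    if ip in OLD_TSERVER_IPS:
        return f"stale address of {OLD_TSERVER_IPS[ip]}"
    return "unknown"


for ip, port in addresses_in_error(ERROR):
    print(f"{ip}:{port} -> {classify(ip)}")
```

Both addresses in the message match figure1 entries and neither appears in figure2.
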

 

5. With the new network in place, use Impala to create a new table 
{*}testTable2{*}. The creation succeeds, but any insert or select on 
{*}testTable2{*} returns the same error:
{code:java}
ERROR: Unable to open scanner for node with id '0' for Kudu table 
'impala::default.testTable2': Timed out: exceeded configured scan timeout of 
180.000s: after 3 scan attempts: Client connection negotiation failed: client 
connection to 10.110.69.131:7050: Timeout exceeded waiting to connect: Network 
error: Client connection negotiation failed: client connection to 
10.106.224.20:7050: connect: Connection refused (error 111)  {code}
This indicates that the Kudu master still uses the old tserver addresses, even 
for the new table.
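
To confirm which tserver addresses are actually reachable, a plain TCP probe is enough. This is a minimal sketch rather than anything Kudu-specific: {{can_connect}} is a hypothetical helper, the 3-second default timeout is an arbitrary assumption, and a successful connect only shows that something is listening on the port, not that it is a healthy tablet server.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections (error 111 on Linux) and timeouts.
        return False
```

Run from a host inside the cluster network, calling {{can_connect("10.106.224.20", 7050)}} and {{can_connect("10.101.128.27", 7050)}} would compare the figure1 and figure2 addresses of tserver-0.
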

 
h3. Workaround

If Kudu uses the local machine's network instead of the k8s-managed one, the 
problem does not occur.

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
