[jira] [Commented] (KUDU-3064) client_symbol-test failed on aarch64 server

2020-03-26 Thread huangtianhua (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068209#comment-17068209
 ] 

huangtianhua commented on KUDU-3064:


Fixed by https://gerrit.cloudera.org/#/c/15419/

> client_symbol-test failed on aarch64 server
> ---
>
> Key: KUDU-3064
> URL: https://issues.apache.org/jira/browse/KUDU-3064
> Project: Kudu
> Issue Type: Sub-task
> Reporter: huangtianhua
> Assignee: huangtianhua
> Priority: Major
> Fix For: 1.12.0
>
>
> I tested Kudu on an aarch64 server based on 
> https://gerrit.cloudera.org/#/c/14964/, and the client_symbol-test test 
> failed with the following output:
> Found nm: /usr/bin/nm
> Found kudu client library: ./../lib/exported/libkudu_client.so
> Found bad symbol '_ULaarch64_dwarf_find_debug_frame'
> Found bad symbol '_ULaarch64_dwarf_search_unwind_table'
> Found bad symbol '_ULaarch64_get_reg'
> Found bad symbol '_ULaarch64_init_local'
> Found bad symbol '_ULaarch64_init_local2'
> Found bad symbol '_ULaarch64_is_signal_frame'
> Found bad symbol '_ULaarch64_resume'
> Found bad symbol '_ULaarch64_step'
> Found bad symbol '_Uaarch64_flush_cache'
> Found bad symbol '_Uaarch64_get_accessors'
> Found bad symbol '_Uaarch64_get_elf_image'
> Found bad symbol '_Uaarch64_get_exe_image_path'
> Found bad symbol '_Uaarch64_is_fpreg'
> ..
> Kudu client library contains 13 bad symbols
> I then ran `nm` manually on x86_64 and found that the libunwind symbols 
> listed above have a different symbol type there than on aarch64. For example:
> on aarch64:
>   004d8fb8 T _ULaarch64_get_reg /opt/kudu/thirdparty/src/libunwind-1.3.1/src/mi/Gget_reg.c:29
> on x86_64:
>   004c60e0 t _ULx86_64_get_reg /opt/kudu/thirdparty/src/libunwind-1.3.1/src/mi/Gget_reg.c:29
> I am not familiar with this area and don't know why the symbol type 
> differs between x86_64 and aarch64. Maybe the logic of 
> client_symbol-test.sh should be modified for aarch64 to avoid the bad symbols?
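
The `T` versus `t` difference in the `nm` output above can be illustrated with a small sketch. It is not the logic of client_symbol-test.sh; the function names and the `_U` prefix filter are assumptions made for illustration only. It relies on the GNU nm convention that an uppercase type letter usually denotes a global (external) symbol and a lowercase letter a local one.

```python
# Sketch only: classify `nm` output lines by symbol type.
# Assumes whitespace-separated "address type name [file:line]" lines,
# as in the output quoted above; undefined symbols (no address) are skipped.

def parse_nm_line(line):
    """Return (address, type_letter, name), or None if the line does not fit."""
    parts = line.split()
    if len(parts) < 3 or len(parts[1]) != 1:
        return None
    return parts[0], parts[1], parts[2]


def is_global(type_letter):
    # In GNU nm an uppercase letter usually means a global (external) symbol
    # and a lowercase letter a local one. A few lowercase letters
    # ('u', 'v', 'w') are exceptions and are not handled here.
    return type_letter.isupper()


def global_symbols(nm_output, prefixes=("_U",)):
    """Names of global symbols whose name starts with one of `prefixes`."""
    found = []
    for line in nm_output.splitlines():
        parsed = parse_nm_line(line)
        if parsed is None:
            continue
        _, type_letter, name = parsed
        if is_global(type_letter) and name.startswith(prefixes):
            found.append(name)
    return found


# The two lines quoted in the report, joined onto single lines.
SAMPLE = """\
004d8fb8 T _ULaarch64_get_reg /opt/kudu/thirdparty/src/libunwind-1.3.1/src/mi/Gget_reg.c:29
004c60e0 t _ULx86_64_get_reg /opt/kudu/thirdparty/src/libunwind-1.3.1/src/mi/Gget_reg.c:29
"""

print(global_symbols(SAMPLE))  # ['_ULaarch64_get_reg']
```

Only the aarch64 symbol is reported, because it is the one marked `T` (global) in the quoted output; the x86_64 symbol is marked `t` (local).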



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-3082) tablets in "CONSENSUS_MISMATCH" state for a long time

2020-03-26 Thread Alexey Serbin (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068181#comment-17068181
 ] 

Alexey Serbin commented on KUDU-3082:
-

[~zhangyifan27], do you have any idea what might have led to this situation?  
Did anything specific happen to the cluster?  I'm trying to come up with a 
reproduction scenario for this.  Any hint would be useful.  Thanks!

> tablets in "CONSENSUS_MISMATCH" state for a long time
> -
>
> Key: KUDU-3082
> URL: https://issues.apache.org/jira/browse/KUDU-3082
> Project: Kudu
> Issue Type: Bug
> Components: consensus
> Affects Versions: 1.10.1
> Reporter: YifanZhang
> Priority: Major
>
> We recently found that a few tablets in one of our clusters were unhealthy. 
> The ksck output looks like this:
>  
> {code:java}
> Tablet Summary
> Tablet 7404240f458f462d92b6588d07583a52 of table '' is conflicted: 3 
> replicas' active configs disagree with the leader master's
>   7380d797d2ea49e88d71091802fb1c81 (kudu-ts26): RUNNING
>   d1952499f94a4e6087bee28466fcb09f (kudu-ts25): RUNNING
>   47af52df1adc47e1903eb097e9c88f2e (kudu-ts27): RUNNING [LEADER]
> All reported replicas are:
>   A = 7380d797d2ea49e88d71091802fb1c81
>   B = d1952499f94a4e6087bee28466fcb09f
>   C = 47af52df1adc47e1903eb097e9c88f2e
>   D = 08beca5ed4d04003b6979bf8bac378d2
> The consensus matrix is:
>  Config source | Replicas       | Current term | Config index | Committed?
> ---------------+----------------+--------------+--------------+------------
>  master        | A   B   C*     |              |              | Yes
>  A             | A   B   C*     | 5            | -1           | Yes
>  B             | A   B   C      | 5            | -1           | Yes
>  C             | A   B   C*  D~ | 5            | 54649        | No
> Tablet 6d9d3fb034314fa7bee9cfbf602bcdc8 of table '' is conflicted: 2 
> replicas' active configs disagree with the leader master's
>   d1952499f94a4e6087bee28466fcb09f (kudu-ts25): RUNNING
>   47af52df1adc47e1903eb097e9c88f2e (kudu-ts27): RUNNING [LEADER]
>   5a8aeadabdd140c29a09dabcae919b31 (kudu-ts21): RUNNING
> All reported replicas are:
>   A = d1952499f94a4e6087bee28466fcb09f
>   B = 47af52df1adc47e1903eb097e9c88f2e
>   C = 5a8aeadabdd140c29a09dabcae919b31
>   D = 14632cdbb0d04279bc772f64e06389f9
> The consensus matrix is:
>  Config source | Replicas       | Current term | Config index | Committed?
> ---------------+----------------+--------------+--------------+------------
>  master        | A   B*  C      |              |              | Yes
>  A             | A   B*  C      | 5            | 5            | Yes
>  B             | A   B*  C   D~ | 5            | 96176        | No
>  C             | A   B*  C      | 5            | 5            | Yes
> Tablet bf1ec7d693b94632b099dc0928e76363 of table '' is conflicted: 1 
> replicas' active configs disagree with the leader master's
>   a9eaff3cf1ed483aae84954d649a (kudu-ts23): RUNNING
>   f75df4a6b5ce404884313af5f906b392 (kudu-ts19): RUNNING
>   47af52df1adc47e1903eb097e9c88f2e (kudu-ts27): RUNNING [LEADER]
> All reported replicas are:
>   A = a9eaff3cf1ed483aae84954d649a
>   B = f75df4a6b5ce404884313af5f906b392
>   C = 47af52df1adc47e1903eb097e9c88f2e
>   D = d1952499f94a4e6087bee28466fcb09f
> The consensus matrix is:
>  Config source | Replicas       | Current term | Config index | Committed?
> ---------------+----------------+--------------+--------------+------------
>  master        | A   B   C*     |              |              | Yes
>  A             | A   B   C*     | 1            | -1           | Yes
>  B             | A   B   C*     | 1            | -1           | Yes
>  C             | A   B   C*  D~ | 1            | 2            | No
> Tablet 3190a310857e4c64997adb477131488a of table '' is conflicted: 3 
> replicas' active configs disagree with the leader master's
>   47af52df1adc47e1903eb097e9c88f2e (kudu-ts27): RUNNING [LEADER]
>   f0f7b2f4b9d344e6929105f48365f38e (kudu-ts24): RUNNING
>   f75df4a6b5ce404884313af5f906b392 (kudu-ts19): RUNNING
> All reported replicas are:
>   A = 47af52df1adc47e1903eb097e9c88f2e
>   B = f0f7b2f4b9d344e6929105f48365f38e
>   C = f75df4a6b5ce404884313af5f906b392
>   D = d1952499f94a4e6087bee28466fcb09f
> The consensus matrix is:
>  Config source | Replicas       | Current term | Config index | Committed?
> ---------------+----------------+--------------+--------------+------------
>  master        | A*  B   C      |              |              | Yes
>  A             | A*  B   C   D~ | 1            | 1991         | No
>  B             | A*  B   C      | 1            | 4            | Yes
>  C             | A*  B   C      | 1            | 4            | Yes{code}
> These tablets did not recover for a couple of days, until we restarted 
> kudu-ts27.
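
One way to read the consensus matrices above is to pick out the config sources that report an uncommitted config. The following is a minimal sketch, not part of Kudu or ksck: the function names are hypothetical, and it assumes that `*` marks the leader and `~` a non-voter replica in the `Replicas` column. It splits rows on `|`, so it does not depend on column alignment.

```python
# Sketch only: parse one ksck "consensus matrix" and list the config sources
# whose reported config is not committed. All names here are hypothetical.

def parse_consensus_matrix(text):
    """Map each config source to its replicas, term, index and commit flag."""
    rows = {}
    for line in text.splitlines():
        # Drop any email quote prefix, then split the row into its five cells.
        cells = [c.strip() for c in line.lstrip("> ").split("|")]
        if len(cells) != 5 or cells[0] == "Config source":
            continue  # header, separator, or unrelated line
        source, replicas, term, index, committed = cells
        rows[source] = {
            "replicas": replicas.split(),
            "term": term,
            "index": index,
            "committed": committed == "Yes",
        }
    return rows


def pending_sources(rows):
    """Config sources that report an uncommitted config."""
    return [src for src, row in rows.items() if not row["committed"]]


def non_voters(row):
    """Replica labels marked with '~' (assumed to mean non-voter)."""
    return [r.rstrip("~") for r in row["replicas"] if r.endswith("~")]


# Matrix of the first tablet (7404240f458f462d92b6588d07583a52) from the report.
MATRIX = """\
 Config source | Replicas | Current term | Config index | Committed?
---+--+--+--+
 master| A   B   C*   |  |  | Yes
 A | A   B   C*   | 5| -1   | Yes
 B | A   B   C| 5| -1   | Yes
 C | A   B   C*  D~   | 5| 54649| No
"""

rows = parse_consensus_matrix(MATRIX)
print(pending_sources(rows))   # ['C']
print(non_voters(rows["C"]))   # ['D']
```

For this tablet the only source with an uncommitted config is C, which the replica list maps to 47af52df1adc47e1903eb097e9c88f2e on kudu-ts27, and that config is the one that adds D. The other three matrices in the report show the same pattern for the replica hosted on kudu-ts27.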

[jira] [Commented] (KUDU-3082) tablets in "CONSENSUS_MISMATCH" state for a long time

2020-03-26 Thread YifanZhang (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067541#comment-17067541
 ] 

YifanZhang commented on KUDU-3082:
--

[~aihai] This seems to be a different problem: what I encountered was not a 
checksum error but a consensus mismatch error.

> tablets in "CONSENSUS_MISMATCH" state for a long time
> -
>
> Key: KUDU-3082
> URL: https://issues.apache.org/jira/browse/KUDU-3082
> Project: Kudu
> Issue Type: Bug
> Components: consensus
> Affects Versions: 1.10.1
> Reporter: YifanZhang
> Priority: Major
>
> I found many duplicated log entries on kudu-ts27, like:
> {code:java}
> I0314 04:38:41.511279