[ https://issues.apache.org/jira/browse/KUDU-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16397537#comment-16397537 ]

Todd Lipcon commented on KUDU-2342:
-----------------------------------

For reference, here's the ksck report on this tablet:
{code}
Tablet b8431200388d486995a4426c88bc06a2 of table 'impala::tpch_30000_kudu.lineitem' is under-replicated: 1 replica(s) not RUNNING
  14b2404c50b540ae8957adff9a6c7548 (vd1336.halxg.cloudera.com:7050): RUNNING
  a260dca5a9c846e99cb621881a7b86b8 (vc1515.halxg.cloudera.com:7050): RUNNING [LEADER]
  e3fdd8da21a643aba21b7acdd6b17499 (va1038.halxg.cloudera.com:7050): TS unavailable
  f7376c96c6b64e7fa6a7bfc84fd0cd64 (vc1534.halxg.cloudera.com:7050): RUNNING [NONVOTER]

2 replicas' active configs differ from the master's.
  All the peers reported by the master and tablet servers are:
  A = 14b2404c50b540ae8957adff9a6c7548
  B = a260dca5a9c846e99cb621881a7b86b8
  C = e3fdd8da21a643aba21b7acdd6b17499
  D = f7376c96c6b64e7fa6a7bfc84fd0cd64

The consensus matrix is:
 Config source |        Replicas        | Current term | Config index | Committed?
---------------+------------------------+--------------+--------------+------------
 master        | A   B*  C   D~         |              |              | Yes
 A             | A   B*  C   D          | 1            | 1233         | No
 B             | A   B*  C   D          | 1            | 1233         | No
 C             | [config not available] |              |              | 
 D             | A   B*  C   D~         | 1            | 1141         | Yes
Table impala::tpch_30000_kudu.lineitem has 1 under-replicated tablet(s)
{code}
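
To make the "2 replicas' active configs differ from the master's" line easier to check against the matrix, here is a minimal Python sketch. It is not part of ksck: the `ConfigRow` type and both helper functions are hypothetical names made up for illustration, and the rows are copied by hand from the report above. It compares only the replica lists, including the `*` and `~` markers (which line up with the [LEADER] and [NONVOTER] replicas in the list above), and ignores term, index, and committed status.

```python
from typing import List, NamedTuple, Optional


class ConfigRow(NamedTuple):
    """One row of the consensus matrix (hypothetical type, not a ksck API)."""
    source: str
    replicas: Optional[str]    # None when the config is not available
    term: Optional[int]
    index: Optional[int]
    committed: Optional[bool]


# Rows transcribed by hand from the ksck report above.
ROWS = [
    ConfigRow("master", "A B* C D~", None, None, True),
    ConfigRow("A", "A B* C D", 1, 1233, False),
    ConfigRow("B", "A B* C D", 1, 1233, False),
    ConfigRow("C", None, None, None, None),
    ConfigRow("D", "A B* C D~", 1, 1141, True),
]


def differing_sources(rows: List[ConfigRow]) -> List[str]:
    """Sources whose available replica list differs from the master's."""
    master = next(r for r in rows if r.source == "master")
    return [
        r.source
        for r in rows
        if r.source != "master"
        and r.replicas is not None
        and r.replicas.split() != master.replicas.split()
    ]


def unavailable_sources(rows: List[ConfigRow]) -> List[str]:
    """Sources that did not report a config at all."""
    return [r.source for r in rows if r.replicas is None]


print(differing_sources(ROWS))    # ['A', 'B']
print(unavailable_sources(ROWS))  # ['C']
```

With these rows, A and B are the two sources that disagree with the master: both list D without the `~` marker in an uncommitted config at index 1233, while D itself still matches the master's view at index 1141.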

It would be nice if ksck could report some info on opid indexes too, but that's 
a separate improvement.

> Insert into Lineitem table with 1340 tablets on 129 node cluster failed with 
> "Failed to write batch "
> -----------------------------------------------------------------------------------------------------
>
>                 Key: KUDU-2342
>                 URL: https://issues.apache.org/jira/browse/KUDU-2342
>             Project: Kudu
>          Issue Type: Bug
>          Components: tablet
>    Affects Versions: 1.7.0
>            Reporter: Mostafa Mokhtar
>            Assignee: Alexey Serbin
>            Priority: Blocker
>              Labels: scalability
>         Attachments: Impala query profile.txt, tablet-info.html
>
>
> While loading TPCH 30TB on a 129-node cluster via Impala, the write operation 
> failed with:
>     Query Status: Kudu error(s) reported, first error: Timed out: Failed to 
> write batch of 38590 ops to tablet b8431200388d486995a4426c88bc06a2 after 1 
> attempt(s): Failed to write to server: a260dca5a9c846e99cb621881a7b86b8 
> (vc1515.halxg.cloudera.com:7050): Write RPC to X.X.X.X:7050 timed out after 
> 180.000s (SENT)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
