[jira] [Commented] (KUDU-2952) TServers reporting replica stats may race with leadership change, hitting a DCHECK

2019-09-23 Thread HeLifu (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936319#comment-16936319
 ] 

HeLifu commented on KUDU-2952:
--

[~andrew.wong], could you please attach the full test output? Thanks in advance.

> TServers reporting replica stats may race with leadership change, hitting a 
> DCHECK
> --
>
> Key: KUDU-2952
> URL: https://issues.apache.org/jira/browse/KUDU-2952
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus, tserver
>Reporter: Andrew Wong
>Assignee: Andrew Wong
>Priority: Major
>
> I have a precommit that failed with:
> {code:java}
> F0924 00:08:46.821594  9670 catalog_manager.cc:4239] Check failed: 
> ts_desc->permanent_uuid() == report.consensus_state().leader_uuid() 
> *** Check failure stack trace: ***
> @ 0x7f5e442ea62d  google::LogMessage::Fail() at ??:0
> @ 0x7f5e442ec64c  google::LogMessage::SendToLog() at ??:0
> @ 0x7f5e442ea189  google::LogMessage::Flush() at ??:0
> @ 0x7f5e442ecfdf  google::LogMessageFatal::~LogMessageFatal() at ??:0
> @ 0x7f5e45d89a01  kudu::master::CatalogManager::ProcessTabletReport() 
> at ??:0
> @ 0x7f5e45e29ae7  kudu::master::MasterServiceImpl::TSHeartbeat() at 
> ??:0
> @ 0x7f5e41f29cbc  
> _ZZN4kudu6master15MasterServiceIfC1ERK13scoped_refptrINS_12MetricEntityEERKS2_INS_3rpc13ResultTrackerEEENKUlPKN6google8protobuf7MessageEPSE_PNS7_10RpcContextEE0_clESG_SH_SJ_
>  at ??:0
> @ 0x7f5e41f3068b  
> _ZNSt17_Function_handlerIFvPKN6google8protobuf7MessageEPS2_PN4kudu3rpc10RpcContextEEZNS6_6master15MasterServiceIfC1ERK13scoped_refptrINS6_12MetricEntityEERKSD_INS7_13ResultTrackerEEEUlS4_S5_S9_E0_E9_M_invokeERKSt9_Any_dataS4_S5_S9_
>  at ??:0
> @ 0x7f5e3fea909e  std::function<>::operator()() at ??:0
> @ 0x7f5e3fea88cf  kudu::rpc::GeneratedServiceIf::Handle() at ??:0
> @ 0x7f5e3feab3b6  kudu::rpc::ServicePool::RunThread() at ??:0
> @ 0x7f5e3feac785  boost::_mfi::mf0<>::operator()() at ??:0
> @ 0x7f5e3feac5ac  boost::_bi::list1<>::operator()<>() at ??:0
> @ 0x7f5e3feac493  boost::_bi::bind_t<>::operator()() at ??:0
> @ 0x7f5e3feac3c2  
> boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
> @ 0x7f5e44db28d2  boost::function0<>::operator()() at ??:0
> @ 0x7f5e44daf65b  kudu::Thread::SuperviseThread() at ??:0
> @ 0x7f5e41429184  start_thread at ??:0
> @ 0x7f5e438f4ffd  clone at ??:0 
> {code}
> Looking through the code, it looks like there's a kind of TOCTOU race going 
> on when generating reports.
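
For illustration, here is a minimal, hypothetical C++ sketch of that 
check-then-use shape (toy types, not Kudu's actual report path): the replica 
decides it is the leader, leadership changes concurrently, and the report is 
then built from the refreshed consensus state, so the master's check that the 
reporting tserver's UUID equals the reported leader_uuid fires.

{code:java}
#include <cassert>
#include <string>

struct ConsensusState {
  std::string leader_uuid;
};

int main() {
  const std::string my_uuid = "ts-A";        // hypothetical tserver UUID
  ConsensusState state{my_uuid};

  // Time of check: this replica believes it is the leader.
  const bool report_as_leader = (state.leader_uuid == my_uuid);

  // Leadership changes concurrently (e.g. a new election completes).
  state.leader_uuid = "ts-B";

  // Time of use: the report is filled in from the *current* state.
  if (report_as_leader) {
    // Toy equivalent of the DCHECK in CatalogManager::ProcessTabletReport().
    assert(my_uuid == state.leader_uuid);    // fires: "ts-A" != "ts-B"
  }
  return 0;
}
{code}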



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KUDU-2943) TsTabletManagerITest.TestTableStats flaky due to WAL/cmeta term disagreement

2019-09-23 Thread HeLifu (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935315#comment-16935315
 ] 

HeLifu edited comment on KUDU-2943 at 9/23/19 8:54 AM:
---

Please also take a look at lines 1489 and 1513 of the full test log in the 
attachment.


was (Author: helifu):
"Line 1489" refers to the line 1489 of full test log in the attachment.

> TsTabletManagerITest.TestTableStats flaky due to WAL/cmeta term disagreement
> 
>
> Key: KUDU-2943
> URL: https://issues.apache.org/jira/browse/KUDU-2943
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus, test
>Affects Versions: 1.11.0
>Reporter: Adar Dembo
>Priority: Critical
> Attachments: ts_tablet_manager-itest.txt
>
>
> This new test failed in a strange (and worrying) way:
> {noformat}
> /home/jenkins-slave/workspace/kudu-master/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc:753:
>  Failure
> Failed
> Bad status: Corruption: Unable to start RaftConsensus: The last op in the WAL 
> with id 3.4 has a term (3) that is greater than the latest recorded term, 
> which is 2
> {noformat}
> From a brief dig through the code, looks like this means the current term as 
> per the on-disk cmeta file is older than the term in the latest WAL op.
> I can believe that this is somehow due to InternalMiniCluster exercising 
> clean shutdown paths that aren't well tested or robust, but it'd be nice to 
> determine that with certainty.
> I've attached the full test log.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KUDU-2943) TsTabletManagerITest.TestTableStats flaky due to WAL/cmeta term disagreement

2019-09-23 Thread HeLifu (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935313#comment-16935313
 ] 

HeLifu edited comment on KUDU-2943 at 9/23/19 8:51 AM:
---

If we step down a tablet leader, the leader's term will be increased by 1 but 
not persisted:
https://github.com/apache/kudu/blob/ee22ddcc734ab4947218c670d5cfddd61fe90fbb/src/kudu/consensus/raft_consensus.cc#L570
Then, after a successful election, one of the followers becomes the new leader 
and its term is increased by 1 too.
The term is durable for the new leader, but not for the old one. This is the 
root cause:
https://github.com/apache/kudu/blob/ee22ddcc734ab4947218c670d5cfddd61fe90fbb/src/kudu/consensus/raft_consensus.cc#L1138

So, the StepDown API is not safe.

{code:java}
// code placeholder
tablet: ac74b319ad54416685f8b9d9506e1d61
        f42c56                               c2c8be                    eea10e
           |                                    |                         |
           |                             start election                   |
           |                                   WON                        |
           |                               leader(1,0)                    |
         (1,0)                                  |                       (1,0)
           |                                NO_OP(1,1)                    |
         (1,1)                                  |                       (1,1)
           |                          Write some Rows(1,2)                |
         (1,2)                                  |                       (1,2)
           |                **StepDown(1/2,2)[term 2 is not durable]      |
  start election(1/2,2)                         |                         |
           |                                    |              start election(1/2,2)
          WON                                   |                         |
           |                                    |                        FAIL
  leader(2,2)[term 2 is durable]                |
           |                         (2,2)[term 2 is durable]
      NO_OP(2,3)                                |
           |                         (2,2)[not receive NO_OP]
**StepDown(2/3,3)[term 3 is not durable]"Line 570"
           |                         start election(2/3,2)
           |                                   WON
           |                      leader(3,2)[term 3 is durable]
           |                                    |
           |                               NO_OP(3,3)
  (3,3)[term 3 is not durable]"Line 1138"       |
           |                          alter schema(3,4)
  (3,4)[term 3 is not durable]                  |
           |                                    |
           |                          [restart masters]
           |                          [restart tservers]
           |
**Reboot tablet failed since term is 2 in consensus metadata, opid is (3,4) in WAL
{code}
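
To make the asymmetry concrete, here is a minimal C++ sketch (toy types under 
the assumptions above, not Kudu's cmeta code): StepDown bumps the in-memory 
term without flushing it, ops then get stamped with the new term, and after a 
restart the WAL is ahead of the durable term.

{code:java}
#include <cstdint>
#include <iostream>

// Toy stand-in for the consensus metadata (cmeta) file.
struct ConsensusMetadata {
  int64_t current_term = 2;   // in-memory term
  int64_t flushed_term = 2;   // term that is durable on disk
  void Flush() { flushed_term = current_term; }  // fsync() in real life
};

int main() {
  ConsensusMetadata cmeta;

  // StepDown path (cf. raft_consensus.cc#L570): the term advances in memory
  // only; Flush() is never called.
  cmeta.current_term++;  // term 3

  // The replica later appends ops stamped with the new term (cf. #L1138),
  // still without flushing cmeta: op (3,4) lands in the WAL.
  const int64_t last_wal_op_term = cmeta.current_term;

  // Crash/restart here: the WAL is ahead of the durable term.
  std::cout << "WAL term " << last_wal_op_term
            << " vs durable cmeta term " << cmeta.flushed_term << "\n";
  // Kudu then refuses to open the tablet: "The last op in the WAL with id 3.4
  // has a term (3) that is greater than the latest recorded term, which is 2".
  return 0;
}
{code}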


was (Author: helifu):
I think the term 3 for f42c56 is not durable. That means the StepDown API is 
not safe.
{code:java}
// code placeholder
tablet: ac74b319ad54416685f8b9d9506e1d61
        f42c56                               c2c8be                    eea10e
           |                                    |                         |
           |                             start election                   |
           |                                    |                         |
         (1,0)                             leader(1,0)                  (1,0)
           |                                    |                         |
         (1,1)                              NO_OP(1,1)                  (1,1)
           |                                    |                         |
         (1,2)                       Write some Rows(1,2)               (1,2)
           |                                    |                         |
           |                            StepDown(2, 2)                    |
  start election(2,2)                           |                         |
           |                                    |               start election(2,2)
          WIN                                   |                         |
           |                                    |                        FAIL
  leader(2,2)[term is durable]                  |
           |                                    |
  NO_OP(2,3)[no sync]                (2,2)[not receive NO_OP]
           |                                    |
  StepDown(3,3)[term is not durable]"Line 1489" |
           |                         start election(3,2)
           |                                    |
           |                                   WIN
           |                                    |
           |                              leader(3,2)
           |                                    |
           |                               NO_OP(3,3)
           |                                    |
           |                          alter schema(3,4)
  (3,4)[term is not durable, op in WAL]         |
           |                                    |
                     restart masters
                     restart tservers
{code}
 

> TsTabletManagerITest.TestTableStats flaky due to WAL/cmeta term disagreement
> 
>
> Key: KUDU-2943
> URL: https://issues.apache.org/jira/browse/KUDU-2943
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus, test
>Affects Versions: 1.11.0
>Reporter: Adar Dembo
>Priority: Critical
> Attachments: ts_tablet_manager-itest.txt
>
>
> This new test failed in a strange (and worrying) way:
> {noformat}
> /home/jenkins-slave/workspace/kudu-master/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc:753:
>  Failure
> Failed
> Bad status: Corruption: Unable to start RaftConsensus: The last op in the WAL 
> with id 3.4 has a term (3) that is greater than the latest recorded term, 
> which is 2
> {noformat}
> From a brief dig through the code, looks like this means the current term as 
> per the on-disk cmeta file is older than the term in the latest WAL op.
> I can believe that this is somehow due to InternalMiniCluster exercising 
> clean shutdown paths that aren't well tested or robust, but it'd be nice to 
> determine that with certainty.
> I've attached the full test log.



--
This message was sent by Atlassian Jira

[jira] [Comment Edited] (KUDU-2943) TsTabletManagerITest.TestTableStats flaky due to WAL/cmeta term disagreement

2019-09-23 Thread HeLifu (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935313#comment-16935313
 ] 

HeLifu edited comment on KUDU-2943 at 9/23/19 8:52 AM:
---

If we step down a tablet leader, the leader's term will be increased by 1 but 
not persisted:
[https://github.com/apache/kudu/blob/ee22ddcc734ab4947218c670d5cfddd61fe90fbb/src/kudu/consensus/raft_consensus.cc#L570]
Then, after a successful election, one of the followers becomes the new leader 
and its term is increased by 1 too.
The term is durable for the new leader, but not for the old one. This is the 
root cause:
[https://github.com/apache/kudu/blob/ee22ddcc734ab4947218c670d5cfddd61fe90fbb/src/kudu/consensus/raft_consensus.cc#L1138]

So, the StepDown API is not safe.

{code:java}
// code placeholder
tablet: ac74b319ad54416685f8b9d9506e1d61
        f42c56                               c2c8be                    eea10e
           |                                    |                         |
           |                             start election                   |
           |                                   WON                        |
           |                               leader(1,0)                    |
         (1,0)                                  |                       (1,0)
           |                                NO_OP(1,1)                    |
         (1,1)                                  |                       (1,1)
           |                          Write some Rows(1,2)                |
         (1,2)                                  |                       (1,2)
           |                **StepDown(1/2,2)[term 2 is not durable]      |
  start election(1/2,2)                         |                         |
           |                                    |              start election(1/2,2)
          WON                                   |                         |
           |                                    |                        FAIL
  leader(2,2)[term 2 is durable]                |
           |                         (2,2)[term 2 is durable]
      NO_OP(2,3)                                |
           |                         (2,2)[not receive NO_OP]
**StepDown(2/3,3)[term 3 is not durable]"Line 570"
           |                         start election(2/3,2)
           |                                   WON
           |                      leader(3,2)[term 3 is durable]
           |                                    |
           |                               NO_OP(3,3)
  (3,3)[term 3 is not durable]"Line 1138"       |
           |                          alter schema(3,4)
  (3,4)[term 3 is not durable]                  |
           |                                    |
           |                          [restart masters]
           |                          [restart tservers]
           |
**Reboot tablet failed since term is 2 in consensus metadata, opid is (3,4) in WAL
{code}


was (Author: helifu):
If we step down a tablet leader, the leader's term will be increased by 1 but 
not persisted:
https://github.com/apache/kudu/blob/ee22ddcc734ab4947218c670d5cfddd61fe90fbb/src/kudu/consensus/raft_consensus.cc#L570
Then, after a successful election, one of the followers becomes the new leader 
and its term is increased by 1 too.
The term is durable for the new leader, but not for the old one. This is the 
root cause:
https://github.com/apache/kudu/blob/ee22ddcc734ab4947218c670d5cfddd61fe90fbb/src/kudu/consensus/raft_consensus.cc#L1138

So, the StepDown API is not safe.

{code:java}
// code placeholder
tablet: ac74b319ad54416685f8b9d9506e1d61
        f42c56                               c2c8be                    eea10e
           |                                    |                         |
           |                             start election                   |
           |                                   WON                        |
           |                               leader(1,0)                    |
         (1,0)                                  |                       (1,0)
           |                                NO_OP(1,1)                    |
         (1,1)                                  |                       (1,1)
           |                          Write some Rows(1,2)                |
         (1,2)                                  |                       (1,2)
           |                **StepDown(1/2,2)[term 2 is not durable]      |
  start election(1/2,2)                         |                         |
           |                                    |              start election(1/2,2)
          WON                                   |                         |
           |                                    |                        FAIL
  leader(2,2)[term 2 is durable]                |
           |                         (2,2)[term 2 is durable]
      NO_OP(2,3)                                |
           |                         (2,2)[not receive NO_OP]
**StepDown(2/3,3)[term 3 is not durable]"Line 570"
           |                         start election(2/3,2)
           |                                   WON
           |                      leader(3,2)[term 3 is durable]
           |                                    |
           |                               NO_OP(3,3)
  (3,3)[term 3 is not durable]"Line 1138"       |
           |                          alter schema(3,4)
  (3,4)[term 3 is not durable]                  |
           |                                    |
           |                          [restart masters]
           |                          [restart tservers]
           |
**Reboot tablet failed since term is 2 in consensus metadata, opid is (3,4) in WAL
{code}

> TsTabletManagerITest.TestTableStats flaky due to WAL/cmeta term disagreement
> 
>
> Key: KUDU-2943
> URL: https://issues.apache.org/jira/browse/KUDU-2943
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus, test
>Affects Versions: 1.11.0
>Reporter: 

[jira] [Commented] (KUDU-2943) TsTabletManagerITest.TestTableStats flaky due to WAL/cmeta term disagreement

2019-09-22 Thread HeLifu (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935315#comment-16935315
 ] 

HeLifu commented on KUDU-2943:
--

"Line 1489" refers to the line 1489 of full test log in the attachment.

> TsTabletManagerITest.TestTableStats flaky due to WAL/cmeta term disagreement
> 
>
> Key: KUDU-2943
> URL: https://issues.apache.org/jira/browse/KUDU-2943
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus, test
>Affects Versions: 1.11.0
>Reporter: Adar Dembo
>Priority: Critical
> Attachments: ts_tablet_manager-itest.txt
>
>
> This new test failed in a strange (and worrying) way:
> {noformat}
> /home/jenkins-slave/workspace/kudu-master/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc:753:
>  Failure
> Failed
> Bad status: Corruption: Unable to start RaftConsensus: The last op in the WAL 
> with id 3.4 has a term (3) that is greater than the latest recorded term, 
> which is 2
> {noformat}
> From a brief dig through the code, looks like this means the current term as 
> per the on-disk cmeta file is older than the term in the latest WAL op.
> I can believe that this is somehow due to InternalMiniCluster exercising 
> clean shutdown paths that aren't well tested or robust, but it'd be nice to 
> determine that with certainty.
> I've attached the full test log.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-2943) TsTabletManagerITest.TestTableStats flaky due to WAL/cmeta term disagreement

2019-09-22 Thread HeLifu (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935313#comment-16935313
 ] 

HeLifu commented on KUDU-2943:
--

I think the term 3 for f42c56 is not durable. That means the StepDown API is 
not safe.
{code:java}
// code placeholder
tablet: ac74b319ad54416685f8b9d9506e1d61
        f42c56                               c2c8be                    eea10e
           |                                    |                         |
           |                             start election                   |
           |                                    |                         |
         (1,0)                             leader(1,0)                  (1,0)
           |                                    |                         |
         (1,1)                              NO_OP(1,1)                  (1,1)
           |                                    |                         |
         (1,2)                       Write some Rows(1,2)               (1,2)
           |                                    |                         |
           |                            StepDown(2, 2)                    |
  start election(2,2)                           |                         |
           |                                    |               start election(2,2)
          WIN                                   |                         |
           |                                    |                        FAIL
  leader(2,2)[term is durable]                  |
           |                                    |
  NO_OP(2,3)[no sync]                (2,2)[not receive NO_OP]
           |                                    |
  StepDown(3,3)[term is not durable]"Line 1489" |
           |                         start election(3,2)
           |                                    |
           |                                   WIN
           |                                    |
           |                              leader(3,2)
           |                                    |
           |                               NO_OP(3,3)
           |                                    |
           |                          alter schema(3,4)
  (3,4)[term is not durable, op in WAL]         |
           |                                    |
                     restart masters
                     restart tservers
{code}
 

> TsTabletManagerITest.TestTableStats flaky due to WAL/cmeta term disagreement
> 
>
> Key: KUDU-2943
> URL: https://issues.apache.org/jira/browse/KUDU-2943
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus, test
>Affects Versions: 1.11.0
>Reporter: Adar Dembo
>Priority: Critical
> Attachments: ts_tablet_manager-itest.txt
>
>
> This new test failed in a strange (and worrying) way:
> {noformat}
> /home/jenkins-slave/workspace/kudu-master/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc:753:
>  Failure
> Failed
> Bad status: Corruption: Unable to start RaftConsensus: The last op in the WAL 
> with id 3.4 has a term (3) that is greater than the latest recorded term, 
> which is 2
> {noformat}
> From a brief dig through the code, looks like this means the current term as 
> per the on-disk cmeta file is older than the term in the latest WAL op.
> I can believe that this is somehow due to InternalMiniCluster exercising 
> clean shutdown paths that aren't well tested or robust, but it'd be nice to 
> determine that with certainty.
> I've attached the full test log.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KUDU-2942) A rare flaky test for the aggregated live row count

2019-09-17 Thread HeLifu (Jira)
HeLifu created KUDU-2942:


 Summary: A rare flaky test for the aggregated live row count
 Key: KUDU-2942
 URL: https://issues.apache.org/jira/browse/KUDU-2942
 Project: Kudu
  Issue Type: Bug
Reporter: HeLifu
 Attachments: ts_tablet_manager-itest.txt

A few days ago, Adar hit a rare flaky test failure for the live row count in TSAN mode.

 
{code:java}
// code placeholder
/home/jenkins-slave/workspace/kudu-master/3/src/kudu/integration-tests/ts_tablet_manager-itest.cc:642
      Expected: live_row_count
      Which is: 327
To be equal to: table_info->GetMetrics()->live_row_count->value()
      Which is: 654
{code}
It seems the metric value is doubled. His full test output is in the 
attachment.

 

I reviewed the previous patches and made some unusual guesses. I think one of 
them could explain the issue:

When a master has just become the leader and two heartbeat messages from the 
same tserver are processed in parallel at 
[Line 4239|https://github.com/apache/kudu/blob/1bdae88faefe9b0d43b6897d96cd853bc5dd7353/src/kudu/master/catalog_manager.cc#L4239],
the metric value will be doubled because the old tablet stats can be 
accessed concurrently.

Thus, the question becomes: how can two heartbeat messages from the same 
tserver be generated at the same time? The possible answer is: [First heartbeat 
message|https://github.com/apache/kudu/blob/1bdae88faefe9b0d43b6897d96cd853bc5dd7353/src/kudu/integration-tests/ts_tablet_manager-itest.cc#L741] 
and [Second heartbeat 
message|https://github.com/apache/kudu/blob/1bdae88faefe9b0d43b6897d96cd853bc5dd7353/src/kudu/integration-tests/ts_tablet_manager-itest.cc#L635].
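
As a toy illustration of the doubling (hypothetical variables, not the catalog 
manager's real data structures), assume both handlers snapshot the old tablet 
stats before either one writes the new value back:

{code:java}
#include <iostream>

int main() {
  int live_row_count_metric = 0;  // aggregated table metric on the master
  int old_tablet_stats = 0;       // last stats recorded for this tablet
  const int reported = 327;       // live rows reported by the tserver

  // Two heartbeats from the same tserver processed in parallel: both read
  // the old stats (0) before either one stores the new value (327).
  const int snapshot1 = old_tablet_stats;
  const int snapshot2 = old_tablet_stats;

  live_row_count_metric += reported - snapshot1;  // +327
  live_row_count_metric += reported - snapshot2;  // +327 again
  old_tablet_stats = reported;

  // Prints 654: exactly the doubled value in the failed assertion above.
  std::cout << live_row_count_metric << "\n";
  return 0;
}
{code}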

Please don't forget that the above case is in the integration test 
environment, not in production.


--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Assigned] (KUDU-2919) It's useful to support trash while drop partition/tables

2019-09-09 Thread HeLifu (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-2919:


Assignee: HeLifu

> It's useful to support trash while drop partition/tables
> 
>
> Key: KUDU-2919
> URL: https://issues.apache.org/jira/browse/KUDU-2919
> Project: Kudu
>  Issue Type: New Feature
>Reporter: HeLifu
>Assignee: HeLifu
>Priority: Major
>
> In order to shorten the recovery time of erroneously dropped partitions or 
> tables, it's useful to support trash functionality. For example, when we use 
> synchronization tools like Sqoop to synchronize data from a DBMS to Kudu, if 
> the synchronized table (or its partitions) is dropped unexpectedly, it will 
> take a long time to re-synchronize the data in full.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KUDU-2932) Unix domain socket could speed up data transmission

2019-08-28 Thread HeLifu (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918218#comment-16918218
 ] 

HeLifu commented on KUDU-2932:
--

Thank you for sharing the patch; the result is interesting^_^

My understanding of data locality is that we could read data directly without 
the TCP stack or serialization/deserialization overhead.
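
For reference, here is a minimal POSIX sketch of such a transport (the socket 
path is made up, and Kudu's RPC layer would of course need real support for 
this):

{code:java}
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

int main() {
  // AF_UNIX: kernel-local byte stream, no TCP/IP stack involved.
  int fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0) { perror("socket"); return 1; }

  sockaddr_un addr{};
  addr.sun_family = AF_UNIX;
  std::strncpy(addr.sun_path, "/tmp/kudu-tserver.sock",  // hypothetical path
               sizeof(addr.sun_path) - 1);

  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
    perror("bind");
    return 1;
  }
  listen(fd, 16);

  // accept()/read()/write() work as with TCP, but the data stays in local
  // socket buffers: no checksums, no congestion control, no loopback NIC.
  close(fd);
  unlink(addr.sun_path);
  return 0;
}
{code}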

> Unix domain socket could speed up data transmission
> ---
>
> Key: KUDU-2932
> URL: https://issues.apache.org/jira/browse/KUDU-2932
> Project: Kudu
>  Issue Type: New Feature
>Reporter: HeLifu
>Priority: Major
>
> Right now, Kudu doesn't support data locality. So I think it's useful to 
> support Unix domain sockets for deployments where the computing engine 
> (Impala/Spark) and the storage engine (Kudu) are mixed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Assigned] (KUDU-2224) Kudu Partition Dynamic Creation on Insertion

2019-08-28 Thread HeLifu (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-2224:


Assignee: HeLifu

> Kudu Partition Dynamic Creation on Insertion
> 
>
> Key: KUDU-2224
> URL: https://issues.apache.org/jira/browse/KUDU-2224
> Project: Kudu
>  Issue Type: New Feature
>Affects Versions: 1.4.0
>Reporter: Sailesh Patel
>Assignee: HeLifu
>Priority: Minor
>
> Option to specify a more simplistic directive for partitioning, whereby Kudu 
> will create partitions on the fly instead of requiring manual intervention 
> to create additional partitions, as described in:
>   https://kudu.apache.org/2016/08/23/new-range-partitioning-features.html
>   
>   
> https://kudu.apache.org/docs/kudu_impala_integration.html#partitioning_tables
>"Non-Covering Range Partitions"
>   
> +Requirement:+
>When creating the partitioning, a partitioning rule is specified, whereby the 
> granularity size is given and a new partition is created:
> - at insert time, when one does not exist for that value.
> e.g. proposal
> CREATE TABLE sample_table (ts TIMESTAMP, eventid BIGINT, somevalue STRING, 
> PRIMARY KEY(ts,eventid) )
> PARTITION BY 
> RANGE(ts) GRANULARITY= 864000 START = 11045376 
> STORED AS KUDU;
>- Maybe an optional END
>- The start is to show where the partition granularity builds from
> -
> Use case
> - time series data where timestamps arrive out of order, sometimes catching 
> up from years in the past, and with unpredictable timestamps. Event 
> information is a timestamp (say epoch nano or epoch millisecond) with 
> partitions based upon a range value of that timestamp (typically day or hour 
> granularity).
> Currently, we script up the creation of partitions in advance of the data we 
> receive, but if that fails for any reason the insert fails. Also, if we 
> receive unexpected data with a timestamp way in the past for which there is 
> no partition, the insert will fail.
> Opening this Jira enhancement for discussion.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Assigned] (KUDU-1994) Automatically Create New Range Partitions When Needed

2019-08-28 Thread HeLifu (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-1994:


Assignee: HeLifu

> Automatically Create New Range Partitions When Needed
> -
>
> Key: KUDU-1994
> URL: https://issues.apache.org/jira/browse/KUDU-1994
> Project: Kudu
>  Issue Type: Improvement
>Affects Versions: 1.3.0
>Reporter: Alan Jackoway
>Assignee: HeLifu
>Priority: Major
>  Labels: roadmap-candidate
>
> We have a few Kudu tables where we use a range-partitioned timestamp as part 
> of the key. The intention of this is to keep data locality for data that is 
> likely to be scanned together, such as events in a timeseries.
> Currently we create these with a partitions that look like this:
> {noformat}
> RANGE (ts) (
> PARTITION 0 <= VALUES < 142008840,
> PARTITION 142008840 <= VALUES < 142786080,
> PARTITION 142786080 <= VALUES < 143572320,
> PARTITION 143572320 <= VALUES < 144367200,
> PARTITION 144367200 <= VALUES < 145162440,
> PARTITION 145162440 <= VALUES < 145948320,
> PARTITION 145948320 <= VALUES < 146734560,
> PARTITION 146734560 <= VALUES < 147529440,
> PARTITION 147529440 <= VALUES < 148324680,
> PARTITION 148324680 <= VALUES < 149103360,
> PARTITION 149103360 <= VALUES < 149889600,
> PARTITION 149889600 <= VALUES < 150684480
> )
> {noformat}
> The problem is that as time goes on we have to choose to either create empty 
> partitions in advance of when we are writing data or risk forgetting to 
> create a partition and having writes of new data fail.
> Ideally, Kudu would have a way to indicate the size of the partitions (in 
> this example 3 months converted to milliseconds) and then automatically 
> create new partitions when new data comes in that needs the partition.
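
A sketch of the bucketing rule such a feature implies (assumed semantics based 
on the GRANULARITY/START proposal in KUDU-2224 above, not an implemented API): 
the partition that should hold a row is fully determined by the timestamp, the 
start, and the granularity, so the server could create it on the fly when it 
is missing.

{code:java}
#include <cstdint>
#include <iostream>

struct RangeBound { int64_t lower, upper; };  // [lower, upper)

// Derive the range partition that should contain ts. Assumes ts >= start.
RangeBound PartitionFor(int64_t ts, int64_t start, int64_t granularity) {
  const int64_t bucket = (ts - start) / granularity;
  const int64_t lower = start + bucket * granularity;
  return {lower, lower + granularity};
}

int main() {
  // Values from the KUDU-2224 proposal: GRANULARITY = 864000, START = 11045376.
  const RangeBound r = PartitionFor(/*ts=*/11910000, /*start=*/11045376,
                                    /*granularity=*/864000);
  // Prints: PARTITION 11909376 <= VALUES < 12773376
  std::cout << "PARTITION " << r.lower << " <= VALUES < " << r.upper << "\n";
  return 0;
}
{code}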



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (KUDU-2932) Unix domain socket could speed up data transmission

2019-08-28 Thread HeLifu (Jira)
HeLifu created KUDU-2932:


 Summary: Unix domain socket could speed up data transmission
 Key: KUDU-2932
 URL: https://issues.apache.org/jira/browse/KUDU-2932
 Project: Kudu
  Issue Type: New Feature
Reporter: HeLifu


Right now, Kudu doesn't support data locality. So I think it's useful to 
support Unix domain sockets for deployments where the computing engine 
(Impala/Spark) and the storage engine (Kudu) are mixed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Comment Edited] (KUDU-2516) Add NOT EQUAL predicate type

2019-08-25 Thread HeLifu (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16915443#comment-16915443
 ] 

HeLifu edited comment on KUDU-2516 at 8/26/19 3:28 AM:
---

Can we make a simplified version for the NOT_EQUAL predicate type under the 
existing framework?
 Example:
 # one predicate on one column:
 a) a != 3 YES
 # AND-ed predicates on one column:
 a) 1 <= a < 5 and a != 3 NOT (refuse)
 b) 1 <= a < 5 and a != 6 YES
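
A minimal sketch of this simplified rule (a hypothetical helper, not Kudu's 
ColumnPredicate code): the NOT_EQUAL value either falls outside the AND-ed 
range, in which case it is redundant and the range survives, or falls inside 
it, in which case the combination is refused.

{code:java}
#include <iostream>
#include <optional>

struct Range { int lower, upper; };  // [lower, upper)

// Returns the surviving range, or nullopt if the combination is refused.
std::optional<Range> MergeNotEqual(Range r, int not_equal_value) {
  if (not_equal_value < r.lower || not_equal_value >= r.upper) {
    return r;             // e.g. 1 <= a < 5 and a != 6: YES, drop the !=
  }
  return std::nullopt;    // e.g. 1 <= a < 5 and a != 3: refuse
}

int main() {
  std::cout << (MergeNotEqual({1, 5}, 6) ? "YES" : "refuse") << "\n";  // YES
  std::cout << (MergeNotEqual({1, 5}, 3) ? "YES" : "refuse") << "\n";  // refuse
  return 0;
}
{code}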


was (Author: helifu):
Can we make a simplified version for the NOT_EQUAL predicate type under the 
existing framework?
Example:
 # one predicate on one column:
a) a != 3 YES
 # AND-ed predicates on one column:
 a) 1 <= a < 5 and a != 3 NON (refuse)
 b) 1 <= a < 5 and a != 6 YES

> Add NOT EQUAL predicate type
> 
>
> Key: KUDU-2516
> URL: https://issues.apache.org/jira/browse/KUDU-2516
> Project: Kudu
>  Issue Type: Sub-task
>  Components: cfile, perf
>Affects Versions: 1.7.1
>Reporter: Mike Percy
>Assignee: HeLifu
>Priority: Major
>  Labels: roadmap-candidate
>
> Kudu currently does not have support for a NOT_EQUAL predicate type. This is 
> usually relevant when AND-ed together with other predicates.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KUDU-2516) Add NOT EQUAL predicate type

2019-08-25 Thread HeLifu (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16915443#comment-16915443
 ] 

HeLifu commented on KUDU-2516:
--

Can we make a simplified version for the NOT_EQUAL predicate type under the 
existing framework?
Example:
 # one predicate on one column:
a) a != 3 YES
 # AND-ed predicates on one column:
 a) 1 <= a < 5 and a != 3 NON (refuse)
 b) 1 <= a < 5 and a != 6 YES

> Add NOT EQUAL predicate type
> 
>
> Key: KUDU-2516
> URL: https://issues.apache.org/jira/browse/KUDU-2516
> Project: Kudu
>  Issue Type: Sub-task
>  Components: cfile, perf
>Affects Versions: 1.7.1
>Reporter: Mike Percy
>Assignee: HeLifu
>Priority: Major
>  Labels: roadmap-candidate
>
> Kudu currently does not have support for a NOT_EQUAL predicate type. This is 
> usually relevant when AND-ed together with other predicates.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (KUDU-2919) It's useful to support trash while drop partition/tables

2019-08-15 Thread HeLifu (JIRA)
HeLifu created KUDU-2919:


 Summary: It's useful to support trash while drop partition/tables
 Key: KUDU-2919
 URL: https://issues.apache.org/jira/browse/KUDU-2919
 Project: Kudu
  Issue Type: New Feature
Reporter: HeLifu


In order to shorten the recovery time of erroneously dropped partitions or 
tables, it's useful to support trash functionality. For example, when we use 
synchronization tools like Sqoop to synchronize data from a DBMS to Kudu, if 
the synchronized table (or its partitions) is dropped unexpectedly, it will 
take a long time to re-synchronize the data in full.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (KUDU-2803) support non-primary key type alteration

2019-08-09 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-2803:


Assignee: HeLifu

> support non-primary key type alteration
> ---
>
> Key: KUDU-2803
> URL: https://issues.apache.org/jira/browse/KUDU-2803
> Project: Kudu
>  Issue Type: Improvement
>Reporter: HeLifu
>Assignee: HeLifu
>Priority: Major
>
> We know that kudu does not allow the type of a column to be altered right 
> now. But indeed, a very common case is RDBMS -> kudu, and there will 
> inevitably be type alterations, especially from small to large types.
> Currently, there are two ways to work around this problem:
>  # for new tables: predefine a large type for every column;
>  # for existing tables: stop the app (writes) -> add a new column with the 
> large type -> copy data -> drop the old column -> rename the new column to 
> the old column -> start the app;
> However, neither of them is elegant. So, I think it is necessary to support 
> non-primary key type alteration.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (KUDU-2894) How to modify the time zone of web ui

2019-07-15 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16885136#comment-16885136
 ] 

HeLifu commented on KUDU-2894:
--

Hi, could you please share your schema and how you created this table?

> How to modify the time zone of web ui
> -
>
> Key: KUDU-2894
> URL: https://issues.apache.org/jira/browse/KUDU-2894
> Project: Kudu
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.8.0
>Reporter: wxmimperio
>Priority: Major
> Attachments: image-2019-07-15-19-32-42-041.png
>
>
> !image-2019-07-15-19-32-42-041.png!
> I created a partition with 2019-05-31 00:00:00 to 2019-06-01 00:00:00, but 
> it shows 2019-05-30 16:00:00 to 2019-05-31 16:00:00.
> That's an 8 hour difference.
> However, the data I inserted has the correct time; only the partition 
> display is off.
> How can I modify the time zone of the web interface?



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Closed] (KUDU-2892) tserver crashed while dropping range partition

2019-07-12 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu closed KUDU-2892.

Resolution: Duplicate

> tserver crashed while dropping range partition
> --
>
> Key: KUDU-2892
> URL: https://issues.apache.org/jira/browse/KUDU-2892
> Project: Kudu
>  Issue Type: Bug
>  Components: tablet
>Affects Versions: 1.9.0
>Reporter: HeLifu
>Priority: Major
> Attachments: tserver-INFO.log
>
>
> On one of our production clusters, a tserver crashed yesterday morning while 
> dropping a range partition, and below is the error message:
> {code:java}
> // code placeholder
> Log file created at: 2019/07/11 01:51:30
> Running on machine: kudu31.jd.163.org
> Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
> E0711 01:51:30.331185 11840 env_posix.cc:316] I/O error, context: 
> /mnt/dfs/0/kudu/tserver/data/data/9305dce18e6f4100b486b605617122b3.data
> E0711 01:51:30.337604 11840 data_dirs.cc:1120] Directory 
> /mnt/dfs/0/kudu/tserver/data/data marked as failed
> F0711 04:00:51.835958 68948 ts_tablet_manager.cc:940] Failed to delete tablet 
> data for 2278f736bf6548e2b773003c1ba7ed66: Invalid argument: Unable to delete 
> on-disk data from tablet 2278f736bf6548e2b773003c1ba7ed66: The metadata for 
> tablet 2278f736bf6548e2b773003c1ba7ed66 still references orphaned blocks. 
> Call DeleteTabletData() first
> {code}
> It seems this problem was caused by new orphaned blocks that were not 
> deleted after a disk was marked as bad. I attached an info log file about 
> tablet '2278f736bf6548e2b773003c1ba7ed66'. Our kudu version is 1.9.x 6a9cf4.
> For brevity, I made a quick generalization:
>  # 01:51:30.331185: bad disk /mnt/dfs/0 was detected
>  # 01:51:30.344581: failing tablet
>  # 01:51:30.870059: Initiating tablet copy
>  # 04:00:51.820354: Processing DeleteTablet
>  # 04:00:51.835958: Crashed.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (KUDU-2892) tserver crashed while dropping range partition

2019-07-12 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16884237#comment-16884237
 ] 

HeLifu commented on KUDU-2892:
--

[~andrew.wong] Exactly, it's the same as [~29283500]'s issue. OK, I will 
close it.

> tserver crashed while dropping range partition
> --
>
> Key: KUDU-2892
> URL: https://issues.apache.org/jira/browse/KUDU-2892
> Project: Kudu
>  Issue Type: Bug
>  Components: tablet
>Affects Versions: 1.9.0
>Reporter: HeLifu
>Priority: Major
> Attachments: tserver-INFO.log
>
>
> On one of our production clusters, a tserver crashed yesterday morning while 
> dropping a range partition, and below is the error message:
> {code:java}
> // code placeholder
> Log file created at: 2019/07/11 01:51:30
> Running on machine: kudu31.jd.163.org
> Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
> E0711 01:51:30.331185 11840 env_posix.cc:316] I/O error, context: 
> /mnt/dfs/0/kudu/tserver/data/data/9305dce18e6f4100b486b605617122b3.data
> E0711 01:51:30.337604 11840 data_dirs.cc:1120] Directory 
> /mnt/dfs/0/kudu/tserver/data/data marked as failed
> F0711 04:00:51.835958 68948 ts_tablet_manager.cc:940] Failed to delete tablet 
> data for 2278f736bf6548e2b773003c1ba7ed66: Invalid argument: Unable to delete 
> on-disk data from tablet 2278f736bf6548e2b773003c1ba7ed66: The metadata for 
> tablet 2278f736bf6548e2b773003c1ba7ed66 still references orphaned blocks. 
> Call DeleteTabletData() first
> {code}
> It seems this problem was caused by new orphaned blocks that were not 
> deleted after a disk was marked as bad. I attached an info log file about 
> tablet '2278f736bf6548e2b773003c1ba7ed66'. Our kudu version is 1.9.x 6a9cf4.
> For brevity, I made a quick generalization:
>  # 01:51:30.331185: bad disk /mnt/dfs/0 was detected
>  # 01:51:30.344581: failing tablet
>  # 01:51:30.870059: Initiating tablet copy
>  # 04:00:51.820354: Processing DeleteTablet
>  # 04:00:51.835958: Crashed.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (KUDU-2892) tserver crashed while dropping range partition

2019-07-12 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2892:
-
Description: 
On one of our production clusters, a tserver crashed yesterday morning while 
dropping a range partition, and below is the error message:
{code:java}
// code placeholder
Log file created at: 2019/07/11 01:51:30
Running on machine: kudu31.jd.163.org
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
E0711 01:51:30.331185 11840 env_posix.cc:316] I/O error, context: 
/mnt/dfs/0/kudu/tserver/data/data/9305dce18e6f4100b486b605617122b3.data
E0711 01:51:30.337604 11840 data_dirs.cc:1120] Directory 
/mnt/dfs/0/kudu/tserver/data/data marked as failed
F0711 04:00:51.835958 68948 ts_tablet_manager.cc:940] Failed to delete tablet 
data for 2278f736bf6548e2b773003c1ba7ed66: Invalid argument: Unable to delete 
on-disk data from tablet 2278f736bf6548e2b773003c1ba7ed66: The metadata for 
tablet 2278f736bf6548e2b773003c1ba7ed66 still references orphaned blocks. Call 
DeleteTabletData() first
{code}
It seems this problem was caused by new orphaned blocks that were not deleted 
after a disk was marked as bad. I attached an info log file about tablet 
'2278f736bf6548e2b773003c1ba7ed66'. Our kudu version is 1.9.x 6a9cf4.

For brevity, I made a quick generalization:
 # 01:51:30.331185: bad disk /mnt/dfs/0 was detected
 # 01:51:30.344581: failing tablet
 # 01:51:30.870059: Initiating tablet copy
 # 04:00:51.820354: Processing DeleteTablet
 # 04:00:51.835958: Crashed.

 

  was:
On one of our production clusters, a tserver crashed yesterday morning while 
dropping a range partition, and below is the error message:
{code:java}
// code placeholder
Log file created at: 2019/07/11 01:51:30
Running on machine: kudu31.jd.163.org
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
E0711 01:51:30.331185 11840 env_posix.cc:316] I/O error, context: 
/mnt/dfs/0/kudu/tserver/data/data/9305dce18e6f4100b486b605617122b3.data
E0711 01:51:30.337604 11840 data_dirs.cc:1120] Directory 
/mnt/dfs/0/kudu/tserver/data/data marked as failed
F0711 04:00:51.835958 68948 ts_tablet_manager.cc:940] Failed to delete tablet 
data for 2278f736bf6548e2b773003c1ba7ed66: Invalid argument: Unable to delete 
on-disk data from tablet 2278f736bf6548e2b773003c1ba7ed66: The metadata for 
tablet 2278f736bf6548e2b773003c1ba7ed66 still references orphaned blocks. Call 
DeleteTabletData() first
{code}
It seems this problem was caused by new orphaned blocks that were not deleted 
after a disk was marked as bad. I attached an info log file about tablet 
'2278f736bf6548e2b773003c1ba7ed66'.

For brevity, I made a quick generalization:
 # 01:51:30.331185: bad disk /mnt/dfs/0 was detected
 # 01:51:30.344581: failing tablet
 # 01:51:30.870059: Initiating tablet copy
 # 04:00:51.820354: Processing DeleteTablet
 # 04:00:51.835958: Crashed.

 


> tserver crashed while dropping range partition
> --
>
> Key: KUDU-2892
> URL: https://issues.apache.org/jira/browse/KUDU-2892
> Project: Kudu
>  Issue Type: Bug
>  Components: tablet
>Affects Versions: 1.9.0
>Reporter: HeLifu
>Priority: Major
> Attachments: tserver-INFO.log
>
>
> On one of our production clusters, a tserver crashed yesterday morning while 
> dropping a range partition, and below is the error message:
> {code:java}
> // code placeholder
> Log file created at: 2019/07/11 01:51:30
> Running on machine: kudu31.jd.163.org
> Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
> E0711 01:51:30.331185 11840 env_posix.cc:316] I/O error, context: 
> /mnt/dfs/0/kudu/tserver/data/data/9305dce18e6f4100b486b605617122b3.data
> E0711 01:51:30.337604 11840 data_dirs.cc:1120] Directory 
> /mnt/dfs/0/kudu/tserver/data/data marked as failed
> F0711 04:00:51.835958 68948 ts_tablet_manager.cc:940] Failed to delete tablet 
> data for 2278f736bf6548e2b773003c1ba7ed66: Invalid argument: Unable to delete 
> on-disk data from tablet 2278f736bf6548e2b773003c1ba7ed66: The metadata for 
> tablet 2278f736bf6548e2b773003c1ba7ed66 still references orphaned blocks. 
> Call DeleteTabletData() first
> {code}
> It seems this problem was caused by new orphaned blocks that were not 
> deleted after a disk was marked as bad. I attached an info log file about 
> tablet '2278f736bf6548e2b773003c1ba7ed66'. Our kudu version is 1.9.x 6a9cf4.
> For brevity, I made a quick generalization:
>  # 01:51:30.331185: bad disk /mnt/dfs/0 was detected
>  # 01:51:30.344581: failing tablet
>  # 01:51:30.870059: Initiating tablet copy
>  # 04:00:51.820354: Processing DeleteTablet
>  # 04:00:51.835958: Crashed.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (KUDU-2892) tserver crashed while dropping range partition

2019-07-12 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2892:
-
Description: 
On one of our production clusters, a tserver crashed yesterday morning while 
dropping a range partition, and below is the error message:
{code:java}
// code placeholder
Log file created at: 2019/07/11 01:51:30
Running on machine: kudu31.jd.163.org
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
E0711 01:51:30.331185 11840 env_posix.cc:316] I/O error, context: 
/mnt/dfs/0/kudu/tserver/data/data/9305dce18e6f4100b486b605617122b3.data
E0711 01:51:30.337604 11840 data_dirs.cc:1120] Directory 
/mnt/dfs/0/kudu/tserver/data/data marked as failed
F0711 04:00:51.835958 68948 ts_tablet_manager.cc:940] Failed to delete tablet 
data for 2278f736bf6548e2b773003c1ba7ed66: Invalid argument: Unable to delete 
on-disk data from tablet 2278f736bf6548e2b773003c1ba7ed66: The metadata for 
tablet 2278f736bf6548e2b773003c1ba7ed66 still references orphaned blocks. Call 
DeleteTabletData() first
{code}
It seems this problem was caused by new orphaned blocks that were not deleted 
after a disk was marked as bad. I attached an info log file about tablet 
'2278f736bf6548e2b773003c1ba7ed66'.

For brevity, I made a quick generalization:
 # 01:51:30.331185: bad disk /mnt/dfs/0 was detected
 # 01:51:30.344581: failing tablet
 # 01:51:30.870059: Initiating tablet copy
 # 04:00:51.820354: Processing DeleteTablet
 # 04:00:51.835958: Crashed.

 

  was:
On one of our production clusters, a tserver crashed yesterday morning while 
dropping a range partition, and below is the error message:

 
{code:java}
// code placeholder
Log file created at: 2019/07/11 01:51:30
Running on machine: kudu31.jd.163.org
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
E0711 01:51:30.331185 11840 env_posix.cc:316] I/O error, context: 
/mnt/dfs/0/kudu/tserver/data/data/9305dce18e6f4100b486b605617122b3.data
E0711 01:51:30.337604 11840 data_dirs.cc:1120] Directory 
/mnt/dfs/0/kudu/tserver/data/data marked as failed
F0711 04:00:51.835958 68948 ts_tablet_manager.cc:940] Failed to delete tablet 
data for 2278f736bf6548e2b773003c1ba7ed66: Invalid argument: Unable to delete 
on-disk data from tablet 2278f736bf6548e2b773003c1ba7ed66: The metadata for 
tablet 2278f736bf6548e2b773003c1ba7ed66 still references orphaned blocks. Call 
DeleteTabletData() first
{code}
It seems this problem was caused by new orphaned blocks that were not deleted 
after a disk was marked as bad. I have attached an info log file about tablet 
'2278f736bf6548e2b773003c1ba7ed66'. For brevity, let me make a quick 
generalization:
 # 01:51:30.331185: bad disk /mnt/dfs/0 was detected
 # 01:51:30.344581: failing tablet
 # 01:51:30.870059: Initiating tablet copy
 # 04:00:51.820354: Processing DeleteTablet
 # 04:00:51.835958: Crashed.

 


> tserver crashed while dropping range partition
> --
>
> Key: KUDU-2892
> URL: https://issues.apache.org/jira/browse/KUDU-2892
> Project: Kudu
>  Issue Type: Bug
>  Components: tablet
>Affects Versions: 1.9.0
>Reporter: HeLifu
>Priority: Major
> Attachments: tserver-INFO.log
>
>
> On one of our production clusters, a tserver crashed yesterday morning while 
> dropping a range partition, and below is the error message:
> {code:java}
> // code placeholder
> Log file created at: 2019/07/11 01:51:30
> Running on machine: kudu31.jd.163.org
> Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
> E0711 01:51:30.331185 11840 env_posix.cc:316] I/O error, context: 
> /mnt/dfs/0/kudu/tserver/data/data/9305dce18e6f4100b486b605617122b3.data
> E0711 01:51:30.337604 11840 data_dirs.cc:1120] Directory 
> /mnt/dfs/0/kudu/tserver/data/data marked as failed
> F0711 04:00:51.835958 68948 ts_tablet_manager.cc:940] Failed to delete tablet 
> data for 2278f736bf6548e2b773003c1ba7ed66: Invalid argument: Unable to delete 
> on-disk data from tablet 2278f736bf6548e2b773003c1ba7ed66: The metadata for 
> tablet 2278f736bf6548e2b773003c1ba7ed66 still references orphaned blocks. 
> Call DeleteTabletData() first
> {code}
> It seems this problem was caused by new orphaned blocks that were not 
> deleted after a disk was marked as bad. I attached an info log file about 
> tablet '2278f736bf6548e2b773003c1ba7ed66'.
> For brevity, I made a quick generalization:
>  # 01:51:30.331185: bad disk /mnt/dfs/0 was detected
>  # 01:51:30.344581: failing tablet
>  # 01:51:30.870059: Initiating tablet copy
>  # 04:00:51.820354: Processing DeleteTablet
>  # 04:00:51.835958: Crashed.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (KUDU-2892) tserver crashed while dropping range partition

2019-07-12 Thread HeLifu (JIRA)
HeLifu created KUDU-2892:


 Summary: tserver crashed while dropping range partition
 Key: KUDU-2892
 URL: https://issues.apache.org/jira/browse/KUDU-2892
 Project: Kudu
  Issue Type: Bug
  Components: tablet
Affects Versions: 1.9.0
Reporter: HeLifu
 Attachments: tserver-INFO.log

On one of our production clusters, a tserver crashed yesterday morning while 
dropping a range partition, and below is the error message:

 
{code:java}
// code placeholder
Log file created at: 2019/07/11 01:51:30
Running on machine: kudu31.jd.163.org
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
E0711 01:51:30.331185 11840 env_posix.cc:316] I/O error, context: 
/mnt/dfs/0/kudu/tserver/data/data/9305dce18e6f4100b486b605617122b3.data
E0711 01:51:30.337604 11840 data_dirs.cc:1120] Directory 
/mnt/dfs/0/kudu/tserver/data/data marked as failed
F0711 04:00:51.835958 68948 ts_tablet_manager.cc:940] Failed to delete tablet 
data for 2278f736bf6548e2b773003c1ba7ed66: Invalid argument: Unable to delete 
on-disk data from tablet 2278f736bf6548e2b773003c1ba7ed66: The metadata for 
tablet 2278f736bf6548e2b773003c1ba7ed66 still references orphaned blocks. Call 
DeleteTabletData() first
{code}
It seems this problem was caused by new orphaned blocks that were not deleted 
after a disk was marked as bad. I have attached an info log file about tablet 
'2278f736bf6548e2b773003c1ba7ed66'. For brevity, let me make a quick 
generalization:
 # 01:51:30.331185: bad disk /mnt/dfs/0 was detected
 # 01:51:30.344581: failing tablet
 # 01:51:30.870059: Initiating tablet copy
 # 04:00:51.820354: Processing DeleteTablet
 # 04:00:51.835958: Crashed.

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (KUDU-2887) Expose the tablet statistics in Client API

2019-07-10 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2887:
-
Description: The patch about aggregating tablet statistics on the 
kudu-master is on the way, and I think it's important to expose these 
statistics in the client API so that the query engine can optimize its query 
plan. For example: (1) adjust the order of scanning tables, (2) split a big 
tablet into multiple range pieces (KUDU-2437) to improve concurrency 
automatically, (3) speed up queries like "select count(*) from table".  (was: 
The patch about aggregating tablet statistics on the kudu-master is on the 
way, and I think it's important to expose these statistics in the client API 
so that the query engine can optimize its query plan. For example: (1) adjust 
the order of scanning tables, (2) split a big tablet into multiple range 
pieces (KUDU-2437) to improve concurrency automatically, (3) short)
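
A sketch of what the exposed statistics might look like in the C++ client 
(hypothetical names and fields; the real API shape would come with the patch 
under review):

{code:java}
#include <cstdint>
#include <iostream>

// Hypothetical statistics object a client could fetch from the master.
struct KuduTableStatistics {
  int64_t on_disk_size;    // bytes across all tablets of the table
  int64_t live_row_count;  // aggregated from tablet reports
};

int main() {
  // A planner could order table scans by live_row_count, or answer
  // "select count(*) from table" straight from the statistics.
  const KuduTableStatistics stats{1LL << 30, 1000000};
  std::cout << "estimated rows: " << stats.live_row_count << "\n";
  return 0;
}
{code}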

> Expose the tablet statistics in Client API
> --
>
> Key: KUDU-2887
> URL: https://issues.apache.org/jira/browse/KUDU-2887
> Project: Kudu
>  Issue Type: Improvement
>  Components: client
>Reporter: HeLifu
>Priority: Minor
>
> The patch about aggregating tablet statistics on the kudu-master is on the 
> way, and I think it's important to expose these statistics in the client API 
> so that the query engine can optimize its query plan. For example: (1) 
> adjust the order of scanning tables, (2) split a big tablet into multiple 
> range pieces (KUDU-2437) to improve concurrency automatically, (3) speed up 
> queries like "select count(*) from table".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2887) Expose the tablet statistics in Client API

2019-07-10 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2887:
-
Description: The patch about aggregating tablet statistics on the 
kudu-master is on the way, and I think it's important to expose these 
statistics in the client API so that the query engine can optimize its query 
plan. For example: (1) adjust the order of scanning tables, (2) split a big 
tablet into multiple range pieces (KUDU-2437) to improve concurrency 
automatically, (3) speed up queries like "select count( *) from table".  
(was: The patch about aggregating tablet statistics on the kudu-master is on 
the way, and I think it's important to expose these statistics in the client 
API so that the query engine can optimize its query plan. For example: (1) 
adjust the order of scanning tables, (2) split a big tablet into multiple 
range pieces (KUDU-2437) to improve concurrency automatically, (3) speed up 
queries like "select count(*) from table".)

> Expose the tablet statistics in Client API
> --
>
> Key: KUDU-2887
> URL: https://issues.apache.org/jira/browse/KUDU-2887
> Project: Kudu
>  Issue Type: Improvement
>  Components: client
>Reporter: HeLifu
>Priority: Minor
>
> The patch about aggregating tablet statistics on the kudu-master is on the 
> way, and I think it's important to expose these statistics in the client API 
> so that the query engine can optimize its query plan. For example: (1) 
> adjust the order of scanning tables, (2) split a big tablet into multiple 
> range pieces (KUDU-2437) to improve concurrency automatically, (3) speed up 
> queries like "select count( *) from table".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2887) Expose the tablet statistics in Client API

2019-07-10 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2887:
-
Description: The patch about aggregating tablet statistics on the 
kudu-master is on the way, and I think it's important to expose these 
statistics in the client API so that the query engine can optimize its query 
plan. For example: (1) adjust the order of scanning tables, (2) split a big 
tablet into multiple range pieces (KUDU-2437) to improve concurrency 
automatically, (3) short  (was: The patch about aggregating tablet statistics 
on the kudu-master is on the way, and I think it's important to expose these 
statistics in the client API so that the query engine can optimize its query 
plan. For example: (1) adjust the order of scanning tables, (2) split a big 
tablet into multiple range pieces (KUDU-2437) to improve concurrency 
automatically.)

> Expose the tablet statistics in Client API
> --
>
> Key: KUDU-2887
> URL: https://issues.apache.org/jira/browse/KUDU-2887
> Project: Kudu
>  Issue Type: Improvement
>  Components: client
>Reporter: HeLifu
>Priority: Minor
>
> The patch about aggregating tablet statistics on the kudu-master is on the 
> way, and I think it's important to expose these statistics in the client API 
> so that the query engine can optimize its query plan. For example: (1) 
> adjust the order of scanning tables, (2) split a big tablet into multiple 
> range pieces (KUDU-2437) to improve concurrency automatically, (3) short



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2855) Lazy-create DeltaMemStores on first update

2019-07-04 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-2855:


Assignee: HeLifu

> Lazy-create DeltaMemStores on first update
> --
>
> Key: KUDU-2855
> URL: https://issues.apache.org/jira/browse/KUDU-2855
> Project: Kudu
>  Issue Type: Improvement
>  Components: perf, tserver
>Reporter: Todd Lipcon
>Assignee: HeLifu
>Priority: Major
>
> Currently DeltaTracker::DoOpen creates a DeltaMemStore for each DRS. If we 
> assume that most DRS don't have any deltas, this ends up wasting quite a bit 
> of memory. Looking at one TS in a production cluster, about 1GB of the ~14G 
> heap is being used by DMS. Of that, 464MB is data and the remainder is 
> overhead.
> This would likely improve other code paths too, letting them fast-path out 
> any DMS-related code.
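
A minimal sketch of the lazy-creation idea (hypothetical members, not Kudu's 
DeltaTracker): allocate the DeltaMemStore on the first update instead of in 
DoOpen(), so a DRS that never sees deltas pays no DMS overhead.

{code:java}
#include <memory>

struct DeltaMemStore { /* arena, delta tree, ... */ };

class DeltaTracker {
 public:
  void Update(/* row index, encoded delta, ... */) {
    if (!dms_) dms_ = std::make_unique<DeltaMemStore>();  // first update only
    // ... apply the delta to *dms_ ...
  }
  // Readers can fast-path out of all DMS-related code when this is false.
  bool HasDms() const { return dms_ != nullptr; }

 private:
  std::unique_ptr<DeltaMemStore> dms_;  // null until the first update
};

int main() {
  DeltaTracker tracker;  // opening the DRS no longer allocates a DMS
  tracker.Update();      // the DMS appears only when a delta arrives
  return tracker.HasDms() ? 0 : 1;
}
{code}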



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2854) Short circuit predicates on dictionary-coded columns

2019-07-04 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-2854:


Assignee: HeLifu

> Short circuit predicates on dictionary-coded columns
> 
>
> Key: KUDU-2854
> URL: https://issues.apache.org/jira/browse/KUDU-2854
> Project: Kudu
>  Issue Type: Improvement
>  Components: cfile, perf, tserver
>Reporter: Todd Lipcon
>Assignee: HeLifu
>Priority: Major
>
> In the common case that a column has no updates in a given DRS, if we see 
> that no entries in the dictionary match the predicate, we can short circuit 
> at a few layers:
> - we can store a flag in the cfile footer that indicates that all blocks are 
> dict-coded (i.e. there are no fallbacks). In that case, we can skip the whole 
> rowset
> - if a cfile is partially dict-encoded, we can skip any dict-coded blocks 
> without decoding the dictionary words
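
A hedged sketch of the first short-circuit (hypothetical types, not Kudu's 
cfile API): with a footer flag saying every block is dictionary-coded, the 
rowset can be skipped outright when no dictionary word satisfies the 
predicate.

{code:java}
#include <functional>
#include <string>
#include <vector>

struct CFileFooter {
  bool all_blocks_dict_coded;           // the proposed new footer flag
  std::vector<std::string> dictionary;  // the cfile's dictionary words
};

bool CanSkipRowset(const CFileFooter& footer,
                   const std::function<bool(const std::string&)>& pred) {
  if (!footer.all_blocks_dict_coded) return false;  // fallback blocks exist
  for (const auto& word : footer.dictionary) {
    if (pred(word)) return false;  // some stored value may match: must scan
  }
  return true;  // no dictionary word matches: skip the whole rowset
}

int main() {
  const CFileFooter footer{true, {"apple", "banana"}};
  const auto pred = [](const std::string& s) { return s == "cherry"; };
  return CanSkipRowset(footer, pred) ? 0 : 1;  // 0: rowset skipped
}
{code}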



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2887) Expose the tablet statistics in Client API

2019-07-03 Thread HeLifu (JIRA)
HeLifu created KUDU-2887:


 Summary: Expose the tablet statistics in Client API
 Key: KUDU-2887
 URL: https://issues.apache.org/jira/browse/KUDU-2887
 Project: Kudu
  Issue Type: Improvement
  Components: client
Reporter: HeLifu


The patch about aggregating tablet statistics on the kudu-master is on the 
way, and I think it's important to expose these statistics in the client API 
so that the query engine can optimize its query plan. For example: (1) adjust 
the order of scanning tables, (2) split a big tablet into multiple range 
pieces (KUDU-2437) to improve concurrency automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2863) Support OR predicates

2019-07-03 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878291#comment-16878291
 ] 

HeLifu commented on KUDU-2863:
--

Hey Grant, I'm not sure which use cases we should cover, one column or more: 
"c1 < v1 OR v2 < c1", "c1 < v1 OR v2 < c2"?

So, could you please explain more? Thanks:)

> Support OR predicates
> -
>
> Key: KUDU-2863
> URL: https://issues.apache.org/jira/browse/KUDU-2863
> Project: Kudu
>  Issue Type: Sub-task
>Reporter: Grant Henke
>Priority: Major
>
> Support combining predicates with a OR predicate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2516) Add NOT EQUAL predicate type

2019-07-03 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-2516:


Assignee: HeLifu

> Add NOT EQUAL predicate type
> 
>
> Key: KUDU-2516
> URL: https://issues.apache.org/jira/browse/KUDU-2516
> Project: Kudu
>  Issue Type: Sub-task
>  Components: cfile, perf
>Affects Versions: 1.7.1
>Reporter: Mike Percy
>Assignee: HeLifu
>Priority: Major
>  Labels: roadmap-candidate
>
> Kudu currently does not have support for a NOT_EQUAL predicate type. This is 
> usually relevant when AND-ed together with other predicates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2882) Increase the timeout interval for TestSentryClientMetrics.Basic

2019-07-01 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2882:
-
Description: 
When I run the test cases of 1.10.0-RC2, the 'TestSentryClientMetrics.Basic' is 
a little bit strange. Sometimes it works, but sometimes it doesn't. Today, I 
took a close look at the output log and found some useful info:
{code:java}
// code placeholder
I0701 16:37:24.925388 33240 thread.cc:675] Ended thread 33240 - thread 
pool:Sentry [worker]
I0701 16:37:24.925501 33015 thread.cc:624] Started thread 33436 - thread 
pool:Sentry [worker]
I0701 16:37:25.322556 33015 mini_sentry.cc:164] Pausing Sentry
W0701 16:37:27.331832 33436 sentry_client.cc:134] Time spent starting Sentry 
client: real 1.999s user 0.000s sys 0.000s
W0701 16:37:27.331894 33436 client.h:352] Failed to connect to Sentry 
(127.32.61.193:59755): Timed out: failed to open Sentry connection: 
THRIFT_EAGAIN (timed out)
I0701 16:37:27.331986 33015 mini_sentry.cc:172] Resuming Sentry
/mnt/ddb/2/helif/apache/kudu/src/kudu/master/sentry_authz_provider-test.cc:1415:
 Failure
Expected: (200) < (hist->histogram()->MaxValue()), actual: 200 vs 
1999002
I0701 16:37:27.332604 33015 mini_sentry.cc:155] Stopping Sentry
{code}
Then I looked through the file 'sentry_authz_provider-test.cc'; it seems the 
timeout value is too short:

[https://github.com/apache/kudu/blob/5c652defff422f908dacc11011dc6ae59bf49be5/src/kudu/master/sentry_authz_provider-test.cc#L1396]

Perhaps we can increase this value (default 60 seconds) to 4 or 5 seconds to 
avoid the failures; Alexey Serbin (not sure) and I have both hit this problem.

  was:
When I run the test cases of 1.10.0-RC2, the 'TestSentryClientMetrics.Basic' is 
a little bit strange. Sometimes it works, but sometimes it doesn't.

Today, I took a close look at the output log and found some useful info:
{code:java}
// code placeholder
I0701 16:37:24.925388 33240 thread.cc:675] Ended thread 33240 - thread 
pool:Sentry [worker]
I0701 16:37:24.925501 33015 thread.cc:624] Started thread 33436 - thread 
pool:Sentry [worker]
I0701 16:37:25.322556 33015 mini_sentry.cc:164] Pausing Sentry
W0701 16:37:27.331832 33436 sentry_client.cc:134] Time spent starting Sentry 
client: real 1.999s user 0.000s sys 0.000s
W0701 16:37:27.331894 33436 client.h:352] Failed to connect to Sentry 
(127.32.61.193:59755): Timed out: failed to open Sentry connection: 
THRIFT_EAGAIN (timed out)
I0701 16:37:27.331986 33015 mini_sentry.cc:172] Resuming Sentry
/mnt/ddb/2/helif/apache/kudu/src/kudu/master/sentry_authz_provider-test.cc:1415:
 Failure
Expected: (200) < (hist->histogram()->MaxValue()), actual: 200 vs 
1999002
I0701 16:37:27.332604 33015 mini_sentry.cc:155] Stopping Sentry
{code}
Then I looked through the file 'sentry_authz_provider-test.cc'; it seems the 
timeout value is too short:

[https://github.com/apache/kudu/blob/5c652defff422f908dacc11011dc6ae59bf49be5/src/kudu/master/sentry_authz_provider-test.cc#L1396]

Perhaps we can increase this value (default 60 seconds) to 4 or 5 seconds to 
avoid the failures; Alexey Serbin (not sure) and I have both hit this problem.


> Increase the timeout interval for TestSentryClientMetrics.Basic
> ---
>
> Key: KUDU-2882
> URL: https://issues.apache.org/jira/browse/KUDU-2882
> Project: Kudu
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 1.10.0
>Reporter: HeLifu
>Priority: Minor
>
> When I run the test cases of 1.10.0-RC2, 'TestSentryClientMetrics.Basic' 
> behaves a little strangely: sometimes it passes, sometimes it doesn't. Today, 
> I took a close look at the output log and found some useful info:
> {code:java}
> // code placeholder
> I0701 16:37:24.925388 33240 thread.cc:675] Ended thread 33240 - thread 
> pool:Sentry [worker]
> I0701 16:37:24.925501 33015 thread.cc:624] Started thread 33436 - thread 
> pool:Sentry [worker]
> I0701 16:37:25.322556 33015 mini_sentry.cc:164] Pausing Sentry
> W0701 16:37:27.331832 33436 sentry_client.cc:134] Time spent starting Sentry 
> client: real 1.999s user 0.000s sys 0.000s
> W0701 16:37:27.331894 33436 client.h:352] Failed to connect to Sentry 
> (127.32.61.193:59755): Timed out: failed to open Sentry connection: 
> THRIFT_EAGAIN (timed out)
> I0701 16:37:27.331986 33015 mini_sentry.cc:172] Resuming Sentry
> /mnt/ddb/2/helif/apache/kudu/src/kudu/master/sentry_authz_provider-test.cc:1415:
>  Failure
> Expected: (200) < (hist->histogram()->MaxValue()), actual: 200 vs 
> 1999002
> I0701 16:37:27.332604 33015 mini_sentry.cc:155] Stopping Sentry
> {code}
> Then I looked through the file 'sentry_authz_provider-test.cc'; it seems the 
> timeout value is too short:
> 

[jira] [Created] (KUDU-2882) Increase the timeout interval for TestSentryClientMetrics.Basic

2019-07-01 Thread HeLifu (JIRA)
HeLifu created KUDU-2882:


 Summary: Increase the timeout interval for 
TestSentryClientMetrics.Basic
 Key: KUDU-2882
 URL: https://issues.apache.org/jira/browse/KUDU-2882
 Project: Kudu
  Issue Type: Improvement
  Components: master
Affects Versions: 1.10.0
Reporter: HeLifu


When I run the test cases of 1.10.0-RC2, 'TestSentryClientMetrics.Basic' behaves 
a little strangely: sometimes it passes, sometimes it doesn't.

Today, I took a close look at the output log and found some useful info:
{code:java}
// code placeholder
I0701 16:37:24.925388 33240 thread.cc:675] Ended thread 33240 - thread 
pool:Sentry [worker]
I0701 16:37:24.925501 33015 thread.cc:624] Started thread 33436 - thread 
pool:Sentry [worker]
I0701 16:37:25.322556 33015 mini_sentry.cc:164] Pausing Sentry
W0701 16:37:27.331832 33436 sentry_client.cc:134] Time spent starting Sentry 
client: real 1.999s user 0.000s sys 0.000s
W0701 16:37:27.331894 33436 client.h:352] Failed to connect to Sentry 
(127.32.61.193:59755): Timed out: failed to open Sentry connection: 
THRIFT_EAGAIN (timed out)
I0701 16:37:27.331986 33015 mini_sentry.cc:172] Resuming Sentry
/mnt/ddb/2/helif/apache/kudu/src/kudu/master/sentry_authz_provider-test.cc:1415:
 Failure
Expected: (200) < (hist->histogram()->MaxValue()), actual: 200 vs 
1999002
I0701 16:37:27.332604 33015 mini_sentry.cc:155] Stopping Sentry
{code}
Then I looked through the file 'sentry_authz_provider-test.cc'; it seems the 
timeout value is too short:

[https://github.com/apache/kudu/blob/5c652defff422f908dacc11011dc6ae59bf49be5/src/kudu/master/sentry_authz_provider-test.cc#L1396]

Perhaps we can increase this value (default 60 seconds) to 4 or 5 seconds to 
avoid the failures; Alexey Serbin (not sure) and I have both hit this problem.
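
To make the race concrete, here is a toy model of my reading of the failure; 
the function and all constants below are illustrative, not the actual test 
code. The test pauses Sentry for about 2 seconds and asserts that the recorded 
connection-latency maximum exceeds 2,000,000 us, but a 2-second client timeout 
can fire just under that bound (1,999,002 us in the log above), while a 4-5 
second timeout lets the connection attempt outlive the pause, so the recorded 
latency clears the bound:
{code:cpp}
// Toy model only: RecordedLatencyUs() and all constants are illustrative.
#include <cstdint>
#include <iostream>

// If the client gives up before Sentry resumes, the histogram records the
// (slightly jittery) timeout; otherwise it records the full pause duration.
int64_t RecordedLatencyUs(int64_t pause_us, int64_t client_timeout_us) {
  constexpr int64_t kTimerJitterUs = 1000;  // timeouts can fire a bit early
  if (client_timeout_us - kTimerJitterUs < pause_us) {
    return client_timeout_us - kTimerJitterUs;  // gave up while paused
  }
  return pause_us + kTimerJitterUs;  // connected right after the resume
}

int main() {
  constexpr int64_t kPauseUs = 2000000;               // Sentry paused ~2 s
  constexpr int64_t kAssertedLowerBoundUs = 2000000;  // MaxValue() bound
  const int64_t timeouts_us[] = {2000000, 5000000};   // current vs. proposed
  for (int64_t timeout_us : timeouts_us) {
    int64_t observed = RecordedLatencyUs(kPauseUs, timeout_us);
    std::cout << "timeout=" << timeout_us << "us -> max=" << observed
              << "us, assertion "
              << (observed > kAssertedLowerBoundUs ? "passes" : "fails")
              << std::endl;
  }
  return 0;
}
{code}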



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2613) Implement secondary indexes

2019-06-24 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870956#comment-16870956
 ] 

HeLifu commented on KUDU-2613:
--

Is it possible to introduce "Replicated Mirrors" from 
[PDT|http://www.odbms.org/wp-content/uploads/2014/07/PositionalDelat-Trees.pdf]?

> Implement secondary indexes
> ---
>
> Key: KUDU-2613
> URL: https://issues.apache.org/jira/browse/KUDU-2613
> Project: Kudu
>  Issue Type: Task
>Reporter: Mike Percy
>Priority: Major
>  Labels: roadmap-candidate
>
> Tracking Jira to implement secondary indexes in Kudu



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2780) Rebalance Kudu cluster in background

2019-06-20 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869057#comment-16869057
 ] 

HeLifu commented on KUDU-2780:
--

[~hannahvnguyen] wants to contribute this patch ^_^

> Rebalance Kudu cluster in background
> 
>
> Key: KUDU-2780
> URL: https://issues.apache.org/jira/browse/KUDU-2780
> Project: Kudu
>  Issue Type: Improvement
>Reporter: Alexey Serbin
>Assignee: Hannah Nguyen
>Priority: Major
>  Labels: roadmap-candidate
>
> With the introduction of `kudu cluster rebalance` CLI tool it's possible to 
> balance the distribution of tablet replicas in a Kudu cluster.  However, that 
> tool should be run manually or via an external scheduler (e.g. cron).
> It would be nice if Kudu would track and correct imbalances of replica 
> distribution automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KUDU-2780) Rebalance Kudu cluster in background

2019-06-20 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869057#comment-16869057
 ] 

HeLifu edited comment on KUDU-2780 at 6/21/19 12:48 AM:


[~hannahvnguyen] wants to contribute this patch (*)


was (Author: helifu):
[~hannahvnguyen] wants to contribute this patch ^_^

> Rebalance Kudu cluster in background
> 
>
> Key: KUDU-2780
> URL: https://issues.apache.org/jira/browse/KUDU-2780
> Project: Kudu
>  Issue Type: Improvement
>Reporter: Alexey Serbin
>Assignee: Hannah Nguyen
>Priority: Major
>  Labels: roadmap-candidate
>
> With the introduction of `kudu cluster rebalance` CLI tool it's possible to 
> balance the distribution of tablet replicas in a Kudu cluster.  However, that 
> tool should be run manually or via an external scheduler (e.g. cron).
> It would be nice if Kudu would track and correct imbalances of replica 
> distribution automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2780) Rebalance Kudu cluster in background

2019-06-20 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-2780:


Assignee: Hannah Nguyen  (was: HeLifu)

> Rebalance Kudu cluster in background
> 
>
> Key: KUDU-2780
> URL: https://issues.apache.org/jira/browse/KUDU-2780
> Project: Kudu
>  Issue Type: Improvement
>Reporter: Alexey Serbin
>Assignee: Hannah Nguyen
>Priority: Major
>  Labels: roadmap-candidate
>
> With the introduction of `kudu cluster rebalance` CLI tool it's possible to 
> balance the distribution of tablet replicas in a Kudu cluster.  However, that 
> tool should be run manually or via an external scheduler (e.g. cron).
> It would be nice if Kudu would track and correct imbalances of replica 
> distribution automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2843) Tablet Replica Distribution should include newly added or restored tservers

2019-06-10 Thread HeLifu (JIRA)
HeLifu created KUDU-2843:


 Summary: Tablet Replica Distribution should include newly added or 
restored tservers
 Key: KUDU-2843
 URL: https://issues.apache.org/jira/browse/KUDU-2843
 Project: Kudu
  Issue Type: Improvement
  Components: master
Affects Versions: 1.9.0
Reporter: HeLifu


It seems the 'Tablet Replica Distribution' on the master's table page is 
calculated only from the existing tablet replicas. That can give a false signal 
that the table is balanced after tablet servers are newly added or restored.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2366) LockManager consumes significant memory

2019-05-16 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841185#comment-16841185
 ] 

HeLifu commented on KUDU-2366:
--

Maybe we can borrow Redis's dict rehash approach, which supports both expanding 
and shrinking; a sketch follows below. :)
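
A minimal sketch of that idea, assuming a chained lock table keyed by hashed 
row keys; the class and method names are made up here and are not Kudu's actual 
LockManager code:
{code:cpp}
#include <cstddef>
#include <cstdint>
#include <list>
#include <vector>

// Sketch of a Redis-style incrementally rehashed lock table. While a resize
// is in flight, entries live in both `old_` and `cur_`; each operation
// migrates one old bucket, so there is no stop-the-world rehash.
class IncrementalLockTable {
 public:
  IncrementalLockTable() : cur_(kMinBuckets) {}

  void Insert(uint64_t key) {
    MigrateOneBucket();
    MaybeResize();
    cur_[key % cur_.size()].push_back(key);
    ++size_;
  }

  bool Erase(uint64_t key) {
    MigrateOneBucket();
    for (auto* tab : {&old_, &cur_}) {
      if (tab->empty()) continue;
      auto& bucket = (*tab)[key % tab->size()];
      for (auto it = bucket.begin(); it != bucket.end(); ++it) {
        if (*it == key) {
          bucket.erase(it);
          --size_;
          MaybeResize();
          return true;
        }
      }
    }
    return false;
  }

 private:
  static constexpr size_t kMinBuckets = 8;

  // Move the contents of one old bucket into the current table.
  void MigrateOneBucket() {
    if (old_.empty()) return;
    while (drain_pos_ < old_.size() && old_[drain_pos_].empty()) ++drain_pos_;
    if (drain_pos_ == old_.size()) {  // migration finished: free old table
      old_.clear();
      drain_pos_ = 0;
      return;
    }
    for (uint64_t k : old_[drain_pos_]) cur_[k % cur_.size()].push_back(k);
    old_[drain_pos_].clear();
  }

  // Unlike a grow-only table, also rehash *down* once occupancy drops, so a
  // cold tablet hands its lock-table memory back.
  void MaybeResize() {
    if (!old_.empty()) return;  // at most one rehash in flight
    size_t n = cur_.size();
    size_t want = n;
    if (size_ > n) want = n * 2;
    else if (n > kMinBuckets && size_ < n / 8) want = n / 2;
    if (want == n) return;
    old_.swap(cur_);
    cur_.assign(want, {});
    drain_pos_ = 0;
  }

  std::vector<std::list<uint64_t>> cur_, old_;
  size_t drain_pos_ = 0;
  size_t size_ = 0;
};
{code}
The point is the shrink branch in MaybeResize(): a cold tablet's table would 
rehash back down toward kMinBuckets instead of pinning its high-water-mark 
memory forever.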

> LockManager consumes significant memory
> ---
>
> Key: KUDU-2366
> URL: https://issues.apache.org/jira/browse/KUDU-2366
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet
>Affects Versions: 1.7.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> Looking at a heap dump of a server that's been running for a while with an 
> ingest workload across multiple tables, I see the LockManager is using about 
> 200MB of RAM. The workload in this case has batches of about 30,000 rows 
> each, so while each batch is in flight the LockManager hashtable has that 
> many locks in it. That causes it to expand to the next higher power of two 
> (64k slots). Each slot takes 16 bytes, so the lock table is reaching about 
> 1MB. We never resize _down_, so even once the tablet becomes cold, it still 
> uses 1M of unrecoverable RAM for the rest of the tserver lifetime.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2823) Rebalance range partition

2019-05-15 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840918#comment-16840918
 ] 

HeLifu commented on KUDU-2823:
--

Thanks to Grant and Alexey.

To Alexey: I added some new tablet servers to the cluster the day before 
yesterday, before adding the new partition, so the skew I pasted above was a 
little bit bigger. Then I rebalanced the whole cluster yesterday; it took a 
long time, and now the skew is small (6). So the placement policy for adding 
partitions is not bad. I'm sorry to have misled you.

> Rebalance range partition
> -
>
> Key: KUDU-2823
> URL: https://issues.apache.org/jira/browse/KUDU-2823
> Project: Kudu
>  Issue Type: Improvement
>Reporter: HeLifu
>Assignee: Xu Yao
>Priority: Major
>
> Right now we add a new range every day for a fact table which has range+hash 
> partitions, and then rebalance it as soon as possible. It seems that not only 
> the newly added tablets of this table but also the historical tablets will be 
> rebalanced. But the historical tablets already have data, so they are heavy 
> to move, and moving them suddenly increases disk and network load. So I think 
> it would be good to rebalance only the newly added range partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2823) Rebalance range partition

2019-05-13 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839012#comment-16839012
 ] 

HeLifu commented on KUDU-2823:
--

By the way, here is the "tablet replica distribution" for that table, which has 
not been rebalanced:
h3. Tablet Replica Distribution
|Min Count|156|
|Max Count|218|
|Skew 
([?|http://kudu33.jd.163.org:8051/table?id=94943a1ad5a74ef3a76b0a3d4b699842#skew-help])|62|

> Rebalance range partition
> -
>
> Key: KUDU-2823
> URL: https://issues.apache.org/jira/browse/KUDU-2823
> Project: Kudu
>  Issue Type: Improvement
>Reporter: HeLifu
>Assignee: Xu Yao
>Priority: Major
>
> Right now we add a new range every day for a fact table which has range+hash 
> partitions, and then rebalance it as soon as possible. It seems that not only 
> the newly added tablets of this table but also the historical tablets will be 
> rebalanced. But the historical tablets already have data, so they are heavy 
> to move, and moving them suddenly increases disk and network load. So I think 
> it would be good to rebalance only the newly added range partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2823) Rebalance range partition

2019-05-13 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839003#comment-16839003
 ] 

HeLifu commented on KUDU-2823:
--

Ah, maybe I misunderstood the function of the rebalancer.

We have a big fact table which has 360 tablets (hash) per day (range), and a 
heavy insert workload. We want the writes to be spread evenly across the nodes.

Well, if the rebalancer is not intended for this purpose, I am going to close 
this issue.

> Rebalance range partition
> -
>
> Key: KUDU-2823
> URL: https://issues.apache.org/jira/browse/KUDU-2823
> Project: Kudu
>  Issue Type: Improvement
>Reporter: HeLifu
>Assignee: Xu Yao
>Priority: Major
>
> Right now we add a new range every day for a fact table which has range+hash 
> partitions, and then rebalance it as soon as possible. It seems that not only 
> the newly added tablets of this table but also the historical tablets will be 
> rebalanced. But the historical tablets already have data, so they are heavy 
> to move, and moving them suddenly increases disk and network load. So I think 
> it would be good to rebalance only the newly added range partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2823) Rebalance range partition

2019-05-13 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-2823:


Assignee: Xu Yao

> Rebalance range partition
> -
>
> Key: KUDU-2823
> URL: https://issues.apache.org/jira/browse/KUDU-2823
> Project: Kudu
>  Issue Type: Improvement
>Reporter: HeLifu
>Assignee: Xu Yao
>Priority: Major
>
> Right now we add a new range every day for a fact table which has range+hash 
> partitions, and then rebalance it as soon as possible. It seems that not only 
> the newly added tablets of this table but also the historical tablets will be 
> rebalanced. But the historical tablets already have data, so they are heavy 
> to move, and moving them suddenly increases disk and network load. So I think 
> it would be good to rebalance only the newly added range partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2823) Rebalance range partition

2019-05-13 Thread HeLifu (JIRA)
HeLifu created KUDU-2823:


 Summary: Rebalance range partition
 Key: KUDU-2823
 URL: https://issues.apache.org/jira/browse/KUDU-2823
 Project: Kudu
  Issue Type: Improvement
Reporter: HeLifu


Right now we will add new range every day for a fact table which has range+hash 
partitions, and then rebalance it as soon as possible. It seems that not only 
the new added tablets of this table but also the historical tablets will be 
rebalanced. But the historical tablets already have data, so they are heavy to 
move and t will impact the disk and network. So, I think it will be good to 
rebalance the new added range partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2823) Rebalance range partition

2019-05-13 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2823:
-
Description: Right now we will add new range every day for a fact table 
which has range+hash partitions, and then rebalance it as soon as possible. It 
seems that not only the new added tablets of this table but also the historical 
tablets will be rebalanced. But the historical tablets already have data, so 
they are heavy to move and it will increase the disk and network suddenly. So, 
I think it will be good to rebalance the new added range partitions.  (was: 
Right now we will add new range every day for a fact table which has range+hash 
partitions, and then rebalance it as soon as possible. It seems that not only 
the new added tablets of this table but also the historical tablets will be 
rebalanced. But the historical tablets already have data, so they are heavy to 
move and it will impact the disk and network. So, I think it will be good to 
rebalance the new added range partitions.)

> Rebalance range partition
> -
>
> Key: KUDU-2823
> URL: https://issues.apache.org/jira/browse/KUDU-2823
> Project: Kudu
>  Issue Type: Improvement
>Reporter: HeLifu
>Priority: Major
>
> Right now we will add new range every day for a fact table which has 
> range+hash partitions, and then rebalance it as soon as possible. It seems 
> that not only the new added tablets of this table but also the historical 
> tablets will be rebalanced. But the historical tablets already have data, so 
> they are heavy to move and it will increase the disk and network suddenly. 
> So, I think it will be good to rebalance the new added range partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2823) Rebalance range partition

2019-05-13 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2823:
-
Description: Right now we will add new range every day for a fact table 
which has range+hash partitions, and then rebalance it as soon as possible. It 
seems that not only the new added tablets of this table but also the historical 
tablets will be rebalanced. But the historical tablets already have data, so 
they are heavy to move and it will impact the disk and network. So, I think it 
will be good to rebalance the new added range partitions.  (was: Right now we 
will add new range every day for a fact table which has range+hash partitions, 
and then rebalance it as soon as possible. It seems that not only the new added 
tablets of this table but also the historical tablets will be rebalanced. But 
the historical tablets already have data, so they are heavy to move and t will 
impact the disk and network. So, I think it will be good to rebalance the new 
added range partitions.)

> Rebalance range partition
> -
>
> Key: KUDU-2823
> URL: https://issues.apache.org/jira/browse/KUDU-2823
> Project: Kudu
>  Issue Type: Improvement
>Reporter: HeLifu
>Priority: Major
>
> Right now we will add new range every day for a fact table which has 
> range+hash partitions, and then rebalance it as soon as possible. It seems 
> that not only the new added tablets of this table but also the historical 
> tablets will be rebalanced. But the historical tablets already have data, so 
> they are heavy to move and it will impact the disk and network. So, I think 
> it will be good to rebalance the new added range partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2822) Kudu create table problem

2019-05-09 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836827#comment-16836827
 ] 

HeLifu commented on KUDU-2822:
--

Could you please share more info about your cluster and operations?

> Kudu create table  problem
> --
>
> Key: KUDU-2822
> URL: https://issues.apache.org/jira/browse/KUDU-2822
> Project: Kudu
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: kun'qin 
>Priority: Major
>
> There are five tablet servers, each with 775 partitions. After creating a 
> Kudu table with 100 partitions through Impala, the number of partitions per 
> tablet server has kept growing, increasing to 1500+.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KUDU-2797) Implement table size metrics

2019-05-08 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835388#comment-16835388
 ] 

HeLifu edited comment on KUDU-2797 at 5/8/19 7:44 AM:
--

Right now we collect these metrics via the HTTP interface from all of the 
tservers, then merge and display them on our monitoring systems, but that's not 
friendly for Kudu users. So I think it's valuable to support this, and I'd like 
to have a try ;)


was (Author: helifu):
Right now we collect these metrics via the HTTP interface from all of the 
tservers, then merge and display them on our monitoring systems. But that's not 
friendly for Kudu users, I think.

> Implement table size metrics
> 
>
> Key: KUDU-2797
> URL: https://issues.apache.org/jira/browse/KUDU-2797
> Project: Kudu
>  Issue Type: Task
>  Components: master, metrics
>Affects Versions: 1.8.0
>Reporter: Mike Percy
>Assignee: HeLifu
>Priority: Major
>
> It would be valuable to implement table size metrics for row count and byte 
> size (pre-replication and post-replication). The master could aggregate these 
> stats from the various partitions (tablets) and expose aggregated metrics for 
> consumption by monitoring systems and dashboards. These same metrics would 
> also be valuable to display on the web UI.
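
As a rough illustration of the aggregation the description proposes (the struct 
and field names below are invented for the sketch, not Kudu's actual classes), 
the master-side fold is simple once each tablet reports its own stats:
{code:cpp}
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Per-tablet stats as they might arrive in tserver heartbeats.
struct TabletStats {
  std::string table_id;
  int64_t live_row_count;
  int64_t on_disk_bytes;
};

// Table-level totals to expose as metrics and on the web UI.
struct TableTotals {
  int64_t live_row_count = 0;
  int64_t on_disk_bytes = 0;
};

std::map<std::string, TableTotals> AggregateByTable(
    const std::vector<TabletStats>& tablets) {
  std::map<std::string, TableTotals> totals;
  for (const TabletStats& t : tablets) {
    TableTotals& agg = totals[t.table_id];
    agg.live_row_count += t.live_row_count;
    agg.on_disk_bytes += t.on_disk_bytes;
  }
  return totals;
}
{code}
Pre- vs. post-replication totals would then differ only in whether the fold 
runs over leader replicas or over all replicas.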



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2797) Implement table size metrics

2019-05-08 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-2797:


Assignee: HeLifu

> Implement table size metrics
> 
>
> Key: KUDU-2797
> URL: https://issues.apache.org/jira/browse/KUDU-2797
> Project: Kudu
>  Issue Type: Task
>  Components: master, metrics
>Affects Versions: 1.8.0
>Reporter: Mike Percy
>Assignee: HeLifu
>Priority: Major
>
> It would be valuable to implement table size metrics for row count and byte 
> size (pre-replication and post-replication). The master could aggregate these 
> stats from the various partitions (tablets) and expose aggregated metrics for 
> consumption by monitoring systems and dashboards. These same metrics would 
> also be valuable to display on the web UI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2797) Implement table size metrics

2019-05-08 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835388#comment-16835388
 ] 

HeLifu commented on KUDU-2797:
--

Right now we collect these metrics via the HTTP interface from all of the 
tservers, then merge and display them on our monitoring systems. But that's not 
friendly for Kudu users, I think.

> Implement table size metrics
> 
>
> Key: KUDU-2797
> URL: https://issues.apache.org/jira/browse/KUDU-2797
> Project: Kudu
>  Issue Type: Task
>  Components: master, metrics
>Affects Versions: 1.8.0
>Reporter: Mike Percy
>Assignee: HeLifu
>Priority: Major
>
> It would be valuable to implement table size metrics for row count and byte 
> size (pre-replication and post-replication). The master could aggregate these 
> stats from the various partitions (tablets) and expose aggregated metrics for 
> consumption by monitoring systems and dashboards. These same metrics would 
> also be valuable to display on the web UI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2780) Rebalance Kudu cluster in background

2019-05-07 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-2780:


Assignee: HeLifu

> Rebalance Kudu cluster in background
> 
>
> Key: KUDU-2780
> URL: https://issues.apache.org/jira/browse/KUDU-2780
> Project: Kudu
>  Issue Type: Improvement
>Reporter: Alexey Serbin
>Assignee: HeLifu
>Priority: Major
>
> With the introduction of `kudu cluster rebalance` CLI tool it's possible to 
> balance the distribution of tablet replicas in a Kudu cluster.  However, that 
> tool should be run manually or via an external scheduler (e.g. cron).
> It would be nice if Kudu would track and correct imbalances of replica 
> distribution automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2818) kudu CLI tool to show which master is the leader

2019-05-07 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16834558#comment-16834558
 ] 

HeLifu commented on KUDU-2818:
--

Dengguangchao wants to work on this task; please help assign it to him :P

> kudu CLI tool to show which master is the leader
> 
>
> Key: KUDU-2818
> URL: https://issues.apache.org/jira/browse/KUDU-2818
> Project: Kudu
>  Issue Type: Task
>  Components: CLI
>Affects Versions: 1.9.0
>Reporter: Dengguangchao
>Priority: Major
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> kudu CLI command to list the leader master
> Example:
>  
> kudu master listleader 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2818) kudu CLI tool to show which master is the leader

2019-05-07 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2818:
-
Target Version/s:   (was: 1.10.0)
   Fix Version/s: (was: 1.10.0)

> kudu CLI tool to show which master is the leader
> 
>
> Key: KUDU-2818
> URL: https://issues.apache.org/jira/browse/KUDU-2818
> Project: Kudu
>  Issue Type: Task
>  Components: CLI
>Affects Versions: 1.9.0
>Reporter: Dengguangchao
>Priority: Major
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> kudu CLI command to list the leader master
> Example:
>  
> kudu master listleader 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2803) support non-primary key type alteration

2019-04-26 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2803:
-
Description: 
We know that Kudu does not allow the type of a column to be altered right now. 
But indeed, a very common case is migrating RDBMS -> Kudu, and there will 
inevitably be type alterations, especially from small to large types.

Currently, there are two ways to work around this problem:
 # for new tables: predefine a large type for every column;
 # for existing tables: stop the app (writes) -> add a new column with the 
large type -> copy the data -> drop the old column -> rename the new column to 
the old name -> restart the app;

However, neither of them is elegant. So I think it is necessary to support 
non-primary key type alteration.

 

  was:
We know that kudu does not allow the type of a column to be altered right now. 
But indeed, a very common case is RDMS -> kudu, and there will be type 
alterations inevitably.

Currently, here are two ways to get rid of this problem:
 # for the new tables: predefine a big type for every column;
 # for the existing tables: stop app(write) > add a new column with new type > 
copy data -> drop old column -> rename new column to old column -> start app;

However, neither of them is elegant. So, i think it is necessary to support 
non-primary key type alteration.

 


> support non-primary key type alteration
> ---
>
> Key: KUDU-2803
> URL: https://issues.apache.org/jira/browse/KUDU-2803
> Project: Kudu
>  Issue Type: Improvement
>Reporter: HeLifu
>Priority: Major
>
> We know that Kudu does not allow the type of a column to be altered right 
> now. But indeed, a very common case is migrating RDBMS -> Kudu, and there 
> will inevitably be type alterations, especially from small to large types.
> Currently, there are two ways to work around this problem:
>  # for new tables: predefine a large type for every column;
>  # for existing tables: stop the app (writes) -> add a new column with the 
> large type -> copy the data -> drop the old column -> rename the new column 
> to the old name -> restart the app;
> However, neither of them is elegant. So I think it is necessary to support 
> non-primary key type alteration.
>  
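
For what it's worth, the existing-table workaround can at least be scripted. 
Below is a rough sketch with the Kudu C++ client; the table and column names 
are hypothetical, and the data-copy step still has to run out of band (e.g. as 
an Impala job):
{code:cpp}
#include <memory>

#include "kudu/client/client.h"

using kudu::client::KuduClient;
using kudu::client::KuduClientBuilder;
using kudu::client::KuduColumnSchema;
using kudu::client::KuduTableAlterer;

int main() {
  kudu::client::sp::shared_ptr<KuduClient> client;
  KUDU_CHECK_OK(KuduClientBuilder()
                    .add_master_server_addr("master-host:7051")
                    .Build(&client));

  // Step 1: add a wider, nullable column next to the old narrow one.
  std::unique_ptr<KuduTableAlterer> add(client->NewTableAlterer("my_table"));
  add->AddColumn("amount_wide")->Type(KuduColumnSchema::INT64)->Nullable();
  KUDU_CHECK_OK(add->Alter());

  // Step 2 (out of band): copy the data, e.g. in Impala:
  //   UPDATE my_table SET amount_wide = amount;

  // Step 3: drop the old column and rename the wide one into its place.
  std::unique_ptr<KuduTableAlterer> swap(client->NewTableAlterer("my_table"));
  swap->DropColumn("amount");
  swap->AlterColumn("amount_wide")->RenameTo("amount");
  KUDU_CHECK_OK(swap->Alter());
  return 0;
}
{code}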



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2803) support non-primary key type alteration

2019-04-25 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2803:
-
Description: 
We know that kudu does not allow the type of a column to be altered right now. 
But indeed, a very common case is RDMS -> kudu, and there will be type 
alterations inevitably.

Currently, here are two ways to get rid of this problem:
 # for the new tables: predefine a big type for every column;
 # for the existing tables: stop app(write) > add a new column with new type > 
copy data -> drop old column -> rename new column to old column -> start app;

However, neither of them is elegant. So, i think it is necessary to support 
non-primary key type alteration.

 

  was:
We know that kudu does not allow the type of a column to be altered right now. 
But indeed, a very common case is RDMS -> kudu, and there will be type 
alterations inevitably.

Currently, here are two ways to get rid of this problem:
 # for the new tables: predefine a big type for every column;
 # for the existing tables: stop app(write) > add a new column with new type > 
copy data -> drop old column -> rename new column to old column -> start app;

However, neither of them is elegant. So, is it possible to support non-primary 
key type alteration?

 


> support non-primary key type alteration
> ---
>
> Key: KUDU-2803
> URL: https://issues.apache.org/jira/browse/KUDU-2803
> Project: Kudu
>  Issue Type: Improvement
>Reporter: HeLifu
>Priority: Major
>
> We know that kudu does not allow the type of a column to be altered right 
> now. But indeed, a very common case is RDMS -> kudu, and there will be type 
> alterations inevitably.
> Currently, here are two ways to get rid of this problem:
>  # for the new tables: predefine a big type for every column;
>  # for the existing tables: stop app(write) > add a new column with new type 
> > copy data -> drop old column -> rename new column to old column -> start 
> app;
> However, neither of them is elegant. So, i think it is necessary to support 
> non-primary key type alteration.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2803) support non-primary key type alteration

2019-04-25 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2803:
-
Description: 
We know that kudu does not allow the type of a column to be altered right now. 
But indeed, a very common case is RDMS -> kudu, and there will be type 
alterations inevitably.

Currently, here are two ways to get rid of this problem:
 # for the newly tables: predefine a big type for every column;
 # for the existing tables: stop app(write) > add a new column with new type > 
copy data -> drop old column -> rename new column to old column -> start app;

However, neither of them is elegant. So, is it possible to support non-primary 
key type alteration?

 

  was:
We know that kudu does not allow the type of a column to be altered right now. 
But indeed, a very common case is RDMS -> kudu, and there will be type 
alterations inevitably.

Currently, here are two ways to get rid of this problem:
 # for the newly tables: predefine a big type for every column;
 # for the existing tables: stop app(write) -> add a new column with new type 
-> copy data -> drop old column -> rename new column to old column -> start app;

However, neither of them is elegant. So, is it possible to support non-primary 
key type alteration?

 


> support non-primary key type alteration
> ---
>
> Key: KUDU-2803
> URL: https://issues.apache.org/jira/browse/KUDU-2803
> Project: Kudu
>  Issue Type: Improvement
>Reporter: HeLifu
>Priority: Major
>
> We know that kudu does not allow the type of a column to be altered right 
> now. But indeed, a very common case is RDMS -> kudu, and there will be type 
> alterations inevitably.
> Currently, here are two ways to get rid of this problem:
>  # for the newly tables: predefine a big type for every column;
>  # for the existing tables: stop app(write) > add a new column with new type 
> > copy data -> drop old column -> rename new column to old column -> start 
> app;
> However, neither of them is elegant. So, is it possible to support 
> non-primary key type alteration?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2803) support non-primary key type alteration

2019-04-25 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2803:
-
Description: 
We know that kudu does not allow the type of a column to be altered right now. 
But indeed, a very common case is RDMS -> kudu, and there will be type 
alterations inevitably.

Currently, here are two ways to get rid of this problem:
 # for the new tables: predefine a big type for every column;
 # for the existing tables: stop app(write) > add a new column with new type > 
copy data -> drop old column -> rename new column to old column -> start app;

However, neither of them is elegant. So, is it possible to support non-primary 
key type alteration?

 

  was:
We know that kudu does not allow the type of a column to be altered right now. 
But indeed, a very common case is RDMS -> kudu, and there will be type 
alterations inevitably.

Currently, here are two ways to get rid of this problem:
 # for the newly tables: predefine a big type for every column;
 # for the existing tables: stop app(write) > add a new column with new type > 
copy data -> drop old column -> rename new column to old column -> start app;

However, neither of them is elegant. So, is it possible to support non-primary 
key type alteration?

 


> support non-primary key type alteration
> ---
>
> Key: KUDU-2803
> URL: https://issues.apache.org/jira/browse/KUDU-2803
> Project: Kudu
>  Issue Type: Improvement
>Reporter: HeLifu
>Priority: Major
>
> We know that kudu does not allow the type of a column to be altered right 
> now. But indeed, a very common case is RDMS -> kudu, and there will be type 
> alterations inevitably.
> Currently, here are two ways to get rid of this problem:
>  # for the new tables: predefine a big type for every column;
>  # for the existing tables: stop app(write) > add a new column with new type 
> > copy data -> drop old column -> rename new column to old column -> start 
> app;
> However, neither of them is elegant. So, is it possible to support 
> non-primary key type alteration?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2803) support non-primary key type alteration

2019-04-25 Thread HeLifu (JIRA)
HeLifu created KUDU-2803:


 Summary: support non-primary key type alteration
 Key: KUDU-2803
 URL: https://issues.apache.org/jira/browse/KUDU-2803
 Project: Kudu
  Issue Type: Improvement
Reporter: HeLifu


We know that kudu does not allow the type of a column to be altered right now. 
But indeed, a very common case is RDMS -> kudu, and there will be type 
alterations inevitably.

Currently, here are two ways to get rid of this problem:
 # for the newly tables: predefine a big type for every column;
 # for the existing tables: stop app(write) -> add a new column with new type 
-> copy data -> drop old column -> rename new column to old column -> start app;

However, neither of them is elegant. So, is it possible to support non-primary 
key type alteration?

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2750) Add create timestamp to every table

2019-03-20 Thread HeLifu (JIRA)
HeLifu created KUDU-2750:


 Summary: Add create timestamp to every table
 Key: KUDU-2750
 URL: https://issues.apache.org/jira/browse/KUDU-2750
 Project: Kudu
  Issue Type: Improvement
  Components: master
Affects Versions: 1.9.1
Reporter: HeLifu


There seems to be no place to look at the creation time of a table, which makes 
it difficult to find the most recently created tables, especially in a cluster 
that has accumulated lots of tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-1711) Add support for storing column comments in ColumnSchema

2019-03-20 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796942#comment-16796942
 ] 

HeLifu commented on KUDU-1711:
--

Is anyone developing this feature already? If not, I'd like to try.

> Add support for storing column comments in ColumnSchema
> ---
>
> Key: KUDU-1711
> URL: https://issues.apache.org/jira/browse/KUDU-1711
> Project: Kudu
>  Issue Type: Improvement
>  Components: impala
>Affects Versions: 1.0.1
>Reporter: Dimitris Tsirogiannis
>Priority: Minor
>
> Currently, there is no way to persist column comments for Kudu tables unless 
> we store them in HMS. We should be able to store column comments in Kudu 
> through the ColumnSchema class. 
> Example of using column comments in a CREATE TABLE statement:
> {code}
> impala>create table foo (a int primary key comment 'this is column a') 
> distribute by hash (a) into 4 buckets stored as kudu;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2648) compaction does not run

2019-01-10 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739918#comment-16739918
 ] 

HeLifu commented on KUDU-2648:
--

70 MB of data in total, but 25024 rowsets: roughly 2.86 KB per rowset.

I guess you hit [KUDU-1400|https://issues.apache.org/jira/browse/KUDU-1400].

Can you check the 'Rowset Layout Diagram' of the tablet under this table?

> compaction does not run
> ---
>
> Key: KUDU-2648
> URL: https://issues.apache.org/jira/browse/KUDU-2648
> Project: Kudu
>  Issue Type: Bug
>  Components: tablet
>Affects Versions: 1.7.0
> Environment: 3 master nodes, 4c32g, ubuntu16.04
> 3 data nodes, 8c64g, 1.8T ssd, ubuntu16.04
>Reporter: huajian
>Assignee: Will Berkeley
>Priority: Major
>  Labels: compact
>
> Here is a table: project_construction_record, 62 columns, 170k records, no 
> partition.
> The table has many CRUD operations every day.
> I ran a simple SQL query on it (using Impala): 
>  
> {code:java}
> SELECT * FROM project_construction_record ORDER BY id LIMIT 1{code}
> It takes 7 seconds.
> By checking the profile, I found this:
> {quote}
> h4. KUDU_SCAN_NODE (id=0) (6.06 s)
>  * BytesRead: *0 bytes*
>  * CollectionItemsRead: *0*
>  * InactiveTotalTime: *0 ns*
>  * KuduRemoteScanTokens: *0*
>  * NumScannerThreadsStarted: *1*
>  * PeakMemoryUsage: *3.4 MB*
>  * RowsRead: *177,007*
>  * RowsReturned: *177,007*
>  * RowsReturnedRate: *29188/s*
>  * ScanRangesComplete: *1*
>  * ScannerThreadsInvoluntaryContextSwitches: *0*
>  * ScannerThreadsTotalWallClockTime: *6.09 s*
>  ** MaterializeTupleTime(*): *6.06 s*
>  ** ScannerThreadsSysTime: *48 ms*
>  ** ScannerThreadsUserTime: *172 ms*{quote}
> So I checked the scan of this SQL query, and found this:
> |column|cells read|bytes read|blocks read|
> |id|176.92k|1.91M|19.96k|
> |org_id|176.92k|1.91M|19.96k|
> |work_date|176.92k|2.03M|19.96k|
> |description|176.92k|1.21M|19.96k|
> |user_name|176.92k|775.9K|19.96k|
> |spot_name|176.92k|825.8K|19.96k|
> |spot_start_pile|176.92k|778.7K|19.96k|
> |spot_end_pile|176.92k|780.4K|19.96k|
> |..|..|..|..|
> There are so many blocks read.
> Then I ran the _*kudu fs list*_ command and got a 70 MB report; here is the 
> bottom of it:
>  
> {code:java}
> 0b6ac30b449043a68905e02b797144fc | 25024 | 40310988 | column
>  0b6ac30b449043a68905e02b797144fc | 25024 | 40310989 | column
>  0b6ac30b449043a68905e02b797144fc | 25024 | 40310990 | column
>  0b6ac30b449043a68905e02b797144fc | 25024 | 40310991 | column
>  0b6ac30b449043a68905e02b797144fc | 25024 | 40310992 | column
>  0b6ac30b449043a68905e02b797144fc | 25024 | 40310993 | column
>  0b6ac30b449043a68905e02b797144fc | 25024 | 40310996 | undo
>  0b6ac30b449043a68905e02b797144fc | 25024 | 40310994 | bloom
>  0b6ac30b449043a68905e02b797144fc | 25024 | 40310995 | adhoc-index{code}
>  
> There are 25024 rowsets and more than 1M blocks in the tablet.
> I left the maintenance and compaction flags at their defaults, only changing 
> tablet_history_max_age_sec to one day:
>  
>  
> {code:java}
> --maintenance_manager_history_size=8
> --maintenance_manager_num_threads=1
> --maintenance_manager_polling_interval_ms=250
> --budgeted_compaction_target_rowset_size=33554432
> --compaction_approximation_ratio=1.049523162842
> --compaction_minimum_improvement=0.009997764825821
> --deltafile_default_block_size=32768
> --deltafile_default_compression_codec=lz4
> --default_composite_key_index_block_size_bytes=4096
> --tablet_delta_store_major_compact_min_ratio=0.1000149011612
> --tablet_delta_store_minor_compact_max=1000
> --mrs_use_codegen=true
> --compaction_policy_dump_svgs_pattern=
> --enable_undo_delta_block_gc=true
> --fault_crash_before_flush_tablet_meta_after_compaction=0
> --fault_crash_before_flush_tablet_meta_after_flush_mrs=0
> --max_cell_size_bytes=65536
> --max_encoded_key_size_bytes=16384
> --tablet_bloom_block_size=4096
> --tablet_bloom_target_fp_rate=9.997473787516e-05
> --tablet_compaction_budget_mb=128
> --tablet_history_max_age_sec=86400{code}
> So my question is: *why does the compaction not run? Is it a bug? And what 
> can I do to compact manually?* 
> It is a production environment, and many other tables have the same issue; 
> the performance is getting slower and slower.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2653) The ASAN test failed on Debian 8.9

2019-01-04 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734728#comment-16734728
 ] 

HeLifu commented on KUDU-2653:
--

LSAN_OPTIONS=fast_unwind_on_malloc=0 ./bin/master-test

 
{code:java}
=
==47247==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 16 byte(s) in 1 object(s) allocated from:
 #0 0x550fb8 in __interceptor_malloc 
/mnt/ddb/2/helif/apache/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:88
 #1 0x7f5975bc8c77 in glob64 (/lib/x86_64-linux-gnu/libc.so.6+0xbdc77)
 #2 0x4c7079 in __interceptor_glob 
sanitizer_common/sanitizer_common_interceptors.inc:
 #3 0x7f597a0335f7 (/usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2+0x185f7)
 #4 0x7f597a033baa in gss_indicate_mechs 
(/usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2+0x18baa)
 #5 0x7f597a035935 in gss_indicate_mechs_by_attrs 
(/usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2+0x1a935)
 #6 0x7f596fbbbc95 in _init (/usr/lib/x86_64-linux-gnu/sasl2/libgs2.so+0x3c95)
 #7 0x7f596fbbcf8e in gs2_client_plug_init 
(/usr/lib/x86_64-linux-gnu/sasl2/libgs2.so+0x4f8e)
 #8 0x7f597a26ced9 in sasl_client_add_plugin 
(/usr/lib/x86_64-linux-gnu/libsasl2.so.2+0x6ed9)
 #9 0x7f597a278dff (/usr/lib/x86_64-linux-gnu/libsasl2.so.2+0x12dff)
 #10 0x7f597a26d7b0 in sasl_client_init 
(/usr/lib/x86_64-linux-gnu/libsasl2.so.2+0x77b0)
 #11 0x7f597b3e9125 in kudu::rpc::DoSaslInit(bool) 
/mnt/ddb/2/helif/apache/kudu/src/kudu/rpc/sasl_common.cc:240:16
 #12 0x7f597b3f0fca in void std::_Bind_simple::_M_invoke<0ul>(std::_Index_tuple<0ul>) 
../../../include/c++/4.9/functional:1699:18
 #13 0x7f597b3f0f14 in std::_Bind_simple::operator()() 
../../../include/c++/4.9/functional:1688:16
 #14 0x7f597b3f0c3b in void std::__once_call_impl >() ../../../include/c++/4.9/mutex:714:7
 #15 0x7f597f9bb40f in __GI___pthread_once 
(/lib/x86_64-linux-gnu/libpthread.so.0+0xd40f)
 #16 0x7f597b3eedda in __gthread_once(int*, void (*)()) 
../../../include/x86_64-linux-gnu/c++/4.9/bits/gthr-default.h:699:12
 #17 0x7f597b3f031f in void std::call_once(std::once_flag&, void (&)(bool), bool&) 
../../../include/c++/4.9/mutex:746:17
 #18 0x7f597b3e8902 in kudu::rpc::SaslInit(bool) 
/mnt/ddb/2/helif/apache/kudu/src/kudu/rpc/sasl_common.cc:270:3
 #19 0x7f597b30a6f3 in 
kudu::rpc::MessengerBuilder::Build(std::shared_ptr*) 
/mnt/ddb/2/helif/apache/kudu/src/kudu/rpc/messenger.cc:189:3
 #20 0x7f598317413b in kudu::server::ServerBase::Init() 
/mnt/ddb/2/helif/apache/kudu/src/kudu/server/server_base.cc:474:3
 #21 0x7f5983486203 in kudu::kserver::KuduServer::Init() 
/mnt/ddb/2/helif/apache/kudu/src/kudu/kserver/kserver.cc:136:3
 #22 0x7f5986fc8031 in kudu::master::Master::Init() 
/mnt/ddb/2/helif/apache/kudu/src/kudu/master/master.cc:152:3
 #23 0x7f598703b4b6 in kudu::master::MiniMaster::Start() 
/mnt/ddb/2/helif/apache/kudu/src/kudu/master/mini_master.cc:93:3
 #24 0x6133bc in kudu::master::MasterTest::SetUp() 
/mnt/ddb/2/helif/apache/kudu/src/kudu/master/master-test.cc:125:5
 #25 0x7f597941bbe7 in HandleSehExceptionsInMethodIfSupported 
/mnt/ddb/2/helif/apache/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2402
 #26 0x7f597941bbe7 in void 
testing::internal::HandleExceptionsInMethodIfSupported(testing::Test*, void (testing::Test::*)(), char const*) 
/mnt/ddb/2/helif/apache/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2438
 #27 0x7f59794096e5 in testing::Test::Run() 
/mnt/ddb/2/helif/apache/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2470
 #28 0x7f5979409897 in testing::TestInfo::Run() 
/mnt/ddb/2/helif/apache/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2656
 #29 0x7f5979409974 in testing::TestCase::Run() 
/mnt/ddb/2/helif/apache/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2774
 #30 0x7f5979410287 in testing::internal::UnitTestImpl::RunAllTests() 
/mnt/ddb/2/helif/apache/kudu/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:4649
SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s).
{code}
 

 

LSAN_OPTIONS=fast_unwind_on_malloc=0 ./bin/ts_tablet_manager-test

 
{code:java}
=
==520==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 16 byte(s) in 1 object(s) allocated from:
 #0 0x50c068 in __interceptor_malloc 
/mnt/ddb/2/helif/apache/kudu/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:88
 #1 0x7f506b090c77 in glob64 (/lib/x86_64-linux-gnu/libc.so.6+0xbdc77)
 #2 0x482129 in __interceptor_glob 
sanitizer_common/sanitizer_common_interceptors.inc:
 #3 0x7f507047b5f7 (/usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2+0x185f7)
 #4 0x7f507047bbaa in gss_indicate_mechs 
(/usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2+0x18baa)
 #5 0x7f507047d935 in gss_indicate_mechs_by_attrs 

[jira] [Created] (KUDU-2653) The ASAN test failed on Debian 8.9

2019-01-03 Thread HeLifu (JIRA)
HeLifu created KUDU-2653:


 Summary: The ASAN test failed on Debian 8.9
 Key: KUDU-2653
 URL: https://issues.apache.org/jira/browse/KUDU-2653
 Project: Kudu
  Issue Type: Bug
  Components: test
Affects Versions: 1.8.0, 1.7.0, 1.4.0
Reporter: HeLifu
 Attachments: ctest output.txt, test-logs.tar.gz

I tried to run the ASAN tests of branches 1.4.x, 1.7.x, 1.8.x, and master on 
Debian 8.9, but they all failed. It seems that there is a long-standing issue 
there. The errors are in the attachments.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KUDU-2453) kudu should stop creating tablet infinitely

2018-12-27 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493153#comment-16493153
 ] 

HeLifu edited comment on KUDU-2453 at 12/28/18 1:44 AM:


-I dug into it, and thought the reason was KUDU-1913.-

-version 1.4.x-


was (Author: helifu):
I dug into it, and thought the reason was KUDU-1913.

version 1.4.x

> kudu should stop creating tablet infinitely
> ---
>
> Key: KUDU-2453
> URL: https://issues.apache.org/jira/browse/KUDU-2453
> Project: Kudu
>  Issue Type: Bug
>  Components: master, tserver
>Affects Versions: 1.4.0, 1.7.2
>Reporter: HeLifu
>Priority: Major
>
> I have met this problem again on 2018/10/26. And now the kudu version is 
> 1.7.2.
> -
> We modified the flag 'max_create_tablets_per_ts' (2000) of master.conf, and 
> there are some load on the kudu cluster. Then someone else created a big 
> table which had tens of thousands of tablets from impala-shell (that was a 
> mistake). 
> {code:java}
> CREATE TABLE XXX(
> ...
>PRIMARY KEY (...)
> )
> PARTITION BY HASH (...) PARTITIONS 100,
> RANGE (...)
> (
>   PARTITION "2018-10-24" <= VALUES < "2018-10-24\000",
>   PARTITION "2018-10-25" <= VALUES < "2018-10-25\000",
>   ...
>   PARTITION "2018-12-07" <= VALUES < "2018-12-07\000"
> )
> STORED AS KUDU
> TBLPROPERTIES ('kudu.master_addresses'= '...');
> {code}
> Here are the logs after creating table (only pick one tablet as example):
> {code:java}
> --Kudu-master log
> ==e884bda6bbd3482f94c07ca0f34f99a4==
> W1024 11:40:51.914397 180146 catalog_manager.cc:2664] TS 
> 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): Create Tablet RPC 
> failed for tablet e884bda6bbd3482f94c07ca0f34f99a4: Remote error: Service 
> unavailable: CreateTablet request on kudu.tserver.TabletServerAdminService 
> from 10.120.219.118:50247 dropped due to backpressure. The service queue is 
> full; it has 512 items.
> I1024 11:40:51.914412 180146 catalog_manager.cc:2700] Scheduling retry of 
> CreateTablet RPC for tablet e884bda6bbd3482f94c07ca0f34f99a4 on TS 
> 39f15fcf42ef45bba0c95a3223dc25ee with a delay of 42 ms (attempt = 1)
> ...
> ==Be replaced by 0b144c00f35d48cca4d4981698faef72==
> W1024 11:41:22.114512 180202 catalog_manager.cc:3949] T 
>  P f6c9a09da7ef4fc191cab6276b942ba3: Tablet 
> e884bda6bbd3482f94c07ca0f34f99a4 (table quasi_realtime_user_feature 
> [id=946d6dd03ec544eab96231e5a03bed59]) was not created within the allowed 
> timeout. Replacing with a new tablet 0b144c00f35d48cca4d4981698faef72
> ...
> I1024 11:41:22.391916 180202 catalog_manager.cc:3806] T 
>  P f6c9a09da7ef4fc191cab6276b942ba3: Sending 
> DeleteTablet for 3 replicas of tablet e884bda6bbd3482f94c07ca0f34f99a4
> ...
> I1024 11:41:22.391927 180202 catalog_manager.cc:2922] Sending 
> DeleteTablet(TABLET_DATA_DELETED) for tablet e884bda6bbd3482f94c07ca0f34f99a4 
> on 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050) (Replaced by 
> 0b144c00f35d48cca4d4981698faef72 at 2018-10-24 11:41:22 CST)
> ...
> W1024 11:41:22.428129 180146 catalog_manager.cc:2892] TS 
> 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): delete failed for 
> tablet e884bda6bbd3482f94c07ca0f34f99a4 with error code TABLET_NOT_RUNNING: 
> Already present: State transition of tablet e884bda6bbd3482f94c07ca0f34f99a4 
> already in progress: creating tablet
> ...
> I1024 11:41:22.428143 180146 catalog_manager.cc:2700] Scheduling retry of 
> e884bda6bbd3482f94c07ca0f34f99a4 Delete Tablet RPC for 
> TS=39f15fcf42ef45bba0c95a3223dc25ee with a delay of 35 ms (attempt = 1)
> ...
> W1024 11:41:22.683702 180145 catalog_manager.cc:2664] TS 
> b251540e606b4863bb576091ff961892 (kudu1.lt.163.org:7050): Create Tablet RPC 
> failed for tablet 0b144c00f35d48cca4d4981698faef72: Remote error: Service 
> unavailable: CreateTablet request on kudu.tserver.TabletServerAdminService 
> from 10.120.219.118:59735 dropped due to backpressure. The service queue is 
> full; it has 512 items.
> I1024 11:41:22.683717 180145 catalog_manager.cc:2700] Scheduling retry of 
> CreateTablet RPC for tablet 0b144c00f35d48cca4d4981698faef72 on TS 
> b251540e606b4863bb576091ff961892 with a delay of 46 ms (attempt = 1)
> ...
> ==Be replaced by c0e0acc448fc42fc9e48f5025b112a75==
> W1024 11:41:52.775420 180202 catalog_manager.cc:3949] T 
>  P f6c9a09da7ef4fc191cab6276b942ba3: Tablet 
> 0b144c00f35d48cca4d4981698faef72 (table quasi_realtime_user_feature 
> [id=946d6dd03ec544eab96231e5a03bed59]) was not created within the allowed 
> timeout. Replacing with a new tablet c0e0acc448fc42fc9e48f5025b112a75
> ...
> 

[jira] [Commented] (KUDU-2646) kudu restart the tablets stats from INITIALIZED change to RUNNING cost a few days

2018-12-27 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729984#comment-16729984
 ] 

HeLifu commented on KUDU-2646:
--

Ah, I thought it was RAID0 on each disk separately :(

> kudu restart the tablets stats from INITIALIZED change to RUNNING cost a few 
> days
> -
>
> Key: KUDU-2646
> URL: https://issues.apache.org/jira/browse/KUDU-2646
> Project: Kudu
>  Issue Type: Bug
>Reporter: qinzl_1
>Priority: Major
> Fix For: n/a
>
> Attachments: kudu-tserver (1).INFO.gz
>
>
> [^kudu-tserver (1).INFO.gz] I installed Kudu from Cloudera Manager; I have 3 
> masters and 4 tablet servers, without any special config. When I restart the 
> server, it cannot offer service: I found all the tablet servers were 
> INITIALIZED, and it took a long time for them to change to RUNNING.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2646) kudu restart the tablets stats from INITIALIZED change to RUNNING cost a few days

2018-12-27 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729493#comment-16729493
 ] 

HeLifu commented on KUDU-2646:
--

Your case is the same as [~jiaqiyang]'s: only one disk is in use even though 
you have 6 disks.

The '--fs_data_dirs=' flag should be configured to look something like this:

_fs_data_dirs=/cdh/kudu/tserver/{color:#d04437}disk1{color},/cdh/kudu/tserver/{color:#d04437}disk2{color},/cdh/kudu/tserver/{color:#d04437}disk3{color},/cdh/kudu/tserver/{color:#d04437}disk4{color},/cdh/kudu/tserver/{color:#d04437}disk5{color},/cdh/kudu/tserver/{color:#d04437}disk6{color}_

 

 

 
{code:java}
I1221 16:00:15.634905 135800 fs_report.cc:347] Block manager report

1 data directories: /cdh/kudu/tserver/fdd/data
Total live blocks: 33688986
Total live bytes: 134814618486
Total live bytes (after alignment): 259383992320
Total number of LBM containers: 21179 (8196 full)
Did not check for missing blocks
Did not check for orphaned blocks
Total full LBM containers with extra space: 45 (45 repaired)
Total full LBM container extra space in bytes: 4947722240 (4947722240 repaired)
Total incomplete LBM containers: 0 (0 repaired)
Total LBM partial records: 0 (0 repaired){code}
 

> kudu restart the tablets stats from INITIALIZED change to RUNNING cost a few 
> days
> -
>
> Key: KUDU-2646
> URL: https://issues.apache.org/jira/browse/KUDU-2646
> Project: Kudu
>  Issue Type: Bug
>Reporter: qinzl_1
>Priority: Major
> Fix For: n/a
>
> Attachments: kudu-tserver (1).INFO.gz
>
>
> [^kudu-tserver (1).INFO.gz] I installed Kudu from Cloudera Manager; I have 3 
> masters and 4 tablet servers, without any special config. When I restart the 
> server, it cannot offer service: I found all the tablet servers were 
> INITIALIZED, and it took a long time for them to change to RUNNING.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KUDU-2646) kudu restart the tablets stats from INITIALIZED change to RUNNING cost a few days

2018-12-27 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729493#comment-16729493
 ] 

HeLifu edited comment on KUDU-2646 at 12/27/18 10:03 AM:
-

Your case is the same as [~jiaqiyang]'s: only one disk is in use even though 
you have 6 disks.

The '--fs_data_dirs=' flag should be configured to look something like this:

_fs_data_dirs=/cdh/kudu/tserver/{color:#d04437}disk1{color},/cdh/kudu/tserver/{color:#d04437}disk2{color},/cdh/kudu/tserver/{color:#d04437}disk3{color},/cdh/kudu/tserver/{color:#d04437}disk4{color},/cdh/kudu/tserver/{color:#d04437}disk5{color},/cdh/kudu/tserver/{color:#d04437}disk6{color}_
 
{code:java}
I1221 16:00:15.634905 135800 fs_report.cc:347] Block manager report

1 data directories: /cdh/kudu/tserver/fdd/data
Total live blocks: 33688986
Total live bytes: 134814618486
Total live bytes (after alignment): 259383992320
Total number of LBM containers: 21179 (8196 full)
Did not check for missing blocks
Did not check for orphaned blocks
Total full LBM containers with extra space: 45 (45 repaired)
Total full LBM container extra space in bytes: 4947722240 (4947722240 repaired)
Total incomplete LBM containers: 0 (0 repaired)
Total LBM partial records: 0 (0 repaired){code}
 


was (Author: helifu):
Your case is the same with [~jiaqiyang]'s. Only one disk is in use even if you 
have 6 disk.

The '--fs_data_dirs=' should be configured to look something like this:

_fs_data_dirs=/cdh/kudu/tserver/{color:#d04437}disk1{color},/cdh/kudu/tserver/{color:#d04437}disk2{color},/cdh/kudu/tserver/{color:#d04437}disk3{color},/cdh/kudu/tserver/{color:#d04437}disk4{color},/cdh/kudu/tserver/{color:#d04437}disk5{color},/cdh/kudu/tserver/{color:#d04437}disk6{color}_

 

 

 
{code:java}
I1221 16:00:15.634905 135800 fs_report.cc:347] Block manager report

1 data directories: /cdh/kudu/tserver/fdd/data
Total live blocks: 33688986
Total live bytes: 134814618486
Total live bytes (after alignment): 259383992320
Total number of LBM containers: 21179 (8196 full)
Did not check for missing blocks
Did not check for orphaned blocks
Total full LBM containers with extra space: 45 (45 repaired)
Total full LBM container extra space in bytes: 4947722240 (4947722240 repaired)
Total incomplete LBM containers: 0 (0 repaired)
Total LBM partial records: 0 (0 repaired){code}
 

> kudu restart the tablets stats from INITIALIZED change to RUNNING cost a few 
> days
> -
>
> Key: KUDU-2646
> URL: https://issues.apache.org/jira/browse/KUDU-2646
> Project: Kudu
>  Issue Type: Bug
>Reporter: qinzl_1
>Priority: Major
> Fix For: n/a
>
> Attachments: kudu-tserver (1).INFO.gz
>
>
> [^kudu-tserver (1).INFO.gz]I installed Kudu from Cloudera Manager; I have 3 
> masters and 4 tablet servers, without any special configuration. When I 
> restart the servers they cannot offer service: I found every tablet server 
> stuck in INITIALIZED, and it takes a long time to change to RUNNING.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2289) Tablet deletion should be throttled

2018-12-17 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723636#comment-16723636
 ] 

HeLifu commented on KUDU-2289:
--

It seems that this has already been done. :)

> Tablet deletion should be throttled
> ---
>
> Key: KUDU-2289
> URL: https://issues.apache.org/jira/browse/KUDU-2289
> Project: Kudu
>  Issue Type: Improvement
>  Components: tserver
>Reporter: Todd Lipcon
>Assignee: Will Berkeley
>Priority: Major
>
> Currently if a large amount of data is deleted simultaneously, the master 
> will not do any throttling of the DeleteTablet requests send to the tservers. 
> The tservers will use up to the configured number of service threads to work 
> on deleting tablets. The deletion can be relatively heavy-weight -- lots of 
> file system operations to hole-punch out the dead tablets, etc. This can have 
> a negative impact on other concurrent workloads.
> It would be desirable to do some throttling either on the master or tserver 
> side to avoid overwhelming disks and thread resources during heavy deletion.
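
For illustration, a minimal token-bucket sketch of the kind of throttle the 
description asks for; the class name and rate parameter are made-up 
assumptions, not Kudu's actual implementation:

{code:cpp}
#include <algorithm>
#include <chrono>
#include <mutex>

// Hypothetical throttle: caps how many DeleteTablet operations may start per
// second, so a mass deletion cannot monopolize disks and service threads.
class DeleteThrottler {
 public:
  explicit DeleteThrottler(double ops_per_sec)
      : rate_(ops_per_sec), tokens_(ops_per_sec), last_(Clock::now()) {}

  // Returns true if a delete may start now; a caller that gets false
  // re-schedules the delete later, spreading the hole-punching I/O over time.
  bool TryAcquire() {
    std::lock_guard<std::mutex> l(mu_);
    const auto now = Clock::now();
    const std::chrono::duration<double> elapsed = now - last_;
    last_ = now;
    // Refill proportionally to elapsed time, capped at one second's burst.
    tokens_ = std::min(rate_, tokens_ + elapsed.count() * rate_);
    if (tokens_ < 1.0) return false;
    tokens_ -= 1.0;
    return true;
  }

 private:
  using Clock = std::chrono::steady_clock;
  const double rate_;
  double tokens_;
  std::mutex mu_;
  Clock::time_point last_;
};
{code}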



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2636) LBM supports deleting the full container which is dead after hole punch

2018-12-12 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2636:
-
External issue URL:   (was: https://issues.apache.org/jira/browse/KUDU-2014)

> LBM supports deleting the full container which is dead after hole punch
> ---
>
> Key: KUDU-2636
> URL: https://issues.apache.org/jira/browse/KUDU-2636
> Project: Kudu
>  Issue Type: Improvement
>  Components: fs, util
>Affects Versions: 1.8.0
>Reporter: HeLifu
>Priority: Major
>
> Right now, the LBM does not support deleting the full container which is dead 
> after hole punching, and after running for some time, there will be lots of 
> dead containers that will affect the startup time. So, it is necessary to 
> delete these files while hole punching.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2636) LBM supports deleting the full container which is dead after hole punch

2018-12-12 Thread HeLifu (JIRA)
HeLifu created KUDU-2636:


 Summary: LBM supports deleting the full container which is dead 
after hole punch
 Key: KUDU-2636
 URL: https://issues.apache.org/jira/browse/KUDU-2636
 Project: Kudu
  Issue Type: Improvement
  Components: fs, util
Affects Versions: 1.8.0
Reporter: HeLifu


Right now, the LBM does not support deleting the full container which is dead 
after hole punching, and after running for some time, there will be lots of 
dead containers that will affect the startup time. So, it is necessary to 
delete these files while hole punching.
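
To make the mechanism concrete, here is a minimal sketch of hole punching at 
the filesystem level, assuming Linux fallocate(2); it is an illustration with 
made-up file names, not Kudu's LBM code:

{code:cpp}
#include <fcntl.h>
#include <linux/falloc.h>  // FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE
#include <unistd.h>
#include <cstdio>

// Release the disk space backing [offset, offset+len) of a container file.
// The extents are freed, but the inode and directory entry survive, so a
// fully punched ("dead") container must still be opened and parsed at
// startup until someone finally unlink()s it.
static bool PunchHole(int fd, off_t offset, off_t len) {
  return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                   offset, len) == 0;
}

int main() {
  int fd = open("container.data", O_WRONLY);  // hypothetical container file
  if (fd < 0) { perror("open"); return 1; }
  if (!PunchHole(fd, 0, 4096)) perror("fallocate");
  close(fd);
  // What this issue asks for, conceptually: once every block in a full
  // container is dead, delete the files instead of keeping punched husks:
  //   unlink("container.data");
  //   unlink("container.metadata");
  return 0;
}
{code}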



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2630) kudu scan performance is too terrible

2018-11-23 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16696581#comment-16696581
 ] 

HeLifu commented on KUDU-2630:
--

Please check the *rowset layout diagram* of the tablet in that table first (it 
is shown on the tablet server's web UI); there may be some valuable info there.

> kudu scan performance is too terrible
> -
>
> Key: KUDU-2630
> URL: https://issues.apache.org/jira/browse/KUDU-2630
> Project: Kudu
>  Issue Type: Bug
>Reporter: qinzl
>Priority: Major
> Attachments: menu.saveimg.savepath20181123165144.jpg, 
> menu.saveimg.savepath20181123165728.jpg
>
>
> Scanning a table of 800 records costs 15 minutes.
> !menu.saveimg.savepath20181123165728.jpg!!menu.saveimg.savepath20181123165144.jpg!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2453) kudu should stop creating tablet infinitely

2018-11-01 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2453:
-
Description: 
I have run into this problem again on 2018/10/26; the Kudu version is now 1.7.2.

-

We set the flag 'max_create_tablets_per_ts' to 2000 in master.conf, and there 
was some load on the kudu cluster. Then someone else created a big table with 
tens of thousands of tablets from impala-shell (that was a mistake). 
{code:java}
CREATE TABLE XXX(
...
   PRIMARY KEY (...)
)
PARTITION BY HASH (...) PARTITIONS 100,
RANGE (...)
(
  PARTITION "2018-10-24" <= VALUES < "2018-10-24\000",
  PARTITION "2018-10-25" <= VALUES < "2018-10-25\000",
  ...
  PARTITION "2018-12-07" <= VALUES < "2018-12-07\000"
)
STORED AS KUDU
TBLPROPERTIES ('kudu.master_addresses'= '...');
{code}
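
For scale: if the elided range partitions cover every day from 2018-10-24 
through 2018-12-07, that is 45 daily partitions, so this schema yields 
100 x 45 = 4,500 tablets, or 13,500 replicas assuming Kudu's default 
replication factor of 3. That volume of CreateTablet work is what drives the 
backpressure in the logs below.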
Here are the logs after creating the table (picking one tablet as an example):
{code:java}
--Kudu-master log
==e884bda6bbd3482f94c07ca0f34f99a4==
W1024 11:40:51.914397 180146 catalog_manager.cc:2664] TS 
39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): Create Tablet RPC 
failed for tablet e884bda6bbd3482f94c07ca0f34f99a4: Remote error: Service 
unavailable: CreateTablet request on kudu.tserver.TabletServerAdminService from 
10.120.219.118:50247 dropped due to backpressure. The service queue is full; it 
has 512 items.
I1024 11:40:51.914412 180146 catalog_manager.cc:2700] Scheduling retry of 
CreateTablet RPC for tablet e884bda6bbd3482f94c07ca0f34f99a4 on TS 
39f15fcf42ef45bba0c95a3223dc25ee with a delay of 42 ms (attempt = 1)
...

==Be replaced by 0b144c00f35d48cca4d4981698faef72==
W1024 11:41:22.114512 180202 catalog_manager.cc:3949] T 
 P f6c9a09da7ef4fc191cab6276b942ba3: Tablet 
e884bda6bbd3482f94c07ca0f34f99a4 (table quasi_realtime_user_feature 
[id=946d6dd03ec544eab96231e5a03bed59]) was not created within the allowed 
timeout. Replacing with a new tablet 0b144c00f35d48cca4d4981698faef72
...
I1024 11:41:22.391916 180202 catalog_manager.cc:3806] T 
 P f6c9a09da7ef4fc191cab6276b942ba3: Sending 
DeleteTablet for 3 replicas of tablet e884bda6bbd3482f94c07ca0f34f99a4
...
I1024 11:41:22.391927 180202 catalog_manager.cc:2922] Sending 
DeleteTablet(TABLET_DATA_DELETED) for tablet e884bda6bbd3482f94c07ca0f34f99a4 
on 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050) (Replaced by 
0b144c00f35d48cca4d4981698faef72 at 2018-10-24 11:41:22 CST)
...
W1024 11:41:22.428129 180146 catalog_manager.cc:2892] TS 
39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): delete failed for 
tablet e884bda6bbd3482f94c07ca0f34f99a4 with error code TABLET_NOT_RUNNING: 
Already present: State transition of tablet e884bda6bbd3482f94c07ca0f34f99a4 
already in progress: creating tablet
...
I1024 11:41:22.428143 180146 catalog_manager.cc:2700] Scheduling retry of 
e884bda6bbd3482f94c07ca0f34f99a4 Delete Tablet RPC for 
TS=39f15fcf42ef45bba0c95a3223dc25ee with a delay of 35 ms (attempt = 1)
...
W1024 11:41:22.683702 180145 catalog_manager.cc:2664] TS 
b251540e606b4863bb576091ff961892 (kudu1.lt.163.org:7050): Create Tablet RPC 
failed for tablet 0b144c00f35d48cca4d4981698faef72: Remote error: Service 
unavailable: CreateTablet request on kudu.tserver.TabletServerAdminService from 
10.120.219.118:59735 dropped due to backpressure. The service queue is full; it 
has 512 items.
I1024 11:41:22.683717 180145 catalog_manager.cc:2700] Scheduling retry of 
CreateTablet RPC for tablet 0b144c00f35d48cca4d4981698faef72 on TS 
b251540e606b4863bb576091ff961892 with a delay of 46 ms (attempt = 1)
...

==Be replaced by c0e0acc448fc42fc9e48f5025b112a75==
W1024 11:41:52.775420 180202 catalog_manager.cc:3949] T 
 P f6c9a09da7ef4fc191cab6276b942ba3: Tablet 
0b144c00f35d48cca4d4981698faef72 (table quasi_realtime_user_feature 
[id=946d6dd03ec544eab96231e5a03bed59]) was not created within the allowed 
timeout. Replacing with a new tablet c0e0acc448fc42fc9e48f5025b112a75
...


--Kudu-tserver log
I1024 11:40:52.014571 137358 tablet_service.cc:758] Processing CreateTablet for 
tablet e884bda6bbd3482f94c07ca0f34f99a4 (table=quasi_realtime_user_feature 
[id=946d6dd03ec544eab96231e5a03bed59]), partition=HASH (user_id) PARTITION 29, 
RANGE (dt) PARTITION "2018-11-10" <= VALUES < "2018-11-10\000"
...
I1024 11:40:52.017539 137358 ts_tablet_manager.cc:1080] T 
e884bda6bbd3482f94c07ca0f34f99a4 P 39f15fcf42ef45bba0c95a3223dc25ee: Registered 
tablet (data state: TABLET_DATA_READY)
...
I1024 11:41:22.392292 137355 tablet_service.cc:799] Processing DeleteTablet for 
tablet e884bda6bbd3482f94c07ca0f34f99a4 with delete_type TABLET_DATA_DELETED 
(Replaced by 0b144c00f35d48cca4d4981698faef72 at 2018-10-24 11:41:22 CST) from 

[jira] [Updated] (KUDU-2453) kudu should stop creating tablet infinitely

2018-10-31 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2453:
-
Description: 
I have run into this problem again on 2018/10/26; the Kudu version is now 1.7.2.

-

We set the flag 'max_create_tablets_per_ts' to 2000 in master.conf, and there 
was some load on the kudu cluster. Then someone else created a big table with 
tens of thousands of tablets from impala-shell (that was a mistake). 
{code:java}
CREATE TABLE XXX(
...
   PRIMARY KEY (...)
)
PARTITION BY HASH (...) PARTITIONS 100,
RANGE (...)
(
  PARTITION "2018-10-24" <= VALUES < "2018-10-24\000",
  PARTITION "2018-10-25" <= VALUES < "2018-10-25\000",
  ...
  PARTITION "2018-12-07" <= VALUES < "2018-12-07\000"
)
STORED AS KUDU
TBLPROPERTIES ('kudu.master_addresses'= '...');
{code}
Here are the logs after creating the table (picking one tablet as an example):
{code:java}
--Kudu-master log
==e884bda6bbd3482f94c07ca0f34f99a4==
W1024 11:40:51.914397 180146 catalog_manager.cc:2664] TS 
39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): Create Tablet RPC 
failed for tablet e884bda6bbd3482f94c07ca0f34f99a4: Remote error: Service 
unavailable: CreateTablet request on kudu.tserver.TabletServerAdminService from 
10.120.219.118:50247 dropped due to backpressure. The service queue is full; it 
has 512 items.
I1024 11:40:51.914412 180146 catalog_manager.cc:2700] Scheduling retry of 
CreateTablet RPC for tablet e884bda6bbd3482f94c07ca0f34f99a4 on TS 
39f15fcf42ef45bba0c95a3223dc25ee with a delay of 42 ms (attempt = 1)
...

==Be replaced by 0b144c00f35d48cca4d4981698faef72==
W1024 11:41:22.114512 180202 catalog_manager.cc:3949] T 
 P f6c9a09da7ef4fc191cab6276b942ba3: Tablet 
e884bda6bbd3482f94c07ca0f34f99a4 (table quasi_realtime_user_feature 
[id=946d6dd03ec544eab96231e5a03bed59]) was not created within the allowed 
timeout. Replacing with a new tablet 0b144c00f35d48cca4d4981698faef72
...
I1024 11:41:22.391916 180202 catalog_manager.cc:3806] T 
 P f6c9a09da7ef4fc191cab6276b942ba3: Sending 
DeleteTablet for 3 replicas of tablet e884bda6bbd3482f94c07ca0f34f99a4
...
I1024 11:41:22.391927 180202 catalog_manager.cc:2922] Sending 
DeleteTablet(TABLET_DATA_DELETED) for tablet e884bda6bbd3482f94c07ca0f34f99a4 
on 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050) (Replaced by 
0b144c00f35d48cca4d4981698faef72 at 2018-10-24 11:41:22 CST)
...
W1024 11:41:22.428129 180146 catalog_manager.cc:2892] TS 
39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): delete failed for 
tablet e884bda6bbd3482f94c07ca0f34f99a4 with error code TABLET_NOT_RUNNING: 
Already present: State transition of tablet e884bda6bbd3482f94c07ca0f34f99a4 
already in progress: creating tablet
...
I1024 11:41:22.428143 180146 catalog_manager.cc:2700] Scheduling retry of 
e884bda6bbd3482f94c07ca0f34f99a4 Delete Tablet RPC for 
TS=39f15fcf42ef45bba0c95a3223dc25ee with a delay of 35 ms (attempt = 1)
...
W1024 11:41:22.683702 180145 catalog_manager.cc:2664] TS 
b251540e606b4863bb576091ff961892 (kudu1.lt.163.org:7050): Create Tablet RPC 
failed for tablet 0b144c00f35d48cca4d4981698faef72: Remote error: Service 
unavailable: CreateTablet request on kudu.tserver.TabletServerAdminService from 
10.120.219.118:59735 dropped due to backpressure. The service queue is full; it 
has 512 items.
I1024 11:41:22.683717 180145 catalog_manager.cc:2700] Scheduling retry of 
CreateTablet RPC for tablet 0b144c00f35d48cca4d4981698faef72 on TS 
b251540e606b4863bb576091ff961892 with a delay of 46 ms (attempt = 1)
...

==Be replaced by c0e0acc448fc42fc9e48f5025b112a75==
W1024 11:41:52.775420 180202 catalog_manager.cc:3949] T 
 P f6c9a09da7ef4fc191cab6276b942ba3: Tablet 
0b144c00f35d48cca4d4981698faef72 (table quasi_realtime_user_feature 
[id=946d6dd03ec544eab96231e5a03bed59]) was not created within the allowed 
timeout. Replacing with a new tablet c0e0acc448fc42fc9e48f5025b112a75
...


--Kudu-tserver log
I1024 11:40:52.014571 137358 tablet_service.cc:758] Processing CreateTablet for 
tablet e884bda6bbd3482f94c07ca0f34f99a4 (table=quasi_realtime_user_feature 
[id=946d6dd03ec544eab96231e5a03bed59]), partition=HASH (user_id) PARTITION 29, 
RANGE (dt) PARTITION "2018-11-10" <= VALUES < "2018-11-10\000"
...
I1024 11:40:52.017539 137358 ts_tablet_manager.cc:1080] T 
e884bda6bbd3482f94c07ca0f34f99a4 P 39f15fcf42ef45bba0c95a3223dc25ee: Registered 
tablet (data state: TABLET_DATA_READY)
...
I1024 11:41:22.392292 137355 tablet_service.cc:799] Processing DeleteTablet for 
tablet e884bda6bbd3482f94c07ca0f34f99a4 with delete_type TABLET_DATA_DELETED 
(Replaced by 0b144c00f35d48cca4d4981698faef72 at 2018-10-24 11:41:22 CST) from 

[jira] [Updated] (KUDU-2453) kudu should stop creating tablet infinitely

2018-10-31 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2453:
-
Summary: kudu should stop creating tablet infinitely  (was: kudu will 
create tablet infinitely while there are more than 2000 tables on the tserver)

> kudu should stop creating tablet infinitely
> ---
>
> Key: KUDU-2453
> URL: https://issues.apache.org/jira/browse/KUDU-2453
> Project: Kudu
>  Issue Type: Bug
>  Components: master, tserver
>Affects Versions: 1.4.0, 1.7.2
>Reporter: HeLifu
>Priority: Major
>
> I have run into this problem again on 2018/10/26; the Kudu version is now 
> 1.7.2.
> kudu-master's log as below:
> {code:java}
> I1031 16:21:21.644222 180146 catalog_manager.cc:2922] Sending 
> DeleteTablet(TABLET_DATA_DELETED) for tablet d1fd56be8eef44e782d509a0eeae9c15 
> on 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050) (Replaced by 
> ff4fd0a538944d69b8a6beea81e5bb01 at 2018-10-24 12:39:17 CST)
> W1031 16:21:21.644421 180146 catalog_manager.cc:2892] TS 
> 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): delete failed for 
> tablet d1fd56be8eef44e782d509a0eeae9c15 with error code TABLET_NOT_RUNNING: 
> Already present: State transition of tablet d1fd56be8eef44e782d509a0eeae9c15 
> already in progress: creating tablet
> I1031 16:21:21.644436 180146 catalog_manager.cc:2700] Scheduling retry of 
> d1fd56be8eef44e782d509a0eeae9c15 Delete Tablet RPC for 
> TS=39f15fcf42ef45bba0c95a3223dc25ee with a delay of 553 ms (attempt = 6)
> {code}
> kudu-tserver's log as below:
>  
> {code:java}
> I1031 16:21:22.197888 137341 tablet_service.cc:799] Processing DeleteTablet 
> for tablet d1fd56be8eef44e782d509a0eeae9c15 with delete_type 
> TABLET_DATA_DELETED (Replaced by ff4fd0a538944d69b8a6beea81e5bb01 at 
> 2018-10-24 12:39:17 CST) from {username='kudu'} at 10.120.219.118:50247
> I1031 16:21:22.230309 137131 maintenance_manager.cc:492] P 
> 39f15fcf42ef45bba0c95a3223dc25ee: 
> FlushDeltaMemStoresOp(70499bc0f9ac4d8196ae5a0be6ef0b8b) complete. Timing: 
> real 0.416suser 0.404s sys 0.008s Metrics: 
> {"fdatasync":3,"fdatasync_us":2583,"lbm_write_time_us":29,"lbm_writes_lt_1ms":4}
> I1031 16:21:22.321700 137341 tablet_service.cc:799] Processing DeleteTablet 
> for tablet 74a30181dea9400a9bcfaeb56f83f379 with delete_type 
> TABLET_DATA_DELETED (Replaced by 31e350fddea443048946f5a20d3171bd at 
> 2018-10-31 16:21:13 CST) from {username='kudu'} at 10.120.219.118:50247
> I1031 16:21:22.350440 137341 tablet_service.cc:799] Processing DeleteTablet 
> for tablet 7c864af01309432c9a2a4d1c88bbe52b with delete_type 
> TABLET_DATA_DELETED (Replaced by ec4b733818d940e0af32c51bda3c7^C
> {code}
>  
> ---
> We set the flag '{color:#FF}max_create_tablets_per_ts{color}' to 2000 in 
> master.conf, and there was some load on the kudu cluster. Then someone else 
> created a big table with tens of thousands of tablets from impala-shell (it 
> was a mistake).
> The wait was so long that he hit "ctrl+c". But we found that the number of 
> tablets in 'INITIALIZED' status was growing rapidly; half an hour later it 
> was 350,000 :(
> We deleted this table with the kudu client tool and found that the number of 
> 'INITIALIZED' tablets went down only slowly. By a simple estimate it would 
> take 10+ days to get back to normal. But luckily, the application systems 
> were not affected.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KUDU-2453) kudu should stop creating tablet infinitely

2018-10-31 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493153#comment-16493153
 ] 

HeLifu edited comment on KUDU-2453 at 10/31/18 8:56 AM:


I dug into it, and thought the reason was KUDU-1913.

version 1.4.x


was (Author: helifu):
I dug into it, and thought the reason was 
[KUDU-1913|https://issues.apache.org/jira/browse/KUDU-1913]

> kudu should stop creating tablet infinitely
> ---
>
> Key: KUDU-2453
> URL: https://issues.apache.org/jira/browse/KUDU-2453
> Project: Kudu
>  Issue Type: Bug
>  Components: master, tserver
>Affects Versions: 1.4.0, 1.7.2
>Reporter: HeLifu
>Priority: Major
>
> I have run into this problem again on 2018/10/26; the Kudu version is now 
> 1.7.2.
> kudu-master's log as below:
> {code:java}
> I1031 16:21:21.644222 180146 catalog_manager.cc:2922] Sending 
> DeleteTablet(TABLET_DATA_DELETED) for tablet d1fd56be8eef44e782d509a0eeae9c15 
> on 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050) (Replaced by 
> ff4fd0a538944d69b8a6beea81e5bb01 at 2018-10-24 12:39:17 CST)
> W1031 16:21:21.644421 180146 catalog_manager.cc:2892] TS 
> 39f15fcf42ef45bba0c95a3223dc25ee (kudu2.lt.163.org:7050): delete failed for 
> tablet d1fd56be8eef44e782d509a0eeae9c15 with error code TABLET_NOT_RUNNING: 
> Already present: State transition of tablet d1fd56be8eef44e782d509a0eeae9c15 
> already in progress: creating tablet
> I1031 16:21:21.644436 180146 catalog_manager.cc:2700] Scheduling retry of 
> d1fd56be8eef44e782d509a0eeae9c15 Delete Tablet RPC for 
> TS=39f15fcf42ef45bba0c95a3223dc25ee with a delay of 553 ms (attempt = 6)
> {code}
> kudu-tserver's log as below:
>  
> {code:java}
> I1031 16:21:22.197888 137341 tablet_service.cc:799] Processing DeleteTablet 
> for tablet d1fd56be8eef44e782d509a0eeae9c15 with delete_type 
> TABLET_DATA_DELETED (Replaced by ff4fd0a538944d69b8a6beea81e5bb01 at 
> 2018-10-24 12:39:17 CST) from {username='kudu'} at 10.120.219.118:50247
> I1031 16:21:22.230309 137131 maintenance_manager.cc:492] P 
> 39f15fcf42ef45bba0c95a3223dc25ee: 
> FlushDeltaMemStoresOp(70499bc0f9ac4d8196ae5a0be6ef0b8b) complete. Timing: 
> real 0.416suser 0.404s sys 0.008s Metrics: 
> {"fdatasync":3,"fdatasync_us":2583,"lbm_write_time_us":29,"lbm_writes_lt_1ms":4}
> I1031 16:21:22.321700 137341 tablet_service.cc:799] Processing DeleteTablet 
> for tablet 74a30181dea9400a9bcfaeb56f83f379 with delete_type 
> TABLET_DATA_DELETED (Replaced by 31e350fddea443048946f5a20d3171bd at 
> 2018-10-31 16:21:13 CST) from {username='kudu'} at 10.120.219.118:50247
> I1031 16:21:22.350440 137341 tablet_service.cc:799] Processing DeleteTablet 
> for tablet 7c864af01309432c9a2a4d1c88bbe52b with delete_type 
> TABLET_DATA_DELETED (Replaced by ec4b733818d940e0af32c51bda3c7^C
> {code}
>  
> ---
> We set the flag '{color:#FF}max_create_tablets_per_ts{color}' to 2000 in 
> master.conf, and there was some load on the kudu cluster. Then someone else 
> created a big table with tens of thousands of tablets from impala-shell (it 
> was a mistake).
> The wait was so long that he hit "ctrl+c". But we found that the number of 
> tablets in 'INITIALIZED' status was growing rapidly; half an hour later it 
> was 350,000 :(
> We deleted this table with the kudu client tool and found that the number of 
> 'INITIALIZED' tablets went down only slowly. By a simple estimate it would 
> take 10+ days to get back to normal. But luckily, the application systems 
> were not affected.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2453) kudu will create tablet infinitely while there are more than 2000 tables on the tserver

2018-10-31 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2453:
-
Affects Version/s: (was: 1.7.1)
   1.7.2

> kudu will create tablet infinitely while there are more than 2000 tables on 
> the tserver
> ---
>
> Key: KUDU-2453
> URL: https://issues.apache.org/jira/browse/KUDU-2453
> Project: Kudu
>  Issue Type: Bug
>  Components: master, tserver
>Affects Versions: 1.4.0, 1.7.2
>Reporter: HeLifu
>Priority: Major
>
> I have run into this problem again on 2018/10/26; the Kudu version is 1.7.2.
> Once there are more than 2000 tablets (a threshold value) on one tserver 
> and, at the same time, the queue for the consensus service is full, then if 
> we create a new table we can see the number of new tablets rise without 
> bound on the tserver.
> I think it is related to the master's logic for creating tablets, especially 
> the replacement operation.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2453) kudu will create tablet infinitely while there are more than 2000 tables on the tserver

2018-10-31 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2453:
-
Description: 
I have run into this problem again on 2018/10/26; the Kudu version is 1.7.2.

Once there are more than 2000 tablets (a threshold value) on one tserver and, 
at the same time, the queue for the consensus service is full, then if we 
create a new table we can see the number of new tablets rise without bound on 
the tserver.

I think it is related to the master's logic for creating tablets, especially 
the replacement operation.

 

  was:
Once there are more than 2000 tablets (a threshold value) on one tserver and, 
at the same time, the queue for the consensus service is full, then if we 
create a new table we can see the number of new tablets rise without bound on 
the tserver.

I think it is related to the master's logic for creating tablets, especially 
the replacement operation.

 


> kudu will create tablet infinitely while there are more than 2000 tables on 
> the tserver
> ---
>
> Key: KUDU-2453
> URL: https://issues.apache.org/jira/browse/KUDU-2453
> Project: Kudu
>  Issue Type: Bug
>  Components: master, tserver
>Affects Versions: 1.4.0, 1.7.1
>Reporter: HeLifu
>Priority: Major
>
> I have run into this problem again on 2018/10/26; the Kudu version is 1.7.2.
> Once there are more than 2000 tablets (a threshold value) on one tserver 
> and, at the same time, the queue for the consensus service is full, then if 
> we create a new table we can see the number of new tablets rise without 
> bound on the tserver.
> I think it is related to the master's logic for creating tablets, especially 
> the replacement operation.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2453) kudu will create tablet infinitely while there are more than 2000 tables on the tserver

2018-10-31 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2453:
-
Affects Version/s: 1.7.1

> kudu will create tablet infinitely while there are more than 2000 tables on 
> the tserver
> ---
>
> Key: KUDU-2453
> URL: https://issues.apache.org/jira/browse/KUDU-2453
> Project: Kudu
>  Issue Type: Bug
>  Components: master, tserver
>Affects Versions: 1.4.0, 1.7.1
>Reporter: HeLifu
>Priority: Major
>
> Once there are more than 2000 tablets (a threshold value) on one tserver 
> and, at the same time, the queue for the consensus service is full, then if 
> we create a new table we can see the number of new tablets rise without 
> bound on the tserver.
> I think it is related to the master's logic for creating tablets, especially 
> the replacement operation.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2471) ColumnSchema.equals NPE with non-Decimal columns

2018-10-29 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-2471:


Assignee: Grant Henke  (was: HeLifu)

> ColumnSchema.equals NPE with non-Decimal columns
> 
>
> Key: KUDU-2471
> URL: https://issues.apache.org/jira/browse/KUDU-2471
> Project: Kudu
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.1
>Reporter: Dan Burkert
>Assignee: Grant Henke
>Priority: Blocker
> Fix For: 1.8.0
>
>
> Reported by Chris George on slack. Missing a type/null check on this line: 
> https://github.com/apache/kudu/blob/branch-1.7.x/java/kudu-client/src/main/java/org/apache/kudu/ColumnSchema.java#L212.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KUDU-2471) ColumnSchema.equals NPE with non-Decimal columns

2018-10-29 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu reassigned KUDU-2471:


Assignee: HeLifu  (was: Grant Henke)

> ColumnSchema.equals NPE with non-Decimal columns
> 
>
> Key: KUDU-2471
> URL: https://issues.apache.org/jira/browse/KUDU-2471
> Project: Kudu
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.1
>Reporter: Dan Burkert
>Assignee: HeLifu
>Priority: Blocker
> Fix For: 1.8.0
>
>
> Reported by Chris George on slack. Missing a type/null check on this line: 
> https://github.com/apache/kudu/blob/branch-1.7.x/java/kudu-client/src/main/java/org/apache/kudu/ColumnSchema.java#L212.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2483) Scan tablets with bloom filter

2018-09-12 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612946#comment-16612946
 ] 

HeLifu commented on KUDU-2483:
--

I talked with ZhangYao on WeChat. He told me that you are going to expose the 
bloom filter of the util module in kudu. Is that true?
As another option, would it be possible to introduce a new bloom filter for 
scan operations? For example, Impala's bloom filter, which has already been 
optimized. In that case the modification on the Impala side would be friendly 
and limited, and for Spark it would be new either way. ;)
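
For reference, the idea under discussion in code form: a minimal, 
self-contained Bloom filter sketch. The class and hashing scheme here are 
illustrative assumptions, neither Kudu's util BloomFilter nor Impala's 
block-based one:

{code:cpp}
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

class SimpleBloomFilter {
 public:
  SimpleBloomFilter(size_t num_bits, int num_hashes)
      : bits_(num_bits, false), num_hashes_(num_hashes) {}

  // Called once per join key of the small table.
  void Insert(const std::string& key) {
    for (int i = 0; i < num_hashes_; ++i) bits_[Index(key, i)] = true;
  }

  // May return true for a key that was never inserted (false positive) but
  // never returns false for an inserted key, so it is safe as a scan filter:
  // skipping rows where MayContain() is false can never lose join matches.
  bool MayContain(const std::string& key) const {
    for (int i = 0; i < num_hashes_; ++i) {
      if (!bits_[Index(key, i)]) return false;
    }
    return true;
  }

 private:
  // Double hashing (h1 + i*h2) simulates num_hashes_ independent hashes.
  size_t Index(const std::string& key, int i) const {
    const size_t h1 = std::hash<std::string>{}(key);
    const size_t h2 = h1 * 0x9e3779b97f4a7c15ULL + 1;
    return (h1 + static_cast<size_t>(i) * h2) % bits_.size();
  }

  std::vector<bool> bits_;
  int num_hashes_;
};
{code}

A scanner that evaluates MayContain(join_key) per row can drop non-matching 
rows at the source, which is where the bandwidth savings in the statistics 
quoted below come from; false positives only cost a few extra rows, never 
correctness.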


> Scan tablets with bloom filter
> --
>
> Key: KUDU-2483
> URL: https://issues.apache.org/jira/browse/KUDU-2483
> Project: Kudu
>  Issue Type: New Feature
>  Components: client
>Reporter: jin xing
>Priority: Major
> Attachments: KUDU-2483, image-2018-07-01-23-29-05-517.png
>
>
> Join is really common in Spark SQL; in this JIRA I take broadcast join as an 
> example and describe how Kudu's bloom filter can help accelerate distributed 
> computing.
> Spark runs a broadcast join with the steps below:
>  1. In a broadcast join we have a small table and a big table; Spark reads 
> all data from the small table into one worker and builds a hash table;
>  2. The hash table from step 1 is broadcast to all the workers, which read 
> the splits of the big table;
>  3. Workers fetch and iterate over all the splits of the big table and check 
> whether each join key exists in the hash table; only matching join keys are 
> retained.
> Step 3 is the heaviest, especially when the worker and the split storage are 
> not on the same host and bandwidth is limited, and its cost is not always 
> necessary. Consider the scenario below:
> {code:none}
> Small table A
> id      name
> 1      Jin
> 6      Xing
> Big table B
> id     age
> 1      10
> 2      21
> 3      33
> 4      65
> 5      32
> 6      23
> 7      18
> 8      20
> 9      22
> {code}
> Run the query: *select * from A inner join B on A.id=B.id*
> It's pretty clear that we don't need to fetch all the data from table B, 
> because the number of matching keys is really small.
> I propose to use the small table to build a bloom filter (BF) and use the 
> generated BF as a predicate/filter when fetching data from the big table; 
> thus:
>  1. Much traffic/bandwidth is saved.
>  2. There is less data for the workers to process.
> Broadcast join is just an example; other types of join will also benefit if 
> we scan with a BF.
> In a nutshell, I think Kudu can provide an interface by which users can scan 
> data with bloom filters.
>  
> Here I want to add some statistics for the Spark-Kudu integration with and 
> without a BloomFilter.
> In our production environment the bandwidth of each executor is 50 Mbps.
> We do an inner join of two tables, one large and the other comparatively 
> small.
> In Spark, an inner join can be implemented as SortMergeJoin or 
> BroadcastHashJoin; we implemented the corresponding BloomFilter operators as 
> SortMergeBloomFilterJoin and BroadcastBloomFilterJoin.
> The hash table of the BloomFilter is configured as 32M.
> The statistics are as below:
> ||Records of Table A||Records of Table B||Join Operator||Executor Time||
> |400 thousand|14 billion|SortMergeJoin|760s|
> |400 thousand|14 billion|BroadcastHashJoin|376s|
> |400 thousand|14 billion|BroadcastBloomFilterJoin|21s|
> |2 million|14 billion|SortMergeJoin|707s|
> |2 million|14 billion|BroadcastHashJoin|329s|
> |2 million|14 billion|SortMergeBloomFilterJoin|75s|
> |2 million|14 billion|BroadcastBloomFilterJoin|35s|
> As we can see, the joins benefit a lot from BloomFilter pushdown.
> I want to use this JIRA as an umbrella; my workmates will submit the 
> following sub-tasks/PRs.
> It would be great if someone could take a closer look at this and share some 
> comments.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2566) Enhance rowset tree pruning and discard string copy while querying

2018-09-10 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2566:
-
Description: 
1. Support open-ended intervals:
In the current Kudu source code, we only cull rowsets when lower_bound_key 
and exclusive_upper_bound_key both exist. If not, we grab all rowsets of the 
tablet and then have to seek the key in 
‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary rowsets, 
which wastes disk I/O.
After modification, we can cull rowsets whether or not lower_bound_key or 
exclusive_upper_bound_key exists.

2. The upper bound key is exclusive, but the RowSetTree function takes an 
inclusive interval, so we might end up fetching one more rowset than 
necessary.
After modification, only exactly the necessary rowsets are fetched (see the 
sketch after this description).

3. Perf improvement: use raw slices instead of copying to strings while 
querying.
After modification, the copy from slices to strings is eliminated.
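
A minimal sketch of the interval test that items 1 and 2 argue for, assuming 
byte-comparable encoded keys and a rowset exposing inclusive 
[min_key, max_key] bounds; illustrative only, not the actual RowSetTree code:

{code:cpp}
#include <optional>
#include <string>

struct KeyRange {       // a rowset's key span, both ends inclusive
  std::string min_key;
  std::string max_key;
};

// Both scan bounds are optional, so open-ended scans still cull rowsets,
// and the upper bound is treated as exclusive, so a rowset whose min_key
// equals the scan's upper bound is no longer fetched.
bool RowSetMayOverlapScan(const KeyRange& rs,
                          const std::optional<std::string>& lower,        // inclusive
                          const std::optional<std::string>& upper_excl) { // exclusive
  if (lower && rs.max_key < *lower) return false;            // entirely below the scan
  if (upper_excl && rs.min_key >= *upper_excl) return false; // entirely at/above it
  return true;                                               // may overlap: keep it
}
{code}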



  was:
1. Support open-ended intervals:
  In the current Kudu source code, we only cull rowsets when lower_bound_key 
and exclusive_upper_bound_key both exist. If not, we grab all rowsets of the 
tablet and then have to seek the key in 
‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary rowsets, 
which wastes disk I/O.
After modification, we can cull rowsets whether or not lower_bound_key or 
exclusive_upper_bound_key exists.
2. The upper bound key is exclusive, but the RowSetTree function takes an 
inclusive interval, so we might end up fetching one more rowset than 
necessary.
After modification, only exactly the necessary rowsets are fetched.
3. Perf improvement: use raw slices instead of copying to strings while 
querying.
After modification, the copy from slices to strings is eliminated.
4. Simplify the logic of CaptureConsistentIterators in tablet.cc.
After modification, the function logic is simpler.



> Enhance rowset tree pruning and discard string copy while querying
> --
>
> Key: KUDU-2566
> URL: https://issues.apache.org/jira/browse/KUDU-2566
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet, util
>Affects Versions: 1.0.0
>Reporter: HeLifu
>Priority: Major
>
> 1. Support open-ended intervals:
> In the current Kudu source code, we only cull rowsets when lower_bound_key 
> and exclusive_upper_bound_key both exist. If not, we grab all rowsets of the 
> tablet and then have to seek the key in 
> ‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary 
> rowsets, which wastes disk I/O.
> After modification, we can cull rowsets whether or not lower_bound_key or 
> exclusive_upper_bound_key exists.
> 2. The upper bound key is exclusive, but the RowSetTree function takes an 
> inclusive interval, so we might end up fetching one more rowset than 
> necessary.
> After modification, only exactly the necessary rowsets are fetched.
> 3. Perf improvement: use raw slices instead of copying to strings while 
> querying.
> After modification, the copy from slices to strings is eliminated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2566) Enhance rowset tree pruning and discard string copy while querying

2018-09-06 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2566:
-
Description: 
1.Support open-ended intervals:
  In our kudu source code, we just cull row-sets when lower_bound_key and 
exclusive_upper_bound_key are existing at the same time. And if not, we will 
grab all row-sets of the tablet, then have to seek the key in 
‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
(which will waste disk io).
After modification, we could cull rowsets whether lower_bound_key or 
exclusive_upper_bound_key exists or not;
2.The upper bound key is exclusive, but the RowSetTree function takes an 
inclusive interval. So, we might end up fetching one more rowset than necessary.
After modification, the upper bound will only fetch the exactly rowsets;
3.Perf improvement: using raw slices instead of copying to strings while 
querying.
After modification, the copying from slices to string is discarded.
4.Simplify the logic of CaptureConsistentIterators in tablet.cc.
After modification,  the function logic will be simpler.


  was:
1. Support open-ended intervals:
  In the current Kudu source code, we only cull rowsets when lower_bound_key 
and exclusive_upper_bound_key both exist. If not, we grab all rowsets of the 
tablet and then have to seek the key in 
‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary rowsets, 
which wastes disk I/O.
*After modification*, we can cull rowsets whether or not lower_bound_key or 
exclusive_upper_bound_key exists.
2. The upper bound key is exclusive, but the RowSetTree function takes an 
inclusive interval, so we might end up fetching one more rowset than 
necessary.
*After modification*, only exactly the necessary rowsets are fetched.
3. Perf improvement: use raw slices instead of copying to strings while 
querying.
*After modification*, the copy from slices to strings is eliminated.
4. Simplify the logic of CaptureConsistentIterators in tablet.cc.
*After modification*, the function logic is simpler.



> Enhance rowset tree pruning and discard string copy while querying
> --
>
> Key: KUDU-2566
> URL: https://issues.apache.org/jira/browse/KUDU-2566
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet, util
>Affects Versions: 1.0.0
>Reporter: HeLifu
>Priority: Major
>
> 1. Support open-ended intervals:
>   In the current Kudu source code, we cull row-sets only when both 
> lower_bound_key and exclusive_upper_bound_key are present. If either is 
> missing, we grab all row-sets of the tablet and then have to seek the key in 
> ‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
> (which wastes disk I/O).
> After modification, we can cull rowsets whether or not lower_bound_key or 
> exclusive_upper_bound_key exists;
> 2. The upper bound key is exclusive, but the RowSetTree function takes an 
> inclusive interval, so we might end up fetching one more rowset than 
> necessary.
> After modification, the query fetches exactly the rowsets that overlap the 
> bounds;
> 3. Perf improvement: use raw slices instead of copying to strings while 
> querying.
> After modification, the copy from slice to string is eliminated.
> 4. Simplify the logic of CaptureConsistentIterators in tablet.cc.
> After modification, the function logic is simpler.
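
As a rough illustration of item 3 above: the sketch below contrasts comparing 
keys through a borrowed view with forcing a heap copy on every comparison. 
std::string_view stands in for Kudu's Slice here, and both function names are 
made up for the example.

{code:cpp}
#include <cstddef>
#include <string>
#include <string_view>

// Comparing through a view: no allocation, no copy. The view is just a
// pointer plus a length borrowed from data that already lives elsewhere,
// which is the role a raw slice plays in item 3 above.
bool KeyLessView(std::string_view a, std::string_view b) {
  return a < b;  // lexicographic compare over the borrowed bytes
}

// Comparing by materializing std::string copies first: every call pays two
// allocations and two memcpys before the comparison even starts. This is
// the kind of per-query overhead the change removes.
bool KeyLessCopy(const char* a_data, std::size_t a_len,
                 const char* b_data, std::size_t b_len) {
  std::string a(a_data, a_len);  // copy #1
  std::string b(b_data, b_len);  // copy #2
  return a < b;
}
{code}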



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2566) Enhance rowset tree pruning and discard string copy while querying

2018-09-06 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2566:
-
Summary: Enhance rowset tree pruning and discard string copy while querying 
 (was: Enhance rowset tree pruning and discard string copying while querying)

> Enhance rowset tree pruning and discard string copy while querying
> --
>
> Key: KUDU-2566
> URL: https://issues.apache.org/jira/browse/KUDU-2566
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet, util
>Affects Versions: 1.0.0
>Reporter: HeLifu
>Priority: Major
>
> 1. Support open-ended intervals:
>   In the current Kudu source code, we cull row-sets only when both 
> lower_bound_key and exclusive_upper_bound_key are present. If either is 
> missing, we grab all row-sets of the tablet and then have to seek the key in 
> ‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
> (which wastes disk I/O).
> *After modification*, we can cull rowsets whether or not lower_bound_key or 
> exclusive_upper_bound_key exists;
> 2. The upper bound key is exclusive, but the RowSetTree function takes an 
> inclusive interval, so we might end up fetching one more rowset than 
> necessary.
> *After modification*, the query fetches exactly the rowsets that overlap the 
> bounds;
> 3. Perf improvement: use raw slices instead of copying to strings while 
> querying.
> *After modification*, the copy from slice to string is eliminated.
> 4. Simplify the logic of CaptureConsistentIterators in tablet.cc.
> *After modification*, the function logic is simpler.
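
Item 4 is easiest to see in code: once the pruning helper itself tolerates 
absent bounds, the caller no longer needs separate bounded and unbounded 
paths. The sketch below is hypothetical; RowSetHandle, RowwiseIterator, and 
CaptureIterators are placeholder names, not Kudu's real internals.

{code:cpp}
#include <memory>
#include <optional>
#include <string>
#include <vector>

struct RowwiseIterator {};  // placeholder for a real row iterator

// Placeholder rowset handle exposing a key range and an iterator factory.
struct RowSetHandle {
  std::string min_key;
  std::string max_key;
  std::unique_ptr<RowwiseIterator> NewIterator() const {
    return std::make_unique<RowwiseIterator>();
  }
};

// One code path for every combination of present/absent bounds: the old
// shape needed a "both bounds present" branch that pruned, plus a fallback
// branch that captured an iterator for every rowset in the tablet.
std::vector<std::unique_ptr<RowwiseIterator>> CaptureIterators(
    const std::vector<RowSetHandle>& rowsets,
    const std::optional<std::string>& lower,    // inclusive, optional
    const std::optional<std::string>& upper) {  // exclusive, optional
  std::vector<std::unique_ptr<RowwiseIterator>> iters;
  for (const RowSetHandle& rs : rowsets) {
    // Same culling rule as the earlier sketch, applied inline.
    if (lower && rs.max_key < *lower) continue;
    if (upper && rs.min_key >= *upper) continue;
    iters.push_back(rs.NewIterator());
  }
  return iters;
}
{code}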



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2566) Enhance rowset tree pruning and discard string copying while querying

2018-09-06 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2566:
-
Summary: Enhance rowset tree pruning and discard string copying while 
querying  (was: Enhance rowset tree pruning and fix a perf todo while querying)

> Enhance rowset tree pruning and discard string copying while querying
> -
>
> Key: KUDU-2566
> URL: https://issues.apache.org/jira/browse/KUDU-2566
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet, util
>Affects Versions: 1.0.0
>Reporter: HeLifu
>Priority: Major
>
> 1. Support open-ended intervals:
>   In the current Kudu source code, we cull row-sets only when both 
> lower_bound_key and exclusive_upper_bound_key are present. If either is 
> missing, we grab all row-sets of the tablet and then have to seek the key in 
> ‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
> (which wastes disk I/O).
> *After modification*, we can cull rowsets whether or not lower_bound_key or 
> exclusive_upper_bound_key exists;
> 2. The upper bound key is exclusive, but the RowSetTree function takes an 
> inclusive interval, so we might end up fetching one more rowset than 
> necessary.
> *After modification*, the query fetches exactly the rowsets that overlap the 
> bounds;
> 3. Perf improvement: use raw slices instead of copying to strings while 
> querying.
> *After modification*, the copy from slice to string is eliminated.
> 4. Simplify the logic of CaptureConsistentIterators in tablet.cc.
> *After modification*, the function logic is simpler.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2566) Enhance rowset tree pruning and fix a perf todo while querying

2018-09-06 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2566:
-
Description: 
1. Support open-ended intervals:
  In the current Kudu source code, we cull row-sets only when both 
lower_bound_key and exclusive_upper_bound_key are present. If either is 
missing, we grab all row-sets of the tablet and then have to seek the key in 
‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
(which wastes disk I/O).
*After modification*, we can cull rowsets whether or not lower_bound_key or 
exclusive_upper_bound_key exists;
2. The upper bound key is exclusive, but the RowSetTree function takes an 
inclusive interval, so we might end up fetching one more rowset than necessary.
*After modification*, the query fetches exactly the rowsets that overlap the 
bounds;
3. Perf improvement: use raw slices instead of copying to strings while 
querying.
*After modification*, the copy from slice to string is eliminated.
4. Simplify the logic of CaptureConsistentIterators in tablet.cc.
*After modification*, the function logic is simpler.


  was:
1. Support open-ended intervals:
  In the current Kudu source code, we cull row-sets only when both 
lower_bound_key and exclusive_upper_bound_key are present. If either is 
missing, we grab all row-sets of the tablet and then have to seek the key in 
‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
(which wastes disk I/O).
*After modification*, we can cull rowsets whether or not lower_bound_key or 
exclusive_upper_bound_key exists;
2. The upper bound key is exclusive, but the RowSetTree function takes an 
inclusive interval, so we might end up fetching one more rowset than necessary.
*After modification*, the query fetches exactly the rowsets that overlap the 
bounds;
3. Perf improvement: use raw slices instead of copying to strings while 
querying.
*After modification*, the copy from slice to string is eliminated.
4. Simplify the logic of CaptureConsistentIterators in tablet.cc.


> Enhance rowset tree pruning and fix a perf todo while querying
> --
>
> Key: KUDU-2566
> URL: https://issues.apache.org/jira/browse/KUDU-2566
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet, util
>Affects Versions: 1.0.0
>Reporter: HeLifu
>Priority: Major
>
> 1. Support open-ended intervals:
>   In the current Kudu source code, we cull row-sets only when both 
> lower_bound_key and exclusive_upper_bound_key are present. If either is 
> missing, we grab all row-sets of the tablet and then have to seek the key in 
> ‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
> (which wastes disk I/O).
> *After modification*, we can cull rowsets whether or not lower_bound_key or 
> exclusive_upper_bound_key exists;
> 2. The upper bound key is exclusive, but the RowSetTree function takes an 
> inclusive interval, so we might end up fetching one more rowset than 
> necessary.
> *After modification*, the query fetches exactly the rowsets that overlap the 
> bounds;
> 3. Perf improvement: use raw slices instead of copying to strings while 
> querying.
> *After modification*, the copy from slice to string is eliminated.
> 4. Simplify the logic of CaptureConsistentIterators in tablet.cc.
> *After modification*, the function logic is simpler.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2566) Enhance rowset tree pruning and fix a perf todo while querying

2018-09-06 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2566:
-
Description: 
1. Support open-ended intervals:
  In the current Kudu source code, we cull row-sets only when both 
lower_bound_key and exclusive_upper_bound_key are present. If either is 
missing, we grab all row-sets of the tablet and then have to seek the key in 
‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
(which wastes disk I/O).
*After modification*, we can cull rowsets whether or not lower_bound_key or 
exclusive_upper_bound_key exists;
2. The upper bound key is exclusive, but the RowSetTree function takes an 
inclusive interval, so we might end up fetching one more rowset than necessary.
*After modification*, the query fetches exactly the rowsets that overlap the 
bounds;
3. Perf improvement: use raw slices instead of copying to strings while 
querying.
*After modification*, the copy from slice to string is eliminated.
4. Simplify the logic of CaptureConsistentIterators in tablet.cc.

  was:
In the function 'Tablet::CaptureConsistentIterators' in tablet.cc, there are 
two TODOs:

1. TODO(todd): support open-ended intervals:

  In the current Kudu source code, we cull row-sets only when both 
lower_bound_key and exclusive_upper_bound_key are present. If either is 
missing, we grab all row-sets of the tablet and then have to seek the key in 
‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
(which wastes disk I/O).

2. TODO(todd): the upper bound key is exclusive, but the RowSetTree function 
takes an inclusive interval, so we might end up fetching one more rowset than 
necessary.


> Enhance rowset tree pruning and fix a perf todo while querying
> --
>
> Key: KUDU-2566
> URL: https://issues.apache.org/jira/browse/KUDU-2566
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet, util
>Affects Versions: 1.0.0
>Reporter: HeLifu
>Priority: Major
>
> 1. Support open-ended intervals:
>   In the current Kudu source code, we cull row-sets only when both 
> lower_bound_key and exclusive_upper_bound_key are present. If either is 
> missing, we grab all row-sets of the tablet and then have to seek the key in 
> ‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
> (which wastes disk I/O).
> *After modification*, we can cull rowsets whether or not lower_bound_key or 
> exclusive_upper_bound_key exists;
> 2. The upper bound key is exclusive, but the RowSetTree function takes an 
> inclusive interval, so we might end up fetching one more rowset than 
> necessary.
> *After modification*, the query fetches exactly the rowsets that overlap the 
> bounds;
> 3. Perf improvement: use raw slices instead of copying to strings while 
> querying.
> *After modification*, the copy from slice to string is eliminated.
> 4. Simplify the logic of CaptureConsistentIterators in tablet.cc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2566) Enhance rowset tree pruning and fix a perf todo

2018-09-06 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2566:
-
Summary: Enhance rowset tree pruning and fix a perf todo  (was: Enhance 
rowset tree pruning)

> Enhance rowset tree pruning and fix a perf todo
> ---
>
> Key: KUDU-2566
> URL: https://issues.apache.org/jira/browse/KUDU-2566
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet, util
>Affects Versions: 1.0.0
>Reporter: HeLifu
>Priority: Major
>
> In the function 'Tablet::CaptureConsistentIterators' in tablet.cc, there are 
> two TODOs:
> 1. TODO(todd): support open-ended intervals:
>   In the current Kudu source code, we cull row-sets only when both 
> lower_bound_key and exclusive_upper_bound_key are present. If either is 
> missing, we grab all row-sets of the tablet and then have to seek the key in 
> ‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
> (which wastes disk I/O).
> 2. TODO(todd): the upper bound key is exclusive, but the RowSetTree function 
> takes an inclusive interval, so we might end up fetching one more rowset 
> than necessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KUDU-2566) Enhance rowset tree pruning

2018-09-06 Thread HeLifu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeLifu updated KUDU-2566:
-
Summary: Enhance rowset tree pruning  (was: support open-ended intervals &  
end up fetching one  more rowset than necessary)

> Enhance rowset tree pruning
> ---
>
> Key: KUDU-2566
> URL: https://issues.apache.org/jira/browse/KUDU-2566
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet, util
>Affects Versions: 1.0.0
>Reporter: HeLifu
>Priority: Major
>
> In the function 'Tablet::CaptureConsistentIterators' in tablet.cc, there are 
> two TODOs:
> 1. TODO(todd): support open-ended intervals:
>   In the current Kudu source code, we cull row-sets only when both 
> lower_bound_key and exclusive_upper_bound_key are present. If either is 
> missing, we grab all row-sets of the tablet and then have to seek the key in 
> ‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
> (which wastes disk I/O).
> 2. TODO(todd): the upper bound key is exclusive, but the RowSetTree function 
> takes an inclusive interval, so we might end up fetching one more rowset 
> than necessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KUDU-2529) kudu CLI command supports list the tablets under a table and list the replicas of a tablet

2018-09-04 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594724#comment-16594724
 ] 

HeLifu edited comment on KUDU-2529 at 9/4/18 9:41 AM:
--

Thanks for [~granthenke]'s comments.

I added a new flag "_-tables=<table_name>_" to the "_kudu table 
list <master_addresses> -list_tablets_" command, which can list the tablets 
under a table.

[https://gerrit.cloudera.org/#/c/11360/]

 And the "_kudu cluster ksck <master_addresses> -tables=<table_name> 
-tablets=<tablet_id> -verbose_" command can indeed list the replicas of a 
tablet, but its output is a little complex.


was (Author: helifu):
Thanks for [~granthenke]'s comments.

I added a new flag "_-tables=<table_name>_" to the "_kudu table 
list <master_addresses> -list_tablets_" command, which can list the tablets 
under a table.

[https://gerrit.cloudera.org/#/c/11360/|https://gerrit.cloudera.org/#/c/11360/]

 And the "_kudu cluster ksck <master_addresses> -tables=<table_name> 
-tablets=<tablet_id> -verbose_" command can indeed list the replicas of a 
tablet, but its output is a little complex.

> kudu CLI command supports list the tablets under a table and list the 
> replicas of a tablet 
> ---
>
> Key: KUDU-2529
> URL: https://issues.apache.org/jira/browse/KUDU-2529
> Project: Kudu
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 1.7.1
>Reporter: HeLifu
>Priority: Major
>
> kudu CLI command supports listing the tablets under a table and listing the 
> replicas of a tablet.
> Example:
>  
> {code:java}
> kudu table tablet <master_addresses> <table_name>
> kudu tablet replica <master_addresses> <tablet_id>
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KUDU-2529) kudu CLI command supports list the tablets under a table and list the replicas of a tablet

2018-09-04 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594724#comment-16594724
 ] 

HeLifu edited comment on KUDU-2529 at 9/4/18 9:40 AM:
--

Thanks for [~granthenke]'s comments.

I added a new flag "_-tables=<table_name>_" to the "_kudu table 
list <master_addresses> -list_tablets_" command, which can list the tablets 
under a table.

[https://gerrit.cloudera.org/#/c/11360/|https://gerrit.cloudera.org/#/c/11360/]

 And the "_kudu cluster ksck <master_addresses> -tables=<table_name> 
-tablets=<tablet_id> -verbose_" command can indeed list the replicas of a 
tablet, but its output is a little complex.


was (Author: helifu):
Thanks for [~granthenke]'s comments.

I added a new flag "_-tables=<table_name>_" to the "_kudu table 
list <master_addresses> -list_tablets_" command, which can list the tablets 
under a table.

[link title|https://gerrit.cloudera.org/#/c/11360/]

And the "_kudu cluster ksck <master_addresses> -tables=<table_name> 
-tablets=<tablet_id> -verbose_" command can indeed list the replicas of a 
tablet, but its output is a little complex.

> kudu CLI command supports list the tablets under a table and list the 
> replicas of a tablet 
> ---
>
> Key: KUDU-2529
> URL: https://issues.apache.org/jira/browse/KUDU-2529
> Project: Kudu
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 1.7.1
>Reporter: HeLifu
>Priority: Major
>
> kudu CLI command supports listing the tablets under a table and listing the 
> replicas of a tablet.
> Example:
>  
> {code:java}
> kudu table tablet <master_addresses> <table_name>
> kudu tablet replica <master_addresses> <tablet_id>
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KUDU-2529) kudu CLI command supports list the tablets under a table and list the replicas of a tablet

2018-09-04 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594724#comment-16594724
 ] 

HeLifu edited comment on KUDU-2529 at 9/4/18 9:39 AM:
--

Thanks for [~granthenke]'s comments.

I added a new flag "_-tables=<table_name>_" to the "_kudu table 
list <master_addresses> -list_tablets_" command, which can list the tablets 
under a table.

[link title|https://gerrit.cloudera.org/#/c/11360/]

And the "_kudu cluster ksck <master_addresses> -tables=<table_name> 
-tablets=<tablet_id> -verbose_" command can indeed list the replicas of a 
tablet, but its output is a little complex.


was (Author: helifu):
Thanks for [~granthenke]'s comments.

I added a new flag "_-tables=<table_name>_" to the "_kudu table 
list <master_addresses> -list_tablets_" command, which can list the tablets 
under a table.

And the "_kudu cluster ksck <master_addresses> -tables=<table_name> 
-tablets=<tablet_id> -verbose_" command can indeed list the replicas of a 
tablet, but its output is a little complex.

> kudu CLI command supports list the tablets under a table and list the 
> replicas of a tablet 
> ---
>
> Key: KUDU-2529
> URL: https://issues.apache.org/jira/browse/KUDU-2529
> Project: Kudu
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 1.7.1
>Reporter: HeLifu
>Priority: Major
>
> kudu CLI command supports listing the tablets under a table and listing the 
> replicas of a tablet.
> Example:
>  
> {code:java}
> kudu table tablet <master_addresses> <table_name>
> kudu tablet replica <master_addresses> <tablet_id>
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2566) support open-ended intervals & end up fetching one more rowset than necessary

2018-09-03 Thread HeLifu (JIRA)
HeLifu created KUDU-2566:


 Summary: support open-ended intervals &  end up fetching one  more 
rowset than necessary
 Key: KUDU-2566
 URL: https://issues.apache.org/jira/browse/KUDU-2566
 Project: Kudu
  Issue Type: Improvement
  Components: tablet, util
Affects Versions: 1.0.0
Reporter: HeLifu


In the function 'Tablet::CaptureConsistentIterators' in tablet.cc, there are 
two TODOs:

1. TODO(todd): support open-ended intervals:

  In the current Kudu source code, we cull row-sets only when both 
lower_bound_key and exclusive_upper_bound_key are present. If either is 
missing, we grab all row-sets of the tablet and then have to seek the key in 
‘CFileSet::Iterator::PushdownRangeScanPredicate’ for the unnecessary row-sets 
(which wastes disk I/O).

2. TODO(todd): the upper bound key is exclusive, but the RowSetTree function 
takes an inclusive interval, so we might end up fetching one more rowset than 
necessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2529) kudu CLI command supports list the tablets under a table and list the replicas of a tablet

2018-08-28 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594724#comment-16594724
 ] 

HeLifu commented on KUDU-2529:
--

Thanks for [~granthenke]'s comments.

I added a new flag "_-tables=<table_name>_" to the "_kudu table 
list <master_addresses> -list_tablets_" command, which can list the tablets 
under a table.

And the "_kudu cluster ksck <master_addresses> -tables=<table_name> 
-tablets=<tablet_id> -verbose_" command can indeed list the replicas of a 
tablet, but its output is a little complex.

> kudu CLI command supports list the tablets under a table and list the 
> replicas of a tablet 
> ---
>
> Key: KUDU-2529
> URL: https://issues.apache.org/jira/browse/KUDU-2529
> Project: Kudu
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 1.7.1
>Reporter: HeLifu
>Priority: Major
>
> kudu CLI command supports listing the tablets under a table and listing the 
> replicas of a tablet.
> Example:
>  
> {code:java}
> kudu table tablet <master_addresses> <table_name>
> kudu tablet replica <master_addresses> <tablet_id>
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2527) Add Describe Table Tool

2018-08-07 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572576#comment-16572576
 ] 

HeLifu commented on KUDU-2527:
--

OK, I will give it a try. Here is the JIRA: 
[KUDU-2529|https://issues.apache.org/jira/browse/KUDU-2529]

 

> Add Describe Table Tool
> ---
>
> Key: KUDU-2527
> URL: https://issues.apache.org/jira/browse/KUDU-2527
> Project: Kudu
>  Issue Type: Improvement
>Reporter: Grant Henke
>Assignee: Grant Henke
>Priority: Major
>
> Add a tool to describe a table on the CLI with information similar to what 
> is shown in the table web UI. Perhaps include a verbosity flag or an option 
> to choose which "columns" of information to include. 
> Example: 
> {code}
> kudu table describe <master_addresses> <table_name> ...
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KUDU-2529) kudu CLI command supports list the tablets under a table and list the replicas of a tablet

2018-08-07 Thread HeLifu (JIRA)
HeLifu created KUDU-2529:


 Summary: kudu CLI command supports list the tablets under a table 
and list the replicas of a tablet 
 Key: KUDU-2529
 URL: https://issues.apache.org/jira/browse/KUDU-2529
 Project: Kudu
  Issue Type: Improvement
  Components: CLI
Affects Versions: 1.7.1
Reporter: HeLifu


kudu CLI command supports listing the tablets under a table and listing the 
replicas of a tablet.

Example:

 
{code:java}
kudu table tablet <master_addresses> <table_name>
kudu tablet replica <master_addresses> <tablet_id>
{code}
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2527) Add Describe Table Tool

2018-08-07 Thread HeLifu (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572531#comment-16572531
 ] 

HeLifu commented on KUDU-2527:
--

It would be better if the tool could also list the tablets under a table and 
list the replicas of a tablet ;)

> Add Describe Table Tool
> ---
>
> Key: KUDU-2527
> URL: https://issues.apache.org/jira/browse/KUDU-2527
> Project: Kudu
>  Issue Type: Improvement
>Reporter: Grant Henke
>Assignee: Grant Henke
>Priority: Major
>
> Add a tool to describe a table on the CLI with information similar to what 
> is shown in the table web UI. Perhaps include a verbosity flag or an option 
> to choose which "columns" of information to include. 
> Example: 
> {code}
> kudu table describe <master_addresses> <table_name> ...
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


  1   2   >