[jira] [Created] (KUDU-2312) Scan predicate application ordering is possibly buggy

2018-02-16 Thread Dan Burkert (JIRA)
Dan Burkert created KUDU-2312:
-

 Summary: Scan predicate application ordering is possibly buggy
 Key: KUDU-2312
 URL: https://issues.apache.org/jira/browse/KUDU-2312
 Project: Kudu
  Issue Type: Bug
  Components: tserver
Affects Versions: 1.6.0
Reporter: Dan Burkert
Assignee: Dan Burkert


It appears that the {{SelectivityComparator}} that is supposed to ensure that
predicates are applied in selectivity order does not follow the interface for 
STL comparators.  Additionally, we aren't properly using tie-breakers to ensure 
a stable evaluation order.
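
For illustration, a minimal sketch of a comparator that does satisfy the STL requirements (a strict weak ordering: comp(a, a) is false, and comp(a, b) and comp(b, a) are never both true) and that breaks ties deterministically. The Predicate struct, its fields, and the function names are hypothetical and are not taken from the Kudu source; the sketch also assumes selectivity is never NaN.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a scan predicate; not Kudu's actual type.
struct Predicate {
  int column_idx;      // assumed unique per predicate, used as the tie-breaker
  double selectivity;  // estimated fraction of rows that pass (lower = more selective)
};

// Strict weak ordering: most selective first, ties broken by column index.
struct SelectivityLess {
  bool operator()(const Predicate& a, const Predicate& b) const {
    return std::tie(a.selectivity, a.column_idx) <
           std::tie(b.selectivity, b.column_idx);
  }
};

// Returns the predicates in a deterministic evaluation order.
std::vector<Predicate> OrderPredicates(std::vector<Predicate> preds) {
  std::sort(preds.begin(), preds.end(), SelectivityLess());
  return preds;
}
```

Comparing tuples lexicographically is one simple way to get both properties at once: the comparison is irreflexive by construction, and two predicates with equal selectivity always sort in the same relative order.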



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KUDU-2304) Add TABLET_DATA_DELETED option to kudu remote_replica delete

2018-02-16 Thread Mike Percy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy resolved KUDU-2304.
--
   Resolution: Won't Fix
Fix Version/s: n/a

Closing this in favor of KUDU-2311 which is less awkward to implement but also 
addresses the use case I had in mind for this change to the tool.

> Add TABLET_DATA_DELETED option to kudu remote_replica delete
> 
>
> Key: KUDU-2304
> URL: https://issues.apache.org/jira/browse/KUDU-2304
> Project: Kudu
>  Issue Type: Improvement
>  Components: ops-tooling
>Affects Versions: 1.6.0
>Reporter: Mike Percy
>Priority: Major
> Fix For: n/a
>
>
> The remote_replica delete tool should have the option to delete a replica in 
> "unsafe clean" mode in addition to the default of tombstoning, just like the 
> local_replica delete command. This is useful in certain recovery scenarios, 
> such as when the consensus metadata file is corrupted.





[jira] [Created] (KUDU-2311) kudu remote_replica copy -force_copy should overwrite unreadable cmeta

2018-02-16 Thread Mike Percy (JIRA)
Mike Percy created KUDU-2311:


 Summary: kudu remote_replica copy -force_copy should overwrite 
unreadable cmeta
 Key: KUDU-2311
 URL: https://issues.apache.org/jira/browse/KUDU-2311
 Project: Kudu
  Issue Type: Improvement
  Components: consensus, ops-tooling
Affects Versions: 1.6.0
Reporter: Mike Percy


If the consensus metadata file is corrupt, and the remote copy is attempted 
with the "force" flag, the destination should allow the copy. Currently it 
rejects the copy because it cannot open the cmeta file and compare the 
last_logged_opid from request to the one on disk.





[jira] [Created] (KUDU-2310) ksck shows an incorrect summary table when specifying the -tablet filter

2018-02-16 Thread Will Berkeley (JIRA)
Will Berkeley created KUDU-2310:
---

 Summary: ksck shows an incorrect summary table when specifying the 
-tablet filter
 Key: KUDU-2310
 URL: https://issues.apache.org/jira/browse/KUDU-2310
 Project: Kudu
  Issue Type: Bug
  Components: ksck
Affects Versions: 1.6.0
Reporter: Will Berkeley


Running ksck against a cluster with 3 tables each having three tablets while 
restricting ksck to just check one tablet shows the following wrong summary 
table:

{noformat}
$ build/latest/bin/kudu cluster ksck 127.0.0.1:9878,127.0.0.1:9877,127.0.0.1:9876 -tablets=2d14df764be34f948ef225cc49e5e37f
Connected to the Master
Fetched info from all 5 Tablet Servers
Table loadgen_auto_95d5198227bd4b93ba412cef94005397 is HEALTHY (0 tablet(s) checked)

Table loadgen_auto_c159d0d2aec140bda8b735a3603f50dd is HEALTHY (0 tablet(s) checked)

Table loadgen_auto_1cdce4a07ae546abb0d757d804d66e0e is HEALTHY (1 tablet(s) checked)

Table Summary
 Name                                          | Status  | Total Tablets | Healthy | Under-replicated | Unavailable
-----------------------------------------------+---------+---------------+---------+------------------+-------------
 loadgen_auto_1cdce4a07ae546abb0d757d804d66e0e | HEALTHY | 1             | 1       | 0                | 0
 loadgen_auto_95d5198227bd4b93ba412cef94005397 | HEALTHY | 0             | 0       | 0                | 0
 loadgen_auto_c159d0d2aec140bda8b735a3603f50dd | HEALTHY | 0             | 0       | 0                | 0
The metadata for 3 table(s) is HEALTHY
OK
{noformat}

This is pretty misleading, and a lot of worthless output besides.





[jira] [Created] (KUDU-2309) /masters can show the wrong list of masters

2018-02-16 Thread Will Berkeley (JIRA)
Will Berkeley created KUDU-2309:
---

 Summary: /masters can show the wrong list of masters
 Key: KUDU-2309
 URL: https://issues.apache.org/jira/browse/KUDU-2309
 Project: Kudu
  Issue Type: Bug
  Components: ops-tooling
Affects Versions: 1.6.0
Reporter: Will Berkeley


Consider the following steps:
 # Three masters are started with UUIDs A, B, and C.
 # A is shut down and its data deleted. A new master with UUID D is started on 
the same machine.

After this, visiting /masters on B or C should show A, B, and C as the 
registered masters, since they are the masters in B and C's quorum. D's 
/masters should just show D. However, right now we show B, C, D in all three 
/masters pages.





[jira] [Created] (KUDU-2308) Add a tool to dump master data in a readable format

2018-02-16 Thread Will Berkeley (JIRA)
Will Berkeley created KUDU-2308:
---

 Summary: Add a tool to dump master data in a readable format
 Key: KUDU-2308
 URL: https://issues.apache.org/jira/browse/KUDU-2308
 Project: Kudu
  Issue Type: Improvement
  Components: ops-tooling
Affects Versions: 1.6.0
Reporter: Will Berkeley


The kudu fs dump cfile command dumps a cfile. This can be used even against the 
master tablet, except that the useful master tablet columns are hard to read 
when dumped, being either an integer representing an enum or an encoded 
protobuf. We should build a specialized tool for dumping the master tablet from 
the local filesystem, or even remotely.





[jira] [Created] (KUDU-2307) ksck should differentiate between different kinds of "unavailable"

2018-02-16 Thread Mike Percy (JIRA)
Mike Percy created KUDU-2307:


 Summary: ksck should differentiate between different kinds of 
"unavailable"
 Key: KUDU-2307
 URL: https://issues.apache.org/jira/browse/KUDU-2307
 Project: Kudu
  Issue Type: Improvement
  Components: ops-tooling
Affects Versions: 1.6.0
Reporter: Mike Percy


ksck should differentiate between different kinds of UNAVAILABLE tablets:

 * OFFLINE (no replicas available)
 * READ-ONLY (only a minority of running replicas) -- is this a good name for 
this condition?
 * NO LEADER (majority of running replicas but no leader) -- possibly transient 
condition during leader election

Marking tables with tablets undergoing a leader election as UNAVAILABLE is 
misleading and not very helpful.
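
A rough sketch of how the three proposed states could be told apart from replica counts. The enum, the function name, and the inputs are hypothetical; this is not ksck's actual data model, and it deliberately ignores the existing under-replicated state.

```cpp
#include <string>

// Hypothetical categories following the proposal above; names are illustrative.
enum class TabletAvailability { kAvailable, kNoLeader, kReadOnly, kOffline };

// Classifies a tablet. 'total' is the configured number of replicas and
// 'running' is how many of them are currently running.
TabletAvailability Classify(int total, int running, bool has_leader) {
  if (running == 0) return TabletAvailability::kOffline;
  const int majority = total / 2 + 1;
  if (running < majority) return TabletAvailability::kReadOnly;
  if (!has_leader) return TabletAvailability::kNoLeader;
  return TabletAvailability::kAvailable;
}

// Renders the category using the labels from the proposal.
std::string ToString(TabletAvailability a) {
  switch (a) {
    case TabletAvailability::kOffline:  return "OFFLINE";
    case TabletAvailability::kReadOnly: return "READ-ONLY";
    case TabletAvailability::kNoLeader: return "NO LEADER";
    default:                            return "AVAILABLE";
  }
}
```

With three replicas, Classify(3, 2, false) yields NO LEADER, which a report could flag as possibly transient rather than as an outage.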





[jira] [Created] (KUDU-2306) ksck should also detect bad master replicas

2018-02-16 Thread Mike Percy (JIRA)
Mike Percy created KUDU-2306:


 Summary: ksck should also detect bad master replicas
 Key: KUDU-2306
 URL: https://issues.apache.org/jira/browse/KUDU-2306
 Project: Kudu
  Issue Type: Improvement
  Components: ops-tooling
Affects Versions: 1.6.0
Reporter: Mike Percy


ksck should detect bad master replicas, not just bad tserver tablet replicas.





[jira] [Updated] (KUDU-2153) Servers delete tmp files before obtaining directory lock

2018-02-16 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated KUDU-2153:
--
Priority: Critical  (was: Major)

> Servers delete tmp files before obtaining directory lock
> 
>
> Key: KUDU-2153
> URL: https://issues.apache.org/jira/browse/KUDU-2153
> Project: Kudu
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 1.2.0, 1.3.1, 1.4.0, 1.5.0
>Reporter: Todd Lipcon
>Priority: Critical
>
> In FsManager::Open() we currently call DeleteTmpFiles very early, before 
> starting the block manager. This means that, if you accidentally start a 
> tserver while another is running, it's possible for it to delete temporary 
> files that are in use by the running tserver, causing it to exhibit strange 
> behavior, crash, etc. (as in KUDU-2152).





[jira] [Commented] (KUDU-1954) Improve maintenance manager behavior in heavy write workload

2018-02-16 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367930#comment-16367930
 ] 

Todd Lipcon commented on KUDU-1954:
---

I think most of this has been improved in the last year:
bq. we don't schedule flushes until we are already in "backpressure" realm, so 
we spend most of our time doing backpressure
- KUDU-1949 changed this to start triggering flushes at 60%, and backpressure 
only starts at 80%.

bq. even if we configure N maintenance threads, we typically are only using 
~50% of those threads due to the scheduling granularity
40aa4c3c271c9df20a17a1d353ce582ee3fda742 (in 1.4.0) changed the MM to 
immediately schedule new work when a thread frees up.

bq. when we do hit the "memory-pressure flush" threshold, all threads quickly 
switch to flushing, which then brings us far beneath the threshold
bq. long running compactions can temporarily starve flushes
bq. high volume of writes can starve compactions
These three are not yet addressed, though various improvements to flush/compact 
performance make long-running ones less common.
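
To make the 60%/80% thresholds concrete, here is a small sketch. It assumes the percentages are fractions of a process memory limit, which the comment above does not spell out; the enum and function names are hypothetical and are not the maintenance manager's API.

```cpp
#include <cstdint>

// Hypothetical actions; not the maintenance manager's real types.
enum class MemoryAction { kNone, kFlush, kFlushAndBackpressure };

// Picks an action from current usage relative to the limit. Integer math is
// used to avoid rounding; assumes used_bytes * 100 fits in 64 bits.
MemoryAction ActionForUsage(int64_t used_bytes, int64_t limit_bytes) {
  if (used_bytes * 100 >= limit_bytes * 80) {
    return MemoryAction::kFlushAndBackpressure;  // at or above 80%
  }
  if (used_bytes * 100 >= limit_bytes * 60) {
    return MemoryAction::kFlush;  // between 60% and 80%
  }
  return MemoryAction::kNone;
}
```

The gap between the two thresholds is what gives flushes a chance to free memory before writers start being throttled.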

> Improve maintenance manager behavior in heavy write workload
> 
>
> Key: KUDU-1954
> URL: https://issues.apache.org/jira/browse/KUDU-1954
> Project: Kudu
>  Issue Type: Improvement
>  Components: perf, tserver
>Affects Versions: 1.3.0
>Reporter: Todd Lipcon
>Priority: Major
> Attachments: mm-trace.png
>
>
> During the investigation in [this 
> doc|https://docs.google.com/document/d/1U1IXS1XD2erZyq8_qG81A1gZaCeHcq2i0unea_eEf5c/edit]
>  I found a few maintenance-manager-related issues during heavy writes:
> - we don't schedule flushes until we are already in "backpressure" realm, so 
> we spend most of our time doing backpressure
> - even if we configure N maintenance threads, we typically are only using 
> ~50% of those threads due to the scheduling granularity
> - when we do hit the "memory-pressure flush" threshold, all threads quickly 
> switch to flushing, which then brings us far beneath the threshold
> - long running compactions can temporarily starve flushes
> - high volume of writes can starve compactions





[jira] [Updated] (KUDU-1900) Localhost connections to single-host clusters on Ubuntu don't skip TLS

2018-02-16 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated KUDU-1900:
--
Labels: newbie  (was: )

> Localhost connections to single-host clusters on Ubuntu don't skip TLS
> --
>
> Key: KUDU-1900
> URL: https://issues.apache.org/jira/browse/KUDU-1900
> Project: Kudu
>  Issue Type: Bug
>  Components: perf, security
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: newbie
>
> On Ubuntu, it seems like we sometimes end up with connections from 127.0.1.1 
> to 127.0.0.1 when running a local cluster and connecting to it from the 
> same machine. This is because Ubuntu puts an entry with the host's external 
> hostname in /etc/hosts as 127.0.1.1, and the tablet server ends up 
> registering with that name. The code that detects loopback connections sees 
> the "127.0.0.1 -> 127.0.1.1" and decides it's not loopback.
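
One possible way to handle this case, sketched under the assumption that any IPv4 address in 127.0.0.0/8 may be treated as loopback. The function names are hypothetical, this is not the check Kudu uses, and it does not verify that the address actually belongs to the local host. It relies on the POSIX inet_pton() and ntohl() calls.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <string>

// Returns true if 'ip' is a dotted-quad IPv4 address in 127.0.0.0/8.
bool IsLoopbackV4(const std::string& ip) {
  in_addr addr;
  if (inet_pton(AF_INET, ip.c_str(), &addr) != 1) return false;
  // s_addr is in network byte order; convert before looking at the top octet.
  return (ntohl(addr.s_addr) >> 24) == 127;
}

// Treats a connection as local when both endpoints are loopback addresses,
// even if the two addresses differ (e.g. 127.0.0.1 and 127.0.1.1).
bool IsLoopbackConnection(const std::string& local, const std::string& remote) {
  return IsLoopbackV4(local) && IsLoopbackV4(remote);
}
```

Under this check the 127.0.0.1 / 127.0.1.1 pair described above counts as loopback, whereas an exact-match comparison of the two addresses does not.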





[jira] [Resolved] (KUDU-1584) follower memory throttling results in error log messages on the leader

2018-02-16 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved KUDU-1584.
---
   Resolution: Cannot Reproduce
Fix Version/s: n/a

This was fixed at some point in the last year as far as I can tell. Nowadays we 
see messages on the leader with the appropriate error message from the follower.

> follower memory throttling results in error log messages on the leader
> --
>
> Key: KUDU-1584
> URL: https://issues.apache.org/jira/browse/KUDU-1584
> Project: Kudu
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Priority: Major
> Fix For: n/a
>
>
> W0828 22:07:42.156687 49842 consensus_peers.cc:333] T 
> c3810e04cd5f4ce8aa8cef40bcf15e33 P f92dc14d005d45e08ab52cf8142ea5b1 -> Peer 
> 83e1da1e50ac4fbb9efa3310d58bb8ef (e1216.halxg.cloudera.com:7050): Couldn't 
> send request to peer 83e1da1e50ac4fbb9efa3310d58bb8ef for tablet 
> c3810e04cd5f4ce8aa8cef40bcf15e33. Status: Runtime error: (unknown error 
> code). Retrying in the next heartbeat period. Already tried 8 times.





[jira] [Created] (KUDU-2305) Local variables can overflow when serializing a 2GB message

2018-02-16 Thread Joe McDonnell (JIRA)
Joe McDonnell created KUDU-2305:
---

 Summary: Local variables can overflow when serializing a 2GB 
message
 Key: KUDU-2305
 URL: https://issues.apache.org/jira/browse/KUDU-2305
 Project: Kudu
  Issue Type: Bug
  Components: rpc
Affects Versions: 1.6.0
Reporter: Joe McDonnell
Assignee: Joe McDonnell


When rpc_max_message_size is set to its maximum of INT_MAX (2147483647), 
certain local variables in SerializeMessage can overflow as messages approach 
this size. Specifically, recorded_size, size_with_delim, and total_size are 
4-byte signed integers and could overflow when additional_size becomes large.

Since INT_MAX is the largest allowable value for rpc_max_message_size (a 4-byte 
signed integer), these variables will not overflow if changed to 4-byte 
unsigned integers. This would eliminate the potential problem for serialization.

A similar problem exists in the InboundTransfer::ReceiveBuffer() and similar 
codepaths. Changing those variables to unsigned integers should resolve the 
issue.

This does not impact existing systems, because the default value of 
rpc_max_message_size is 50MB.
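
The arithmetic can be checked with a small sketch. The helper names below are hypothetical and this is not the SerializeMessage code; it only shows that a sum of two sizes, each at most INT_MAX, can exceed a signed 32-bit range but always fits in an unsigned 32-bit integer (2 * 2147483647 = 4294967294, which is below 4294967295).

```cpp
#include <cstdint>
#include <limits>

// Returns true if a + b cannot be represented as a 32-bit signed integer.
// The sum is computed in 64 bits so the check itself cannot overflow.
bool OverflowsInt32(int32_t a, int32_t b) {
  const int64_t sum = static_cast<int64_t>(a) + b;
  return sum > std::numeric_limits<int32_t>::max() ||
         sum < std::numeric_limits<int32_t>::min();
}

// Adds two non-negative sizes using unsigned 32-bit arithmetic. Because each
// input is at most INT32_MAX, the result never wraps around.
uint32_t AddSizes(int32_t a, int32_t b) {
  return static_cast<uint32_t>(a) + static_cast<uint32_t>(b);
}
```

For example, a message of INT_MAX bytes plus a 4-byte prefix overflows the signed type, while AddSizes gives the exact total.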





[jira] [Updated] (KUDU-1589) ksck should retry its Ping() RPC to tservers

2018-02-16 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated KUDU-1589:
--
Component/s: supportability

> ksck should retry its Ping() RPC to tservers
> 
>
> Key: KUDU-1589
> URL: https://issues.apache.org/jira/browse/KUDU-1589
> Project: Kudu
>  Issue Type: Bug
>  Components: ksck, supportability
>Affects Versions: 0.10.0
>Reporter: Todd Lipcon
>Priority: Major
>
> I ran a ksck while a cluster was under a stress test and it determined that 
> one of the tablet servers was "unavailable" because it responded 
> SERVER_TOO_BUSY to the Ping() RPC. We should either retry the Ping() or make 
> sure that the Ping goes to the 'admin service' rpc queue instead of the 
> 'tablet service' RPC queue (which is more likely to be saturated)
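
A sketch of the first option (retrying), assuming a hypothetical PingStatus enum and a caller-supplied ping callback; none of these names come from the ksck code. Only the "too busy" outcome is retried, with an exponentially growing delay between attempts.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical outcome of a single Ping() attempt.
enum class PingStatus { kOk, kServerTooBusy, kError };

// Calls 'ping' up to 'max_attempts' times. Retries only on kServerTooBusy,
// doubling the delay after each failed attempt. Returns the last status seen,
// or kError if max_attempts is not positive.
PingStatus PingWithRetry(const std::function<PingStatus()>& ping,
                         int max_attempts,
                         std::chrono::milliseconds initial_delay) {
  PingStatus status = PingStatus::kError;
  std::chrono::milliseconds delay = initial_delay;
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    status = ping();
    if (status != PingStatus::kServerTooBusy) return status;
    if (attempt < max_attempts) {
      std::this_thread::sleep_for(delay);
      delay *= 2;
    }
  }
  return status;
}
```

A tablet server would then be reported as unavailable only if it is still too busy after the final attempt.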





[jira] [Updated] (KUDU-817) Add stack watchdog and latency metric around reads

2018-02-16 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated KUDU-817:
-
Component/s: supportability

> Add stack watchdog and latency metric around reads
> --
>
> Key: KUDU-817
> URL: https://issues.apache.org/jira/browse/KUDU-817
> Project: Kudu
>  Issue Type: Improvement
>  Components: cfile, metrics, supportability
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Minor
>
> This is pretty easy and would be helpful for debugging cases like KUDU-742.





[jira] [Commented] (KUDU-636) optimization: we spend a lot of time in alloc/free

2018-02-16 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367912#comment-16367912
 ] 

Todd Lipcon commented on KUDU-636:
--

Brain dump update on some of the points above:
1) Threadpool is now LIFO as of 1.7 which should help keep a smaller number of 
threads with "hot" caches
2) We've removed almost all of the per-tablet threads and they now use shared 
threadpools
3) we should still look into thread cache tuning. perhaps we could even have 
kudu automatically tune at runtime to allow large thread caches when the 
process memory is under its soft threshold and then tune it down as it reaches 
its high threshold. Then we wouldn't need users to concern themselves with this 
tuning parameter.
4) we're on PB3 now but haven't yet switched on arena support

> optimization: we spend a lot of time in alloc/free
> --
>
> Key: KUDU-636
> URL: https://issues.apache.org/jira/browse/KUDU-636
> Project: Kudu
>  Issue Type: Improvement
>  Components: perf
>Affects Versions: Public beta
>Reporter: Todd Lipcon
>Priority: Major
>
> Looking at a workload in the cluster, several of the top 10 lines of perf 
> report are tcmalloc-related. It seems like we don't do a good job of making 
> use of the per-thread free-lists, and we end up in a lot of contention on the 
> central free list. There are a few low-hanging fruit things we could do to 
> improve this for a likely perf boost.





[jira] [Updated] (KUDU-2295) nullptr dereference while scanning on already shutdown tablet replica

2018-02-16 Thread Alexey Serbin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin updated KUDU-2295:

Code Review:   (was: http://gerrit.cloudera.org:8080/9350)

> nullptr dereference while scanning on already shutdown tablet replica
> -
>
> Key: KUDU-2295
> URL: https://issues.apache.org/jira/browse/KUDU-2295
> Project: Kudu
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: 1.7.0
>Reporter: Alexey Serbin
>Assignee: Alexey Serbin
>Priority: Major
>
> While running the {{raft_consensus_stress-itest}}, I found that one of the tablet 
> servers crashed with the following stack trace:
> {noformat}
> *** Aborted at 1518480865 (unix time) try "date -d @1518480865" if you are using GNU date ***
> PC: @ 0x7f1e02025790 scoped_refptr<>::operator->()
> *** SIGSEGV (@0x160) received by PID 8782 (TID 0x7f1de3c7e700) from PID 352; stack trace: ***
>     @ 0x7f1dfdcfc330 (unknown) at ??:0
>     @ 0x7f1e02025790 scoped_refptr<>::operator->() at ??:0
>     @ 0x7f1e00ae62e7 kudu::tablet::Tablet::GetTabletAncientHistoryMark() at ??:0
>     @ 0x7f1e00ae627d kudu::tablet::Tablet::GetHistoryGcOpts() at ??:0
>     @ 0x7f1e02012c53 kudu::tserver::(anonymous namespace)::VerifyNotAncientHistory() at ??:0
>     @ 0x7f1e0201223b kudu::tserver::TabletServiceImpl::HandleScanAtSnapshot() at ??:0
>     @ 0x7f1e0200c6dd kudu::tserver::TabletServiceImpl::HandleNewScanRequest() at ??:0
>     @ 0x7f1e02009d33 kudu::tserver::TabletServiceImpl::Scan() at ??:0
>     @ 0x7f1dfc90de4d kudu::tserver::TabletServerServiceIf::TabletServerServiceIf()::$_5::operator()() at ??:0
>     @ 0x7f1dfc90dc92 std::_Function_handler<>::_M_invoke() at ??:0
>     @ 0x7f1dfba728ab std::function<>::operator()() at ??:0
>     @ 0x7f1dfba7216d kudu::rpc::GeneratedServiceIf::Handle() at ??:0
>     @ 0x7f1dfba74526 kudu::rpc::ServicePool::RunThread() at ??:0
>     @ 0x7f1dfba76ad9 boost::_mfi::mf0<>::operator()() at ??:0
>     @ 0x7f1dfba76a40 boost::_bi::list1<>::operator()<>() at ??:0
>     @ 0x7f1dfba769ea boost::_bi::bind_t<>::operator()() at ??:0
>     @ 0x7f1dfba767cd boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
>     @ 0x7f1dfba190f8 boost::function0<>::operator()() at ??:0
>     @ 0x7f1df9d1788d kudu::Thread::SuperviseThread() at ??:0
>     @ 0x7f1dfdcf4184 start_thread at ??:0
>     @ 0x7f1df6023ffd clone at ??:0
>     @ 0x0 (unknown)
> {noformat}





[jira] [Commented] (KUDU-1683) Kudu client support for pushing runtime min/max filters

2018-02-16 Thread Thomas Tauber-Marshall (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367768#comment-16367768
 ] 

Thomas Tauber-Marshall commented on KUDU-1683:
--

AFAIK, the proposal to apply predicates mid-scan hasn't been implemented (at 
least, Impala doesn't use this functionality currently), so you could maybe 
file another JIRA for that so that we don't lose the idea, but I don't think 
that's very important to Impala in the short term so I would say this JIRA can 
be closed.

> Kudu client support for pushing runtime min/max filters
> ---
>
> Key: KUDU-1683
> URL: https://issues.apache.org/jira/browse/KUDU-1683
> Project: Kudu
>  Issue Type: New Feature
>  Components: client, perf
>Affects Versions: 1.0.0
>Reporter: Matthew Jacobs
>Priority: Major
>  Labels: impala
>
> Impala would like to generate runtime min/max filters to be pushed to Kudu, 
> at least for scan tokens that haven't been opened yet.
> https://issues.cloudera.org/browse/IMPALA-4252





[jira] [Resolved] (KUDU-1334) Support pid_max > 16 bits in the mini cluster

2018-02-16 Thread Alexey Serbin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin resolved KUDU-1334.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

[~granthenke] nice find!  Yes, indeed: this was fixed in 
0bd8800b77a5de53b5db896df20abb83ff2e1779.

> Support pid_max > 16 bits in the mini cluster
> -
>
> Key: KUDU-1334
> URL: https://issues.apache.org/jira/browse/KUDU-1334
> Project: Kudu
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.5.0
>Reporter: Jean-Daniel Cryans
>Assignee: Alexey Serbin
>Priority: Major
> Fix For: 1.6.0
>
>
> Pretty much anybody running on newer machines/platforms will hit this while 
> running the unit tests:
> {noformat}
> I0216 11:27:23.617383 110702 external_mini_cluster.cc:582] Started
> /home/stack/apache-kudu-incubating-0.7.0/build/rc/bin/kudu-master as pid
> 110706
> F0216 11:27:23.617473 110702 external_mini_cluster.cc:258] Check failed: p
> <= MathLimits::kMax (110702 vs. 65535) Cannot run on systems with
> >16-bit pid
> *** Check failure stack trace: ***
> {noformat}
> Having this limitation was fine but now it's something everybody hits.
> The workaround is running this:
> {noformat}
> echo "32768" > /proc/sys/kernel/pid_max
> {noformat}





[jira] [Created] (KUDU-2304) Add TABLET_DATA_DELETED option to kudu remote_replica delete

2018-02-16 Thread Mike Percy (JIRA)
Mike Percy created KUDU-2304:


 Summary: Add TABLET_DATA_DELETED option to kudu remote_replica 
delete
 Key: KUDU-2304
 URL: https://issues.apache.org/jira/browse/KUDU-2304
 Project: Kudu
  Issue Type: Improvement
  Components: ops-tooling
Affects Versions: 1.6.0
Reporter: Mike Percy


The remote_replica delete tool should have the option to delete a replica in 
"unsafe clean" mode in addition to the default of tombstoning, just like the 
local_replica delete command. This is useful in certain recovery scenarios, 
such as when the consensus metadata file is corrupted.





[jira] [Updated] (KUDU-18) Handle "slow readers" not holding too much memory

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-18?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-18:

Target Version/s: 1.8.0  (was: 1.5.0)

> Handle "slow readers" not holding too much memory
> -
>
> Key: KUDU-18
> URL: https://issues.apache.org/jira/browse/KUDU-18
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet
>Affects Versions: M3
>Reporter: Todd Lipcon
>Priority: Major
>
> Currently if a scanner is held open, it can hold on to resources from the 
> tablet indefinitely. This includes memory resources like 
> MemRowSet/DeltaMemStore which can be quite large. We should figure out a way 
> to forcibly expire slow scanners or otherwise migrate them to the flushed 
> copies.





[jira] [Updated] (KUDU-90) Add a header checksum to our RPC protocol

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-90?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-90:

Target Version/s: 1.8.0  (was: 1.5.0)

> Add a header checksum to our RPC protocol
> -
>
> Key: KUDU-90
> URL: https://issues.apache.org/jira/browse/KUDU-90
> Project: Kudu
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: M4
>Reporter: Jean-Daniel Cryans
>Priority: Major
>  Labels: newbie
>
> See the context for this here: 
> http://gerrit.ent.cloudera.com:8080/?l=259#/c/1077/8/java/kudu-client/src/main/java/kudu/rpc/KuduRpc.java





[jira] [Updated] (KUDU-109) Keep a record checksum for each record in the WAL

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-109:
-
Target Version/s:   (was: Public beta)

> Keep a record checksum for each record in the WAL
> -
>
> Key: KUDU-109
> URL: https://issues.apache.org/jira/browse/KUDU-109
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet
>Affects Versions: M5
>Reporter: Mike Percy
>Assignee: Todd Lipcon
>Priority: Minor
>
> We don't currently keep a checksum of WAL records, only a length and the 
> serialized LogEntry protobuf. Add a checksum following each protobuf.
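
A sketch of what "length, payload, then checksum" framing could look like. Everything here is an assumption for illustration: the 4-byte little-endian length, the choice of CRC-32 (reflected polynomial 0xEDB88320), and all function names. It is not Kudu's log format, and the payload is an opaque string rather than a real LogEntry protobuf.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320); slow but self-contained.
uint32_t Crc32(const std::string& data) {
  uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char byte : data) {
    crc ^= byte;
    for (int i = 0; i < 8; ++i) {
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

// Appends 'v' to 'out' as 4 little-endian bytes.
void AppendFixed32(std::string* out, uint32_t v) {
  for (int i = 0; i < 4; ++i) out->push_back(static_cast<char>((v >> (8 * i)) & 0xFF));
}

// Reads 4 little-endian bytes starting at 'off'. Caller checks bounds.
uint32_t ReadFixed32(const std::string& in, size_t off) {
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i) {
    v |= static_cast<uint32_t>(static_cast<unsigned char>(in[off + i])) << (8 * i);
  }
  return v;
}

// Frame layout: [length][payload][crc32(payload)].
std::string EncodeRecord(const std::string& payload) {
  std::string frame;
  AppendFixed32(&frame, static_cast<uint32_t>(payload.size()));
  frame += payload;
  AppendFixed32(&frame, Crc32(payload));
  return frame;
}

// Returns false if the frame is malformed or the checksum does not match.
bool DecodeRecord(const std::string& frame, std::string* payload) {
  if (frame.size() < 8) return false;
  const uint32_t len = ReadFixed32(frame, 0);
  if (frame.size() - 8 != len) return false;
  const std::string body = frame.substr(4, len);
  if (ReadFixed32(frame, 4 + len) != Crc32(body)) return false;
  *payload = body;
  return true;
}
```

With the checksum following each payload, a reader can distinguish a corrupted record from a valid one instead of trusting the length field alone.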





[jira] [Commented] (KUDU-258) Create an integration test that performs writes with multiple consistency modes

2018-02-16 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367493#comment-16367493
 ] 

Grant Henke commented on KUDU-258:
--

Is this still "In Progress"?

> Create an integration test that performs writes with multiple consistency 
> modes
> ---
>
> Key: KUDU-258
> URL: https://issues.apache.org/jira/browse/KUDU-258
> Project: Kudu
>  Issue Type: Sub-task
>  Components: tserver
>Affects Versions: M3
>Reporter: David Alves
>Assignee: David Alves
>Priority: Major
>
> Right now we test consistency modes independently, but they will eventually 
> coexist and that can spawn trouble (e.g. KUDU-242). We should have an 
> integration test that runs writes on multiple consistency modes at the same 
> time.
> Plus we should have the YCSB run on multiple consistency modes at the same 
> time (need to revive/cleanup what I did for the HT paper)





[jira] [Updated] (KUDU-418) Dynamically update peer host-ports based on registration in master

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-418:
-
Target Version/s:   (was: 1.5.0)

> Dynamically update peer host-ports based on registration in master
> --
>
> Key: KUDU-418
> URL: https://issues.apache.org/jira/browse/KUDU-418
> Project: Kudu
>  Issue Type: Improvement
>  Components: consensus, master
>Affects Versions: M4
>Reporter: Todd Lipcon
>Priority: Major
>
> Currently we assume that, once initialized, a tserver never changes its IP or 
> port. That's a poor assumption (e.g. in environments like EC2, or even on-premises 
> when a network gets renumbered).
> We should introduce a "resolver" interface to map from uuid->hostport by 
> contacting the master (the master already maintains this registry locally). 
> Consensus needs to be able to use this interface when it has trouble 
> connecting to its cached peer hostport info.
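
A sketch of what such a "resolver" interface might look like. The class and method names are hypothetical, and the in-memory map merely stands in for a lookup against the master's registry; no real RPC is shown.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical address record for a peer.
struct HostPort {
  std::string host;
  uint16_t port;
};

// Maps a server UUID to its currently registered address.
class PeerResolver {
 public:
  virtual ~PeerResolver() = default;
  // Writes the current address for 'uuid' to 'out'; returns false if unknown.
  virtual bool Resolve(const std::string& uuid, HostPort* out) const = 0;
};

// In-memory stand-in for a resolver backed by the master's registry.
class MapResolver : public PeerResolver {
 public:
  // Records (or replaces) the address registered for 'uuid'.
  void Register(const std::string& uuid, const HostPort& hp) {
    registry_[uuid] = hp;
  }

  bool Resolve(const std::string& uuid, HostPort* out) const override {
    auto it = registry_.find(uuid);
    if (it == registry_.end()) return false;
    *out = it->second;
    return true;
  }

 private:
  std::map<std::string, HostPort> registry_;
};
```

Consensus code holding a cached HostPort could call Resolve() again after a connection failure and pick up a re-registered address without restarting.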





[jira] [Updated] (KUDU-423) Implement scalable and performant on-disk storage

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-423:
-
Target Version/s: 1.8.0  (was: 1.5.0)

> Implement scalable and performant on-disk storage
> -
>
> Key: KUDU-423
> URL: https://issues.apache.org/jira/browse/KUDU-423
> Project: Kudu
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: M4
>Reporter: Todd Lipcon
>Assignee: Adar Dembo
>Priority: Major
>  Labels: kudu-roadmap, scalability
>
> Our current on-disk storage has a number of issues:
> - only writes to one local disk (need to scale to at least 12)
> - exhausts file descriptors quickly (see KUDU-56)
> - causes a ton of seeks at startup (KUDU-374)
> - has big latency stalls on write due to ext4 journal-related chunky writeback
> - doesn't support features we'll eventually want to support such as tiered 
> storage





[jira] [Updated] (KUDU-430) Consistent Operations

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-430:
-
Target Version/s: 1.8.0  (was: 1.5.0)

> Consistent Operations
> -
>
> Key: KUDU-430
> URL: https://issues.apache.org/jira/browse/KUDU-430
> Project: Kudu
>  Issue Type: New Feature
>  Components: client, tablet, tserver
>Affects Versions: M4
>Reporter: Todd Lipcon
>Assignee: David Alves
>Priority: Major
>  Labels: kudu-roadmap
>
> This ticket tracks consistency/isolation work for 1.2.
> Scope Doc: 
> https://docs.google.com/document/d/1EaKlJyQdMBz6G-Xn5uktY-d_x0uRmjMCrDGP5rZ7AoI/edit#
> The sub-tasks that don't target 1.2 will likely be moved somewhere else, or 
> promoted to tasks once this ticket is done, but for now it's handy to have a 
> single view of all the remaining work





[jira] [Updated] (KUDU-437) Support tablet splits

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-437:
-
Target Version/s: 1.8.0  (was: 1.5.0)

> Support tablet splits
> -
>
> Key: KUDU-437
> URL: https://issues.apache.org/jira/browse/KUDU-437
> Project: Kudu
>  Issue Type: Task
>  Components: consensus, tablet
>Affects Versions: M4.5
>Reporter: Jean-Daniel Cryans
>Priority: Minor
>  Labels: kudu-roadmap
>
> Pre-splitting can get us pretty far but at some point we'll need to have 
> manual and automatic tablet splitting.





[jira] [Updated] (KUDU-515) Log-dump, and anything else that acts on logs must handle schema changes mid segment.

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-515:
-
Target Version/s:   (was: 1.5.0)

> Log-dump, and anything else that acts on logs must handle schema changes mid 
> segment.
> -
>
> Key: KUDU-515
> URL: https://issues.apache.org/jira/browse/KUDU-515
> Project: Kudu
>  Issue Type: Sub-task
>  Components: log, tablet
>Affects Versions: M4.5
>Reporter: Alex Feinberg
>Assignee: Todd Lipcon
>Priority: Minor
>
> Currently (see KUDU-508) we log the tablet's schema to the log segment's 
> header. However, a schema may change mid-flight. We need to be able to handle 
> this gracefully.





[jira] [Updated] (KUDU-543) With big rows we can easily get a scan response that is higher than our max allowed frame size

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-543:
-
Target Version/s: 1.8.0  (was: 1.5.0)

> With big rows we can easily get a scan response that is higher than our max 
> allowed frame size
> --
>
> Key: KUDU-543
> URL: https://issues.apache.org/jira/browse/KUDU-543
> Project: Kudu
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: M4.5
>Reporter: David Alves
>Priority: Major
>
> Our minimum scanning unit is a row block. If an (encoded) row block is bigger 
> than the max frame size we cannot send it through the wire. We need to be 
> able to split into smaller units so that we always send less than the frame 
> size.
> Steps to repro: create a bunch of 128 KB rows and try to scan them.
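
One way to picture the splitting described above is a greedy pass that closes a chunk whenever the next encoded row would push it past the frame limit. This is only a sketch under stated assumptions: `ChunkEnds`, `row_sizes`, and `max_frame_bytes` are hypothetical names, not Kudu APIs, and a single row larger than the limit still ends up alone in an oversized chunk.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: given the encoded size of each row, returns the
// exclusive end index of every chunk so that each chunk's total size stays
// at or below max_frame_bytes whenever the individual rows allow it.
std::vector<std::size_t> ChunkEnds(const std::vector<std::size_t>& row_sizes,
                                   std::size_t max_frame_bytes) {
  std::vector<std::size_t> ends;
  std::size_t chunk_start = 0;
  std::size_t chunk_bytes = 0;
  for (std::size_t i = 0; i < row_sizes.size(); ++i) {
    // Close the current chunk if it is non-empty and this row would not fit.
    if (i > chunk_start && chunk_bytes + row_sizes[i] > max_frame_bytes) {
      ends.push_back(i);
      chunk_start = i;
      chunk_bytes = 0;
    }
    chunk_bytes += row_sizes[i];
  }
  // Close the trailing chunk, if any rows remain in it.
  if (chunk_start < row_sizes.size()) {
    ends.push_back(row_sizes.size());
  }
  return ends;
}
```

A caller would send rows `[0, ends[0])`, then `[ends[0], ends[1])`, and so on, one response per chunk.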





[jira] [Updated] (KUDU-560) Consensus/WAL/Transactions Optimizations and tests

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-560:
-
Target Version/s: 1.8.0  (was: 1.5.0)

> Consensus/WAL/Transactions Optimizations and tests
> --
>
> Key: KUDU-560
> URL: https://issues.apache.org/jira/browse/KUDU-560
> Project: Kudu
>  Issue Type: Improvement
>  Components: consensus, log
>Affects Versions: M4.5
>Reporter: David Alves
>Priority: Major
>
> This is an umbrella jira for several optimizations and tests that we should 
> add in the near future.





[jira] [Updated] (KUDU-582) Send TS specific errors back to the client when the client is supposed to take specific actions, such as trying another replica

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-582:
-
Target Version/s: 1.7.0  (was: 1.5.0)

> Send TS specific errors back to the client when the client is supposed to 
> take specific actions, such as trying another replica
> ---
>
> Key: KUDU-582
> URL: https://issues.apache.org/jira/browse/KUDU-582
> Project: Kudu
>  Issue Type: Bug
>  Components: client, consensus, tserver
>Affects Versions: M4.5
>Reporter: David Alves
>Priority: Critical
>
> Right now we're sending umbrella statuses that the client is supposed to 
> interpret as a command to fail over to another replica. This misuses 
> statuses, and it's also a problem because we are likely sending (or will 
> likely send) the same statuses (illegal state and abort) in places where 
> we don't mean the client to fail over.
> This should be treated holistically in both clients and in the server 
> components.





[jira] [Commented] (KUDU-582) Send TS specific errors back to the client when the client is supposed to take specific actions, such as trying another replica

2018-02-16 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367480#comment-16367480
 ] 

Grant Henke commented on KUDU-582:
--

Should this still be open?

> Send TS specific errors back to the client when the client is supposed to 
> take specific actions, such as trying another replica
> ---
>
> Key: KUDU-582
> URL: https://issues.apache.org/jira/browse/KUDU-582
> Project: Kudu
>  Issue Type: Bug
>  Components: client, consensus, tserver
>Affects Versions: M4.5
>Reporter: David Alves
>Priority: Critical
>
> Right now we're sending umbrella statuses that the client is supposed to 
> interpret as a command to fail over to another replica. This misuses 
> statuses, and it's also a problem because we are likely sending (or will 
> likely send) the same statuses (illegal state and abort) in places where 
> we don't mean the client to fail over.
> This should be treated holistically in both clients and in the server 
> components.





[jira] [Updated] (KUDU-591) consensus: Return specific RPC error codes when ops fail due to incorrect role

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-591:
-
Target Version/s:   (was: 1.5.0)

> consensus: Return specific RPC error codes when ops fail due to incorrect role
> --
>
> Key: KUDU-591
> URL: https://issues.apache.org/jira/browse/KUDU-591
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus, rpc
>Affects Versions: M4.5
>Reporter: Mike Percy
>Priority: Major
>
> We currently return generic RPC error codes like IllegalState when you try to 
> write to a non-leader replica. We need to plumb specific error codes all the 
> way through the consensus system and return them to the client, or risk 
> having brittle or overly defensive error handling code on the client or 
> master side.
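
To make the contrast with a generic IllegalState concrete, here is a minimal sketch of what a specific error code buys the client. Every name in it (`ReplicaError`, `ClientAction`, `ActionFor`, and the individual enumerators) is hypothetical and chosen for illustration; it is not the Kudu RPC error enum.

```cpp
// Hypothetical, illustration-only error codes a replica might return.
enum class ReplicaError {
  kOk,
  kNotLeader,            // the op reached a non-leader replica
  kReplicaNotRunning,    // the replica exists but cannot serve requests yet
  kGenericIllegalState,  // anything else; carries no retry guidance
};

// Hypothetical actions a client could take in response.
enum class ClientAction { kDone, kTryAnotherReplica, kFail };

// With a specific code the client needs no guesswork: only codes that
// explicitly mean "wrong replica" trigger a failover.
inline ClientAction ActionFor(ReplicaError error) {
  switch (error) {
    case ReplicaError::kOk:
      return ClientAction::kDone;
    case ReplicaError::kNotLeader:
    case ReplicaError::kReplicaNotRunning:
      return ClientAction::kTryAnotherReplica;
    case ReplicaError::kGenericIllegalState:
      return ClientAction::kFail;
  }
  return ClientAction::kFail;  // unreachable for valid enumerators
}
```

The point of the design is that the generic code no longer doubles as a failover signal, so it can be used elsewhere without changing client behavior.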





[jira] [Updated] (KUDU-605) Implement block placement hints

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-605:
-
Target Version/s:   (was: 1.5.0)

> Implement block placement hints
> ---
>
> Key: KUDU-605
> URL: https://issues.apache.org/jira/browse/KUDU-605
> Project: Kudu
>  Issue Type: Sub-task
>  Components: fs
>Affects Versions: M5
>Reporter: Adar Dembo
>Assignee: Adar Dembo
>Priority: Trivial
>
> The block managers need an API that lets users choose where blocks are 
> placed. Different block managers support different levels of granularity 
> (e.g. locality in the log block manager can refer to disks or containers), 
> and so we need to find a way to abstractly describe this stuff.
> Some sample hints:
> # As close as possible to block X
> # As close as possible to 
> # On disk A
> # On disk A, B, or C
> # On any disk
> # Some combination of 1-2 and 3-5 (i.e. block locality + disk locality)
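
The sample hints above could be described abstractly with one small value type that combines block locality and disk locality. This is a sketch only: `PlacementHint`, its fields, and `DiskAllowed` are hypothetical names, and disks are identified by plain strings purely for illustration.

```cpp
#include <cstdint>
#include <set>
#include <string>

// Hypothetical hint type, not a Kudu API.
struct PlacementHint {
  // Block locality: "as close as possible to block X".
  bool has_near_block = false;
  std::uint64_t near_block_id = 0;
  // Disk locality: one disk ("on disk A"), several ("on disk A, B, or C"),
  // or empty, meaning "on any disk".
  std::set<std::string> allowed_disks;
};

// Returns true if the hint permits placing a block on the given disk.
inline bool DiskAllowed(const PlacementHint& hint, const std::string& disk) {
  return hint.allowed_disks.empty() || hint.allowed_disks.count(disk) > 0;
}
```

A block manager with coarser granularity could ignore the fields it cannot honor, which is one way to keep the hint purely advisory.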





[jira] [Commented] (KUDU-616) Mitigate tablet damage when disks are lost

2018-02-16 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367479#comment-16367479
 ] 

Grant Henke commented on KUDU-616:
--

[~andrew.wong] was this work handled in some other jiras?

> Mitigate tablet damage when disks are lost
> --
>
> Key: KUDU-616
> URL: https://issues.apache.org/jira/browse/KUDU-616
> Project: Kudu
>  Issue Type: Sub-task
>  Components: fs
>Affects Versions: M5
>Reporter: Adar Dembo
>Assignee: Andrew Wong
>Priority: Major
>
> Disk loss is an unfortunate fact of life, and Kudu should provide mechanisms 
> for mitigating disk loss.
> # Make it possible to isolate specific tablets to some subset of the 
> machine's disks, so that if one disk dies it doesn't take out all the tablets 
> with it. This is more complicated than it looks:
> ** We need a concrete way of describing disk groups. It can be per-node, or 
> abstract enough that it makes sense across the entire cluster, or perhaps we 
> aggregate information (e.g. ten machines have 5 disks and the other forty 
> machines have 6 disks).
> ** This mechanism needs to be used for both data blocks and other bits of 
> metadata (master blocks, superblocks, and other random files).
> ** Presumably it needs to be provided when a table is created (or a tablet is 
> split), and it needs to be persisted as part of tablet metadata. It might be 
> sufficient to express it in Kudu configuration (i.e. complex gflags) but 
> since it can be associated to tablet metadata, it's hard to see how this 
> would work.
> # When a disk fails, the server needs to handle it appropriately (mark it as 
> failed, put affected tablets in a failed state, etc.).





[jira] [Updated] (KUDU-636) optimization: we spend a lot of time in alloc/free

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-636:
-
Target Version/s:   (was: 1.5.0)

> optimization: we spend a lot of time in alloc/free
> --
>
> Key: KUDU-636
> URL: https://issues.apache.org/jira/browse/KUDU-636
> Project: Kudu
>  Issue Type: Improvement
>  Components: perf
>Affects Versions: Public beta
>Reporter: Todd Lipcon
>Priority: Major
>
> Looking at a workload in the cluster, several of the top 10 lines of perf 
> report are tcmalloc-related. It seems like we don't do a good job of making 
> use of the per-thread free-lists, and we end up in a lot of contention on the 
> central free list. There are a few low-hanging fruit things we could do to 
> improve this for a likely perf boost.





[jira] [Updated] (KUDU-623) Evaluate automated deploy+tests on EC2 or Cloudstack

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-623:
-
Target Version/s:   (was: Public beta)

> Evaluate automated deploy+tests on EC2 or Cloudstack
> 
>
> Key: KUDU-623
> URL: https://issues.apache.org/jira/browse/KUDU-623
> Project: Kudu
>  Issue Type: Task
>  Components: test
>Affects Versions: Public beta
>Reporter: Todd Lipcon
>Priority: Trivial
>
> Currently, we have a 7-node cluster internally mostly running ITBLL. This 
> cluster does double duty of testing master and occasionally testing in-flight 
> patches prior to commit. When we're in the latter mode, we don't always have 
> good coverage of our current master. We also have a 10-node cluster that 
> runs YCSB every 3 hours.
> We should invest some time in building an easy (i.e. anyone can do it by 
> following directions) deploy+test infrastructure based on Amazon or 
> CloudStack. We can probably use cheap instances, since we're mostly concerned 
> with correctness and not performance. Ideally, once we have such a thing, we 
> could have some policy like starting a new cluster every 2 days, and running 
> each cluster for 7 days. Once a week we could start a cluster which we will 
> let run for a month or something to get better longevity.
> The overall goal is to hit multiple points in the recency/longevity trade-off 
> while also keeping costs manageable.





[jira] [Commented] (KUDU-639) Leader doesn't overwrite demoted follower's log properly

2018-02-16 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367473#comment-16367473
 ] 

Grant Henke commented on KUDU-639:
--

Should we close this and open a Jira to track adding an integration test?

> Leader doesn't overwrite demoted follower's log properly
> 
>
> Key: KUDU-639
> URL: https://issues.apache.org/jira/browse/KUDU-639
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus
>Affects Versions: M4.5
>Reporter: David Alves
>Assignee: Todd Lipcon
>Priority: Minor
>
> We just ran into this situation in the YCSB cluster, which is apparently a 
> log divergence.
> We have nodes a, b, c (corresponding to nodes 
> 33c8fb1dc4434df0938ccc27ecfd58a1/a1219, 
> 4ed2e09f80e04d198edeb53e15b3539e/a1220, 
> ab8ed89f9041495a95b8d2b77591c9d7/a1215).
> Node a is leader for term 3, then times out
> Node b is elected leader for term 5 with votes from b, c
> When b is elected leader the log state is:
> State: All replicated op: 3.6546, Majority replicated op: 3.6533, Committed 
> index: 3.6533, Last appended: 3.6546, Current term: 5
> b never actually replicates anything and eventually loses leadership to node 
> a, again.
> When b loses leadership, its WAL is in the following state:
> State: All replicated op: 0.0, Majority replicated op: 3.6533, Committed 
> index: 3.6533, Last appended: 5.6547, Current term: 5
> That is, b appended a message in term 5 but never actually got to commit it.
> However, if we look at b's log we find a message in term 5 committed:
> 3.6546@99404  REPLICATE WRITE_OP
> COMMIT 3.6533
> 5.6547@99789  REPLICATE CHANGE_CONFIG_OP
> COMMIT 3.6535
> COMMIT 3.6536
> COMMIT 3.6537
> COMMIT 3.6538
> COMMIT 3.6534
> COMMIT 3.6541
> COMMIT 3.6540
> COMMIT 3.6543
> COMMIT 3.6542
> COMMIT 3.6545
> COMMIT 3.6546
> COMMIT 3.6544
> COMMIT 3.6539
> COMMIT 5.6547
> 3.6548@99430  REPLICATE WRITE_OP
> 6.6549@99795  REPLICATE CHANGE_CONFIG_OP
> And more problematically, that diverges from the other two nodes' logs:
> 3.6546@99404  REPLICATE WRITE_OP
> COMMIT 3.6533
> COMMIT 3.6536
> COMMIT 3.6537
> COMMIT 3.6535
> COMMIT 3.6539
> COMMIT 3.6538
> COMMIT 3.6534
> COMMIT 3.6541
> COMMIT 3.6540
> COMMIT 3.6543
> COMMIT 3.6542
> COMMIT 3.6544
> 3.6547@99429  REPLICATE WRITE_OP
> 3.6548@99430  REPLICATE WRITE_OP
> 6.6549@99795  REPLICATE CHANGE_CONFIG_OP
> 6.6550@99878  REPLICATE WRITE_OP
> 6.6551@99879  REPLICATE WRITE_OP
> 6.6552@99880  REPLICATE WRITE_OP
> COMMIT 3.6545
> COMMIT 3.6548
> COMMIT 3.6547
> COMMIT 3.6546
> COMMIT 6.6549





[jira] [Updated] (KUDU-646) Check the scan type before applying the selection vector

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-646:
-
Target Version/s:   (was: 1.5.0)

> Check the scan type before applying the selection vector
> 
>
> Key: KUDU-646
> URL: https://issues.apache.org/jira/browse/KUDU-646
> Project: Kudu
>  Issue Type: Improvement
>  Components: perf, tablet
>Affects Versions: M5
>Reporter: Andrew Wang
>Priority: Trivial
>
> As pointed out during code review of the MergeIterator, after merging, the 
> selection vector is all true (all rows selected).
> This means during a scan, if we are using the MergeIterator, we can skip 
> checking the resulting SelectionVector. This will save us some CPU.





[jira] [Updated] (KUDU-675) Collect perf stats in benchmarks

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-675:
-
Labels: beginner newbie starter  (was: )

> Collect perf stats in benchmarks
> 
>
> Key: KUDU-675
> URL: https://issues.apache.org/jira/browse/KUDU-675
> Project: Kudu
>  Issue Type: Improvement
>  Components: test
>Affects Versions: Public beta
>Reporter: Todd Lipcon
>Priority: Trivial
>  Labels: beginner, newbie, starter
>
> Several of our benchmarks vary over time with no code changes. It's not clear 
> why. We should get perf stats on the benchmark runs as the next level of 
> diagnosis.





[jira] [Updated] (KUDU-683) Clean up multi-master tech debt

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-683:
-
Target Version/s: 1.8.0  (was: 1.5.0)

> Clean up multi-master tech debt
> ---
>
> Key: KUDU-683
> URL: https://issues.apache.org/jira/browse/KUDU-683
> Project: Kudu
>  Issue Type: Bug
>  Components: client
>Affects Versions: M5
>Reporter: Adar Dembo
>Priority: Major
>
> Multi-master support in the C++ client has introduced a fair amount of 
> RPC-related tech debt. There's a lot of duplication in the handling of 
> timeouts, retries, and error conditions. The various callbacks are also 
> tricky to follow and error prone. Now that the code has settled and we 
> understand what's painful about it, we're in a better position to fix it.
> Here's a high-level design idea: there should only be one RPC class that's 
> responsible for RPC delivery end-to-end, including retries, leader master 
> discovery, etc. Within that class there should be a single callback that's 
> reused for every asynchronous function, and there should be a separate state 
> machine that tracks the ongoing status of the RPC. Per-RPC specialization 
> should be as minimal as possible, via templates on the PBs, callbacks, or, 
> worst case, subclassing.





[jira] [Updated] (KUDU-682) Add metric for number of tablets in each consensus state

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-682:
-
Target Version/s: 1.8.0  (was: Public beta)

> Add metric for number of tablets in each consensus state
> 
>
> Key: KUDU-682
> URL: https://issues.apache.org/jira/browse/KUDU-682
> Project: Kudu
>  Issue Type: Improvement
>  Components: metrics, ops-tooling
>Affects Versions: Public beta
>Reporter: Todd Lipcon
>Priority: Minor
>
> It would be useful to have a metric that counts the number of tablets in each 
> consensus state (e.g. leader/follower/candidate).





[jira] [Updated] (KUDU-675) Collect perf stats in benchmarks

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-675:
-
Target Version/s: 1.8.0  (was: 1.5.0)

> Collect perf stats in benchmarks
> 
>
> Key: KUDU-675
> URL: https://issues.apache.org/jira/browse/KUDU-675
> Project: Kudu
>  Issue Type: Improvement
>  Components: test
>Affects Versions: Public beta
>Reporter: Todd Lipcon
>Priority: Trivial
>
> Several of our benchmarks vary over time with no code changes. It's not clear 
> why. We should get perf stats on the benchmark runs as the next level of 
> diagnosis.





[jira] [Updated] (KUDU-709) Add ability to "quarantine" a tablet replica

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-709:
-
Target Version/s:   (was: 1.5.0)

> Add ability to "quarantine" a tablet replica
> 
>
> Key: KUDU-709
> URL: https://issues.apache.org/jira/browse/KUDU-709
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet, tserver
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Major
>
> Occasionally we have bugs where an "operation of death" causes a tserver to 
> crash. Often, this same operation will then cause it to crash again when you 
> restart, as it is faithfully replayed.
> We should add some tool which allows a tablet to be put into a "quarantine" 
> state, where we skip bootstrapping it, so that we can at least allow the 
> tserver to restart.





[jira] [Updated] (KUDU-740) Add MemorySanitizer (MSAN) build

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-740:
-
Target Version/s: 1.8.0  (was: 1.5.0)

> Add MemorySanitizer (MSAN) build
> 
>
> Key: KUDU-740
> URL: https://issues.apache.org/jira/browse/KUDU-740
> Project: Kudu
>  Issue Type: Improvement
>  Components: build, test
>Affects Versions: Public beta
>Reporter: Todd Lipcon
>Priority: Minor
>
> MemorySanitizer is yet another nice tool that ships with clang that detects 
> cases where we read uninitialized memory. We should add another build which 
> runs MSAN on all our unit tests.





[jira] [Updated] (KUDU-694) Re-visit C++ client scan retry logic

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-694:
-
Target Version/s: 1.8.0  (was: 1.5.0)

> Re-visit C++ client scan retry logic
> 
>
> Key: KUDU-694
> URL: https://issues.apache.org/jira/browse/KUDU-694
> Project: Kudu
>  Issue Type: Bug
>  Components: client
>Affects Versions: Private Beta
>Reporter: Andrew Wang
>Priority: Major
>
> There are a number of remaining issues with scanner robustness, even after 
> KUDU-597:
> * Once a node is marked as failed, it will not be used again in the call. 
> This is more of an issue with longer timeouts (since the node is more likely 
> to come back), or if the scan is LEADER_ONLY (since only one node being down 
> leads to unavailability).
> * In the LEADER_ONLY case, since we don't refresh quorum information within 
> the call, we won't recover when a failover happens.
> * The scanner code calls a number of other RPCs that are not retried on 
> error, i.e. LookupTabletByKey or RefreshProxy's DNS resolution in 
> GetTabletServer.





[jira] [Updated] (KUDU-801) Delta flush doesn't wait for transactions to commit

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-801:
-
Target Version/s: 1.7.0  (was: 1.6.0)

> Delta flush doesn't wait for transactions to commit
> ---
>
> Key: KUDU-801
> URL: https://issues.apache.org/jira/browse/KUDU-801
> Project: Kudu
>  Issue Type: Bug
>  Components: tablet
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Assignee: Hao Hao
>Priority: Critical
>
> I saw a case of mt-tablet-test failing with what I think is the following 
> scenario:
> - transaction applies an update to DMS
> - delta flush happens
> - major delta compaction runs (the update is now part of base data and we 
> have an UNDO)
> - the RS is selected for compaction
> - CHECK failure because the UNDO delta contains something that is not yet 
> committed.
> We probably need to ensure that we don't flush data which isn't yet committed 
> from an MVCC standpoint.





[jira] [Updated] (KUDU-792) Investigate "optimization" where LMP mismatch is handled in two places

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-792:
-
Target Version/s:   (was: 1.5.0)

> Investigate "optimization" where LMP mismatch is handled in two places
> --
>
> Key: KUDU-792
> URL: https://issues.apache.org/jira/browse/KUDU-792
> Project: Kudu
>  Issue Type: Sub-task
>  Components: consensus
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Major
>
> We have a bit of code when handling the LMP mismatch that says:
> {code}
>   // If the terms mismatch we abort down to the index before the leader's
>   // preceding, since we know that is the last opid that has a chance of not
>   // being overwritten. Aborting preemptively here avoids us reporting a last
>   // received index that is possibly higher than the leader's causing an
>   // avoidable cache miss on the leader's queue.
>   if (term_mismatch) {
>     return state_->AbortOpsAfterUnlocked(req.preceding_opid->index() - 1);
>   }
> {code}
> The code _looks_ like it's somewhat redundant (since we also abort operations 
> elsewhere on the replica side), and it claims to be an optimization, but if 
> we remove it, a bunch of correctness tests fail. We should understand better 
> why this code is here and at least update the comment.





[jira] [Updated] (KUDU-820) Add metrics for diagnosing recent cluster issues

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-820:
-
Target Version/s: 1.8.0  (was: 1.5.0)

> Add metrics for diagnosing recent cluster issues
> 
>
> Key: KUDU-820
> URL: https://issues.apache.org/jira/browse/KUDU-820
> Project: Kudu
>  Issue Type: Improvement
>  Components: metrics, supportability
>Affects Versions: 1.2.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> We had a lot of difficulty trouble-shooting some recent issues on bolt80. We 
> need the following metrics added:
> - number of ops in the PREPARE queue
> - number of ops in the APPLY queue
> - number of milliseconds of spinlock contention (histogram?)
> - consensus error rate seen by leader
> - consensus RTT seen by leader





[jira] [Updated] (KUDU-817) Add stack watchdog and latency metric around reads

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-817:
-
Target Version/s:   (was: 1.5.0)

> Add stack watchdog and latency metric around reads
> --
>
> Key: KUDU-817
> URL: https://issues.apache.org/jira/browse/KUDU-817
> Project: Kudu
>  Issue Type: Improvement
>  Components: cfile, metrics
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Minor
>
> This is pretty easy and would be helpful for debugging cases like KUDU-742.





[jira] [Updated] (KUDU-830) Ensure thirdparty builds with fastest compiler for releases

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-830:
-
Target Version/s:   (was: 1.5.0)

> Ensure thirdparty builds with fastest compiler for releases
> ---
>
> Key: KUDU-830
> URL: https://issues.apache.org/jira/browse/KUDU-830
> Project: Kudu
>  Issue Type: Improvement
>  Components: build, perf
>Affects Versions: Public beta
>Reporter: Todd Lipcon
>Priority: Major
>
> On el6 slaves, it appears we currently build all of thirdparty with clang 3.3 
> from toolchain. This is a pretty old compiler which generates slower code 
> than gcc or newer clangs. It's not a big deal for most stuff, but there is 
> probably a measurable difference on compression codecs like zlib/lz4. We should make 
> sure we use the best compiler for these perf hotspots.





[jira] [Updated] (KUDU-844) tablet layout SVG no longer includes rowsets undergoing compaction

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-844:
-
Target Version/s:   (was: 1.5.0)

> tablet layout SVG no longer includes rowsets undergoing compaction
> --
>
> Key: KUDU-844
> URL: https://issues.apache.org/jira/browse/KUDU-844
> Project: Kudu
>  Issue Type: Bug
>  Components: ops-tooling
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Minor
>
> The tablet rowset layout svg no longer shows any rowsets which are currently 
> in the midst of compaction. That makes sense since it re-uses the compaction 
> selection code to generate the diagram. However, it's misleading.





[jira] [Updated] (KUDU-832) consider "sloppy" memcpy for better performance

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-832:
-
Target Version/s:   (was: 1.5.0)

> consider "sloppy" memcpy for better performance
> ---
>
> Key: KUDU-832
> URL: https://issues.apache.org/jira/browse/KUDU-832
> Project: Kudu
>  Issue Type: Improvement
>  Components: perf
>Affects Versions: 1.2.0
>Reporter: Todd Lipcon
>Priority: Major
> Attachments: test.cc
>
>
> On the read path, a very high percentage of our time is spent in memcpy. 
> Typically, we are copying data to and from large allocations (eg from a data 
> block into a column block, or from a data block into a rowblock's arena, or 
> from an MRS arena into either of the above). In pretty much all of these 
> cases, it would be easy to ensure that the source and destination both have 
> at least 8 bytes of "padding" past the last valid value, and then round all 
> of our memcpys up to the nearest 8 bytes (even if the amount to be copied is 
> much smaller). This enables a really tight and fast memcpy loop, which 
> microbenchmarks indicate could be 40-50% faster.
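
The rounding idea above can be sketched as follows. This is a minimal illustration, not the attached test.cc: `SloppyMemcpy` and `kSloppyPadding` are hypothetical names, and the function is only safe under the stated assumption that both buffers have at least 8 bytes of addressable padding past the last valid byte and do not overlap.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical constant: bytes of slack assumed to exist past the end of
// both the source and the destination buffers.
constexpr std::size_t kSloppyPadding = 8;

// Copies `n` bytes from `src` to `dst`, rounding the length up to the next
// multiple of 8. Up to 7 bytes beyond `n` are read and written, which is
// why the padding precondition matters.
inline void SloppyMemcpy(void* dst, const void* src, std::size_t n) {
  unsigned char* d = static_cast<unsigned char*>(dst);
  const unsigned char* s = static_cast<const unsigned char*>(src);
  const std::size_t rounded = (n + kSloppyPadding - 1) & ~(kSloppyPadding - 1);
  for (std::size_t off = 0; off < rounded; off += kSloppyPadding) {
    // Fixed-size memcpy calls typically compile down to a single 8-byte
    // load and store, with no branching on the remaining length.
    std::uint64_t word;
    std::memcpy(&word, s + off, sizeof(word));
    std::memcpy(d + off, &word, sizeof(word));
  }
}
```

Whether this beats the platform `memcpy` depends on the compiler and the copy sizes involved, so it would need to be measured rather than assumed.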





[jira] [Updated] (KUDU-883) Calculate and expose time-windowed histograms

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-883:
-
Target Version/s:   (was: 1.5.0)

> Calculate and expose time-windowed histograms
> -
>
> Key: KUDU-883
> URL: https://issues.apache.org/jira/browse/KUDU-883
> Project: Kudu
>  Issue Type: New Feature
>  Components: ops-tooling
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Major
>
> Many of our metrics are in the form of HdrHistograms. Until CM has support 
> for handling histograms, we should do some very simple lagging time window 
> support and expose percentiles as simple gauge metrics.
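
A very simple lagging window along these lines could look like the sketch below. It is an assumption-laden illustration: it stores raw samples rather than an HdrHistogram, `WindowedPercentiles` is a hypothetical class name, timestamps are assumed to be non-decreasing, and the percentile uses the nearest-rank definition.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical helper that keeps only samples from the last `window_ms`
// milliseconds and answers percentile queries over them.
class WindowedPercentiles {
 public:
  explicit WindowedPercentiles(std::int64_t window_ms) : window_ms_(window_ms) {}

  void Record(std::int64_t now_ms, std::int64_t value) {
    samples_.emplace_back(now_ms, value);
    Evict(now_ms);
  }

  // Returns the nearest-rank percentile `p` (0 < p <= 100) over samples in
  // (now_ms - window_ms, now_ms], or 0 if the window is empty.
  std::int64_t Percentile(std::int64_t now_ms, double p) {
    Evict(now_ms);
    if (samples_.empty()) return 0;
    std::vector<std::int64_t> values;
    values.reserve(samples_.size());
    for (const auto& sample : samples_) values.push_back(sample.second);
    std::size_t rank =
        static_cast<std::size_t>(std::ceil(p / 100.0 * values.size()));
    rank = std::max<std::size_t>(1, std::min(rank, values.size()));
    std::nth_element(values.begin(), values.begin() + (rank - 1), values.end());
    return values[rank - 1];
  }

 private:
  // Drops samples that have aged out of the window.
  void Evict(std::int64_t now_ms) {
    while (!samples_.empty() && samples_.front().first <= now_ms - window_ms_) {
      samples_.pop_front();
    }
  }

  std::int64_t window_ms_;
  std::deque<std::pair<std::int64_t, std::int64_t>> samples_;
};
```

Each exported gauge (say, a 99th percentile) would then just call `Percentile` at scrape time.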





[jira] [Updated] (KUDU-859) Actively remove data and deltas for deleted columns

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-859:
-
Target Version/s:   (was: 1.5.0)

> Actively remove data and deltas for deleted columns
> ---
>
> Key: KUDU-859
> URL: https://issues.apache.org/jira/browse/KUDU-859
> Project: Kudu
>  Issue Type: Improvement
>  Components: tablet
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Major
>
> Currently, we only remove data blocks for deleted columns when we run a 
> compaction. If a table is already fully compacted, this won't happen at all. 
> In most cases, it will happen very slowly.
> We should add some kind of in-the-background operation which removes the 
> CFiles corresponding to deleted columns, and modify the delta compaction code 
> paths to actually drop delta data for deleted columns.





[jira] [Assigned] (KUDU-886) Cluster load balancing

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KUDU-886:


Assignee: Will Berkeley

> Cluster load balancing
> --
>
> Key: KUDU-886
> URL: https://issues.apache.org/jira/browse/KUDU-886
> Project: Kudu
>  Issue Type: New Feature
>  Components: master
>Affects Versions: 1.2.0
>Reporter: Todd Lipcon
>Assignee: Will Berkeley
>Priority: Major
>
> We should add some load balancing support for GA:
> - move leaders to evenly spread RPC load.
> - eventually move tablets to even out disk space or load.





[jira] [Updated] (KUDU-972) cache should track memory overhead

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-972:
-
Target Version/s:   (was: 1.5.0)

> cache should track memory overhead
> --
>
> Key: KUDU-972
> URL: https://issues.apache.org/jira/browse/KUDU-972
> Project: Kudu
>  Issue Type: Bug
>  Components: cfile, util
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Major
>
> Currently the cache only accounts for the cache _values_ in the memtracker. 
> Each key seems to have 88 or so bytes of memory usage as well (potentially 
> rounded up due to allocation overhead, etc).
> For the DRAM cache, this is still a ~700:1 ratio assuming 64kb block sizes, 
> but for PMEM, where we expect hundreds of GBs of block cache, a 700:1 ratio 
> may turn out to be somewhat substantial.
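The arithmetic behind the ratio can be checked with a few lines. This is only a back-of-the-envelope sketch: the 88-byte per-key figure and the 64 KiB block size come from the description above, while the 200 GiB cache size is an assumed example of "hundreds of GBs".

```python
def key_overhead_bytes(cache_bytes, block_bytes=64 * 1024, per_key_bytes=88):
    """Estimate untracked per-key memory for a cache full of fixed-size blocks."""
    entries = cache_bytes // block_bytes
    return entries * per_key_bytes


# Value-to-key ratio for 64 KiB blocks and 88-byte keys (integer division).
ratio = (64 * 1024) // 88

# Assumed example: a 200 GiB PMEM block cache.
overhead = key_overhead_bytes(200 * 2**30)
overhead_mib = overhead / 2**20
```

With these inputs the ratio is about 744:1, and the untracked key overhead for the assumed 200 GiB cache is 275 MiB.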





[jira] [Updated] (KUDU-973) Add a way to ksck a bootstrapping tablet against the node it's catching up from

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-973:
-
Target Version/s:   (was: 1.5.0)

> Add a way to ksck a bootstrapping tablet against the node it's catching up from
> --
>
> Key: KUDU-973
> URL: https://issues.apache.org/jira/browse/KUDU-973
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus
>Affects Versions: Feature Complete
>Reporter: David Alves
>Priority: Major
>
> We should add a way (either automatically or from the command line) to check 
> a new tablet against the tablet it's getting its data from.





[jira] [Commented] (KUDU-975) Review Java API for alter schema

2018-02-16 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367465#comment-16367465
 ] 

Grant Henke commented on KUDU-975:
--

Is this complete [~tlipcon]? 

> Review Java API for alter schema
> 
>
> Key: KUDU-975
> URL: https://issues.apache.org/jira/browse/KUDU-975
> Project: Kudu
>  Issue Type: Improvement
>  Components: api, client
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Major
>
> We should review these APIs and make sure they are reasonable (and support 
> things like changing column encoding/compression).





[jira] [Updated] (KUDU-975) Review Java API for alter schema

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-975:
-
Target Version/s: 1.7.0  (was: 1.5.0)

> Review Java API for alter schema
> 
>
> Key: KUDU-975
> URL: https://issues.apache.org/jira/browse/KUDU-975
> Project: Kudu
>  Issue Type: Improvement
>  Components: api, client
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Major
>
> We should review these APIs and make sure they are reasonable (and support 
> things like changing column encoding/compression).





[jira] [Updated] (KUDU-997) Expose client-side metrics

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-997:
-
Target Version/s:   (was: 1.6.0)

> Expose client-side metrics
> --
>
> Key: KUDU-997
> URL: https://issues.apache.org/jira/browse/KUDU-997
> Project: Kudu
>  Issue Type: New Feature
>  Components: client
>Affects Versions: Feature Complete
>Reporter: Adar Dembo
>Priority: Major
> Attachments: patch
>
>
> I think client-side metrics have been a desirable feature for quite some 
> time, but I especially wanted them while debugging KUDU-993.
> There are some challenges in collecting metric data in a cohesive way across 
> the client (at least in C++, where there isn't a completely uniform way to 
> send/receive RPCs). But I think the main challenge is figuring out how to 
> expose it to users. I'm not sure we want to expose metrics.h directly, 
> because it's deeply intertwined with gutil and other Kudu util code.
> I'm attaching a patch I wrote yesterday to help with KUDU-993. It doesn't 
> tackle the API problem at all, but shows how to build a histogram tracking 
> all writes.





[jira] [Resolved] (KUDU-1018) Add method to get data from a single row

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke resolved KUDU-1018.
---
  Resolution: Duplicate
   Fix Version/s: n/a
Target Version/s:   (was: 1.5.0)

> Add method to get data from a single row
> 
>
> Key: KUDU-1018
> URL: https://issues.apache.org/jira/browse/KUDU-1018
> Project: Kudu
>  Issue Type: New Feature
>  Components: client
>Affects Versions: M5
>Reporter: Erick Tryzelaar
>Priority: Major
>  Labels: hackathon-feedback
> Fix For: n/a
>
>
> Could there be a builder added that simplifies the API to get a single row's 
> data, along the lines of 
> [this|http://github.mtv.cloudera.com/erickt/titan/blob/ea2683c92fd2dd79df0b6359f5d5520a78aea637/titan-kudu/src/main/java/com/cloudera/titan/diskstorage/kudu/KuduKeyValueStore.java#L314-L338]?





[jira] [Updated] (KUDU-1028) Be more graceful about clock unsynch errors

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1028:
--
Target Version/s:   (was: 1.5.0)

> Be more graceful about clock unsynch errors
> ---
>
> Key: KUDU-1028
> URL: https://issues.apache.org/jira/browse/KUDU-1028
> Project: Kudu
>  Issue Type: Improvement
>  Components: tserver
>Affects Versions: Feature Complete
>Reporter: David Alves
>Priority: Major
>
> We should likely refuse to execute the txns but not crash outright.





[jira] [Updated] (KUDU-1589) ksck should retry its Ping() RPC to tservers

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1589:
--
Target Version/s:   (was: 1.5.0)

> ksck should retry its Ping() RPC to tservers
> 
>
> Key: KUDU-1589
> URL: https://issues.apache.org/jira/browse/KUDU-1589
> Project: Kudu
>  Issue Type: Bug
>  Components: ksck
>Affects Versions: 0.10.0
>Reporter: Todd Lipcon
>Priority: Major
>
> I ran a ksck while a cluster was under a stress test and it determined that 
> one of the tablet servers was "unavailable" because it responded 
> SERVER_TOO_BUSY to the Ping() RPC. We should either retry the Ping() or make 
> sure that the Ping goes to the 'admin service' rpc queue instead of the 
> 'tablet service' RPC queue (which is more likely to be saturated)
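The first option (retrying the Ping) could look roughly like the sketch below. It is not ksck code: ServerTooBusy is a stand-in exception for a SERVER_TOO_BUSY response, and ping_with_retry, max_attempts and initial_backoff are hypothetical names chosen for this example.

```python
import time


class ServerTooBusy(Exception):
    """Stand-in for a SERVER_TOO_BUSY RPC response (hypothetical type)."""


def ping_with_retry(ping, max_attempts=5, initial_backoff=0.05, sleep=time.sleep):
    """Call ping(), retrying with exponential backoff while the server is busy."""
    backoff = initial_backoff
    for attempt in range(1, max_attempts + 1):
        try:
            return ping()
        except ServerTooBusy:
            if attempt == max_attempts:
                # Out of attempts: surface the error to the caller.
                raise
            sleep(backoff)
            backoff *= 2
```

Only the "too busy" condition is retried here, so a server that is genuinely unreachable would still be reported on the first failure.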





[jira] [Updated] (KUDU-1346) Consensus queue crashes creating message for peer due to batch size

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1346:
--
Target Version/s:   (was: 1.5.0)

> Consensus queue crashes creating message for peer due to batch size
> ---
>
> Key: KUDU-1346
> URL: https://issues.apache.org/jira/browse/KUDU-1346
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus
>Affects Versions: 0.7.0
>Reporter: David Alves
>Priority: Major
>
> bruce_song zhang hit the following error:
> {code}
> F0222 11:30:16.552686 23428 consensus_queue.cc:370] Check failed: 
> request->ByteSize() <= FLAGS_consensus_max_batch_size_bytes (3042252 vs. 
> 1048576) 
> F0222 11:30:16.552693 23416 consensus_queue.cc:370] Check failed: 
> request->ByteSize() <= FLAGS_consensus_max_batch_size_bytes (3042252 vs. 
> 1048576)
> {code}
> It seems plausible that we might allow a write that is bigger than the max 
> consensus batch size, and since there is a path in log cache that always 
> sends at least one message, we might fail this check and crash.
> The workaround is to set a bigger value than the default for 
> --consensus_max_batch_size_bytes
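To illustrate the suspected mechanism (this is a simplified model, not the actual log cache or consensus queue code): a batching loop that always includes the first message can return a batch larger than the limit, so a hard check that the batch fits in the limit will fail. The function name build_batch is made up for this sketch; the sizes are taken from the log lines above.

```python
def build_batch(message_sizes, max_batch_bytes):
    """Greedily batch messages, always taking at least the first one."""
    batch = []
    total = 0
    for size in message_sizes:
        # The first message is always included so the peer can make progress,
        # even if it alone is larger than the limit.
        if batch and total + size > max_batch_bytes:
            break
        batch.append(size)
        total += size
    return batch, total


batch, total = build_batch([3042252, 100], 1048576)
exceeds_limit = total > 1048576
```

Here a single 3042252-byte write is batched on its own and exceeds the 1048576-byte limit, which is the condition the failed check reports.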





[jira] [Updated] (KUDU-1067) Add metrics for tablet row count, size on disk

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1067:
--
Component/s: impala

> Add metrics for tablet row count, size on disk
> --
>
> Key: KUDU-1067
> URL: https://issues.apache.org/jira/browse/KUDU-1067
> Project: Kudu
>  Issue Type: Improvement
>  Components: impala, metrics, ops-tooling
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> Would be nice to expose these metrics up to CM. If we expose tablet-level 
> metrics, CM will aggregate by table for us.





[jira] [Updated] (KUDU-1067) Add metrics for tablet row count, size on disk

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1067:
--
Target Version/s:   (was: 1.5.0)

> Add metrics for tablet row count, size on disk
> --
>
> Key: KUDU-1067
> URL: https://issues.apache.org/jira/browse/KUDU-1067
> Project: Kudu
>  Issue Type: Improvement
>  Components: metrics, ops-tooling
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> Would be nice to expose these metrics up to CM. If we expose tablet-level 
> metrics, CM will aggregate by table for us.





[jira] [Updated] (KUDU-1097) Higher availability re-replication support

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1097:
--
Target Version/s: 1.7.0  (was: 1.5.0)

> Higher availability re-replication support
> --
>
> Key: KUDU-1097
> URL: https://issues.apache.org/jira/browse/KUDU-1097
> Project: Kudu
>  Issue Type: Sub-task
>  Components: consensus
>Affects Versions: Public beta
>Reporter: Mike Percy
>Assignee: Mike Percy
>Priority: Critical
>
> Relative to the re-replication support outlined in KUDU-1096, we can do 
> better in terms of availability properties. Here is a rough outline of such a 
> design.
> Design:
> # When a voter falls behind the leader's log GC threshold, the leader 
> notifies the Master that the voter is no longer up to date.
> # The Master selects a node to act as a replacement. It adds that node as a 
> PRE_VOTER to the config (see KUDU-869) and when that node is caught up, it is 
> automatically promoted to a VOTER.
> # When the Master detects that the node has been promoted, it removes the bad 
> node from the config.
> Additional cases to detect and handle:
> * If the config is in such a state that it would be impossible to add a node, 
> due to a voter that has fallen behind the log GC threshold being in the 
> required majority, then remotely bootstrap that voter without changing the 
> config. The tablet will continue to be unable to serve writes during this 
> time, but will self-heal without administrator intervention.
> This can be further improved by adding support for aborting a config-change 
> operation that cannot commit.
> This requires some additional plumbing from the leader to the Master to 
> notify it of slow followers.
> Pros:
> * Closer to optimal fault-tolerance properties; "majority lost" less likely 
> to occur so administrator intervention less likely 
> Cons:
> * Requires support for pre-voter and a smarter master.
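The three design steps can be traced with a toy model of the Raft config. This is only an illustration under simplifying assumptions: the config is a plain dict of node name to role, catch-up is assumed to succeed, and replace_lagging_voter is a hypothetical helper, not a Master API.

```python
def replace_lagging_voter(config, bad_node, replacement):
    """Yield a copy of the config after each step of the replacement sequence."""
    config = dict(config)  # leave the caller's config untouched

    # Step 2a: the Master adds the replacement as a PRE_VOTER.
    config[replacement] = "PRE_VOTER"
    yield dict(config)

    # Step 2b: once caught up, the replacement is promoted to VOTER.
    config[replacement] = "VOTER"
    yield dict(config)

    # Step 3: the Master removes the lagging node from the config.
    del config[bad_node]
    yield dict(config)


steps = list(replace_lagging_voter(
    {"A": "VOTER", "B": "VOTER", "C": "VOTER"}, bad_node="C", replacement="D"))
```

In this trace the lagging node is only removed after the replacement has become a VOTER.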





[jira] [Updated] (KUDU-1133) Scan range lengths should match the tablet size

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1133:
--
Target Version/s: 1.8.0  (was: 1.5.0)

> Scan range lengths should match the tablet size
> ---
>
> Key: KUDU-1133
> URL: https://issues.apache.org/jira/browse/KUDU-1133
> Project: Kudu
>  Issue Type: Task
>  Components: impala
>Affects Versions: Feature Complete
>Reporter: David Alves
>Priority: Major
>
> We're currently using a dummy number (1000) as the scan range length; we should 
> make that the tablet size.





[jira] [Updated] (KUDU-1160) Integrate alter-table into automated cluster testing

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1160:
--
Target Version/s: 1.8.0  (was: 1.5.0)

> Integrate alter-table into automated cluster testing
> 
>
> Key: KUDU-1160
> URL: https://issues.apache.org/jira/browse/KUDU-1160
> Project: Kudu
>  Issue Type: Task
>  Components: test
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Major
>
> Alter-table is the sort of thing that would benefit from scale testing and 
> testing with concurrent real workloads (eg adding a column to the ycsb table 
> while running YCSB). We should get that running regularly on integration test 
> clusters before GA.





[jira] [Updated] (KUDU-1172) Enable deleting tablets before or while they bootstrap

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1172:
--
Target Version/s:   (was: 1.5.0)

> Enable deleting tablets before or while they bootstrap
> --
>
> Key: KUDU-1172
> URL: https://issues.apache.org/jira/browse/KUDU-1172
> Project: Kudu
>  Issue Type: Improvement
>  Components: tserver
>Affects Versions: Feature Complete
>Reporter: Jean-Daniel Cryans
>Priority: Minor
>
> An issue we faced recently on the YCSB cluster is that some disks got full 
> and there was no easy way to bring the tservers back. We had a lot of old 
> tables we could have deleted but in order to do that the tablets must first 
> go through bootstrapping... which is impossible on a full disk.
> Ideas (Todd's mostly):
> - Add a configurable delay (default:0s) when a tserver starts before it 
> bootstraps, this way it could get DeleteTablet calls from the master.
> - Make it possible to delete tablets while they are in queue to bootstrap.
> - Make it possible to delete bootstrapping tablets.





[jira] [Updated] (KUDU-1219) OS X Limitations & Known Issues

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1219:
--
Target Version/s: 1.8.0  (was: 1.5.0)

> OS X Limitations & Known Issues
> ---
>
> Key: KUDU-1219
> URL: https://issues.apache.org/jira/browse/KUDU-1219
> Project: Kudu
>  Issue Type: Task
> Environment: OS X
>Reporter: Dan Burkert
>Assignee: Dan Burkert
>Priority: Major
>
> This is a tracking ticket for known issues of running Kudu on OS X.
> # The hybrid logical clock has error permanently set to 0us. This is a result 
> of the {{ntp_gettime}} (or similar) API not existing on Darwin.  The result 
> is that using the hybrid logical clock on a cluster of OS X hosts is 
> unsupported (a single-host Kudu installation is fine).
> # The Kudu client library does not properly hide non-public symbols. This is 
> a result of the {{--version-script}} option being unavailable on the OS X 
> system linker.
> # The log block manager is not supported on OS X. This is a result of OS X 
> not supporting sparse files and hole punching.
> # Some of the monitoring and debugging tools built in to Kudu do not work 
> properly.  In particular, stack traces (both user and kernel) may not work, 
> and the {{/pprof}} endpoint on server pages may not work correctly. 
> # ASAN tests will run and flag issues correctly, but LSAN is disabled (it is 
> Linux only), and there are many false positives.





[jira] [Updated] (KUDU-1334) Support pid_max > 16 bits in the mini cluster

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1334:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Support pid_max > 16 bits in the mini cluster
> -
>
> Key: KUDU-1334
> URL: https://issues.apache.org/jira/browse/KUDU-1334
> Project: Kudu
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.5.0
>Reporter: Jean-Daniel Cryans
>Priority: Major
>
> Pretty much anybody running on newer machines/platforms will hit this while 
> running the unit tests:
> {noformat}
> I0216 11:27:23.617383 110702 external_mini_cluster.cc:582] Started
> /home/stack/apache-kudu-incubating-0.7.0/build/rc/bin/kudu-master as pid
> 110706
> F0216 11:27:23.617473 110702 external_mini_cluster.cc:258] Check failed: p
> <= MathLimits::kMax (110702 vs. 65535) Cannot run on systems with
> >16-bit pid
> *** Check failure stack trace: ***
> {noformat}
> Having this limitation was fine but now it's something everybody hits.
> The workaround is running this:
> {noformat}
> echo "32768" > /proc/sys/kernel/pid_max
> {noformat}





[jira] [Updated] (KUDU-1326) Support auto timestamp column

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1326:
--
Target Version/s:   (was: 1.5.0)

> Support auto timestamp column
> -
>
> Key: KUDU-1326
> URL: https://issues.apache.org/jira/browse/KUDU-1326
> Project: Kudu
>  Issue Type: New Feature
>  Components: api
>Affects Versions: Public beta
>Reporter: kuduser
>Priority: Major
>
> It would be good to have a last_updated_time column that is automatically set 
> on record creation and update.





[jira] [Reopened] (KUDU-1632) Assert on request size in consensus_queue.cc

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reopened KUDU-1632:
---

> Assert on request size in consensus_queue.cc
> 
>
> Key: KUDU-1632
> URL: https://issues.apache.org/jira/browse/KUDU-1632
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus
>Affects Versions: 1.0.0
> Environment: Development environment: Linux or MacOS X will do.  
> Build the project in debug configuration.
>Reporter: Alexey Serbin
>Priority: Major
> Fix For: n/a
>
>
> After switching tests from running in MANUAL_FLUSH mode into 
> AUTO_FLUSH_BACKGROUND mode, the debug assert started to fire when running the 
> all_types-itest built in debug configuration and KUDU_ALLOW_SLOW_TESTS=1.  
> It's 100% reproducible: just run
> {noformat}
> KUDU_ALLOW_SLOW_TESTS=1 ./bin/all_types-itest  2>/tmp/all_types.log
> {noformat}
> The following stacktrace is reported by the test:
> {noformat}
> F0920 12:26:24.726744 13286 consensus_queue.cc:401] Check failed: 
> request->ByteSize() <= FLAGS_consensus_max_batch_size_bytes (1168286 vs. 
> 1048576)
> *** Check failure stack trace: ***
> @ 0x7f74c9f42294  google::LogMessage::SendToLog()
> @ 0x7f74c9f42790  google::LogMessage::Flush()
> @ 0x7f74c9f463c2  google::LogMessageFatal::~LogMessageFatal()
> @ 0x7f74cf109918 kudu::consensus::PeerMessageQueue::RequestForPeer()
> @ 0x7f74cf0fc302  kudu::consensus::Peer::SendNextRequest()
> @ 0x7f74cf0fdfaf  kudu::consensus::Peer::DoProcessResponse()
> @ 0x7f74cf1050a9  kudu::internal::RunnableAdapter<>::Run()
> @ 0x7f74cf10502c  kudu::internal::InvokeHelper<>::MakeItSo()
> @ 0x7f74cf104ffa  kudu::internal::Invoker<>::Run()
> @ 0x7f74cf1284ce  kudu::Callback<>::Run()
> @ 0x7f74cf12e7b9  boost::_mfi::cmf0<>::operator()()
> @ 0x7f74cf12e720  boost::_bi::list1<>::operator()<>()
> @ 0x7f74cf12e6ca  boost::_bi::bind_t<>::operator()()
> @ 0x7f74cf12e430 
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @ 0x7f74ce164b98  boost::function0<>::operator()()
> @ 0x7f74cab47589  kudu::FunctionRunnable::Run()
> @ 0x7f74cab45994  kudu::ThreadPool::DispatchThread()
> @ 0x7f74cab49cc9  boost::_mfi::mf1<>::operator()()
> @ 0x7f74cab49c27  boost::_bi::list2<>::operator()<>()
> @ 0x7f74cab49baa  boost::_bi::bind_t<>::operator()()
> @ 0x7f74cab49930 
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @ 0x7f74ce164b98  boost::function0<>::operator()()
> @ 0x7f74cab3ab30  kudu::Thread::SuperviseThread()
> @   0x3ae0e079d1  (unknown)
> @   0x3ae0ae88fd  (unknown)
> @  (nil)  (unknown)
> {noformat}
> The patch which introduces the change from MANUAL_FLUSH to 
> AUTO_FLUSH_BACKGROUND can be found at: https://gerrit.cloudera.org/#/c/4471/
> Also, if trying to reproduce the issue, set the 
> {{consensus_max_batch_size_bytes}} to 1048576 (i.e. 1MiB): as a temporary 
> workaround to avoid triggering the debug assert, the {{all_types-itest}} sets 
> it to 2MiB.





[jira] [Resolved] (KUDU-1632) Assert on request size in consensus_queue.cc

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke resolved KUDU-1632.
---
Resolution: Duplicate

> Assert on request size in consensus_queue.cc
> 
>
> Key: KUDU-1632
> URL: https://issues.apache.org/jira/browse/KUDU-1632
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus
>Affects Versions: 1.0.0
> Environment: Development environment: Linux or MacOS X will do.  
> Build the project in debug configuration.
>Reporter: Alexey Serbin
>Priority: Major
> Fix For: n/a
>
>
> After switching tests from running in MANUAL_FLUSH mode into 
> AUTO_FLUSH_BACKGROUND mode, the debug assert started to fire when running the 
> all_types-itest built in debug configuration and KUDU_ALLOW_SLOW_TESTS=1.  
> It's 100% reproducible: just run
> {noformat}
> KUDU_ALLOW_SLOW_TESTS=1 ./bin/all_types-itest  2>/tmp/all_types.log
> {noformat}
> The following stacktrace is reported by the test:
> {noformat}
> F0920 12:26:24.726744 13286 consensus_queue.cc:401] Check failed: 
> request->ByteSize() <= FLAGS_consensus_max_batch_size_bytes (1168286 vs. 
> 1048576)
> *** Check failure stack trace: ***
> @ 0x7f74c9f42294  google::LogMessage::SendToLog()
> @ 0x7f74c9f42790  google::LogMessage::Flush()
> @ 0x7f74c9f463c2  google::LogMessageFatal::~LogMessageFatal()
> @ 0x7f74cf109918 kudu::consensus::PeerMessageQueue::RequestForPeer()
> @ 0x7f74cf0fc302  kudu::consensus::Peer::SendNextRequest()
> @ 0x7f74cf0fdfaf  kudu::consensus::Peer::DoProcessResponse()
> @ 0x7f74cf1050a9  kudu::internal::RunnableAdapter<>::Run()
> @ 0x7f74cf10502c  kudu::internal::InvokeHelper<>::MakeItSo()
> @ 0x7f74cf104ffa  kudu::internal::Invoker<>::Run()
> @ 0x7f74cf1284ce  kudu::Callback<>::Run()
> @ 0x7f74cf12e7b9  boost::_mfi::cmf0<>::operator()()
> @ 0x7f74cf12e720  boost::_bi::list1<>::operator()<>()
> @ 0x7f74cf12e6ca  boost::_bi::bind_t<>::operator()()
> @ 0x7f74cf12e430 
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @ 0x7f74ce164b98  boost::function0<>::operator()()
> @ 0x7f74cab47589  kudu::FunctionRunnable::Run()
> @ 0x7f74cab45994  kudu::ThreadPool::DispatchThread()
> @ 0x7f74cab49cc9  boost::_mfi::mf1<>::operator()()
> @ 0x7f74cab49c27  boost::_bi::list2<>::operator()<>()
> @ 0x7f74cab49baa  boost::_bi::bind_t<>::operator()()
> @ 0x7f74cab49930 
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @ 0x7f74ce164b98  boost::function0<>::operator()()
> @ 0x7f74cab3ab30  kudu::Thread::SuperviseThread()
> @   0x3ae0e079d1  (unknown)
> @   0x3ae0ae88fd  (unknown)
> @  (nil)  (unknown)
> {noformat}
> The patch which introduces the change from MANUAL_FLUSH to 
> AUTO_FLUSH_BACKGROUND can be found at: https://gerrit.cloudera.org/#/c/4471/
> Also, if trying to reproduce the issue, set the 
> {{consensus_max_batch_size_bytes}} to 1048576 (i.e. 1MiB): as a temporary 
> workaround to avoid triggering the debug assert, the {{all_types-itest}} sets 
> it to 2MiB.





[jira] [Updated] (KUDU-1447) Document recommendation to disable THP

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1447:
--
Target Version/s: 1.7.0  (was: 1.5.0)

> Document recommendation to disable THP
> --
>
> Key: KUDU-1447
> URL: https://issues.apache.org/jira/browse/KUDU-1447
> Project: Kudu
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.8.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> Doing a bunch of cluster testing, I finally got to the root of why sometimes 
> threads take several seconds to start up, causing various timeout issues, 
> false elections, etc. It turns out that khugepaged does synchronous page 
> compaction while holding a process's mmap semaphore, and when that's 
> concurrent with lots of IO, can block for several seconds.
> https://lkml.org/lkml/2011/7/26/103
> To avoid this, we should tell users to set hugepages to "madvise" or "never" 
> -- it's not sufficient to just disable defrag, because khugepaged still runs 
> in the background in that case and causes this sporadic issue.
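A small check along these lines could accompany the documentation. It is a sketch, not part of Kudu: it assumes the mainline Linux sysfs file /sys/kernel/mm/transparent_hugepage/enabled, whose contents mark the active mode in square brackets (for example "[always] madvise never"). Some distributions use a different path, and the function names here are made up for the example.

```python
import re

# Mainline Linux location; assumed here, may differ on some distributions.
THP_ENABLED_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"


def active_thp_mode(contents):
    """Return the bracketed (active) mode from the sysfs file contents."""
    match = re.search(r"\[(\w+)\]", contents)
    return match.group(1) if match else None


def thp_setting_ok(contents):
    """True if hugepages are set to one of the recommended modes."""
    return active_thp_mode(contents) in ("madvise", "never")
```

On a live host the input would come from reading THP_ENABLED_PATH; the parsing is kept separate so it can be exercised without touching sysfs.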





[jira] [Updated] (KUDU-1366) Consider switching to jemalloc

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1366:
--
Target Version/s: 1.8.0  (was: 1.5.0)

> Consider switching to jemalloc
> --
>
> Key: KUDU-1366
> URL: https://issues.apache.org/jira/browse/KUDU-1366
> Project: Kudu
>  Issue Type: Bug
>  Components: perf
>Reporter: Todd Lipcon
>Priority: Major
> Attachments: Kudu Benchmarks.pdf, Kudu Benchmarks.pdf
>
>
> We spend a fair amount of time in the allocator. While we could spend some 
> time trying to use arenas more, it's also worth considering switching 
> allocators. I ran a few quick tests with jemalloc 4.1 and it seems like it 
> might be better than the version of tcmalloc that we use (and has much more 
> active development)





[jira] [Updated] (KUDU-1465) Large allocations for scanner result buffers harm allocator thread caching

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1465:
--
Target Version/s:   (was: 1.5.0)

> Large allocations for scanner result buffers harm allocator thread caching
> --
>
> Key: KUDU-1465
> URL: https://issues.apache.org/jira/browse/KUDU-1465
> Project: Kudu
>  Issue Type: Bug
>  Components: perf
>Affects Versions: 0.8.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> I was looking at the performance of a random-read stress test on a 70 node 
> cluster and found that threads were often spending time in allocator 
> contention, particularly when deallocating RpcSidecar objects. After a bit of 
> analysis, I determined this is because we always preallocate buffers of 1MB 
> (the default batch size) even if the response is only going to be a single 
> row. Such large allocations go directly to the central freelist instead of 
> using thread-local caches.
> As a simple test, I used the set_flag command to drop the default batch size 
> to 4KB, and the read throughput (reads/second) increased substantially.





[jira] [Updated] (KUDU-1502) Block cache with churn burns lots of CPU in MemTracker consume and release

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1502:
--
Target Version/s:   (was: 1.5.0)

> Block cache with churn burns lots of CPU in MemTracker consume and release
> --
>
> Key: KUDU-1502
> URL: https://issues.apache.org/jira/browse/KUDU-1502
> Project: Kudu
>  Issue Type: Bug
>  Components: perf, util
>Affects Versions: 0.9.0
>Reporter: Todd Lipcon
>Priority: Major
> Attachments: fg-memtracker-commented-out.svg, fg.svg
>
>
> I am running a random-write workload where the bloom filters don't fit in the 
> block cache (but do fit in page cache) which causes a lot of block cache 
> churn. I'm seeing MemTracker::Release and MemTracker::Consume take the 
> majority of CPU on the system.
> It seems like this is low-hanging fruit -- we don't need exactly up-to-date 
> accounting here so should be pretty easy to optimize.
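One way to trade exactness for fewer updates, sketched under assumptions: accumulate consume/release deltas locally and only touch the shared counter when the pending delta crosses a threshold. BatchedTracker and its fields are hypothetical names, not the MemTracker API, and in a real multi-threaded server the pending delta would be kept per thread.

```python
class BatchedTracker:
    """Accumulates deltas locally and flushes them to a shared total in batches."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold
        self.shared_total = 0    # stands in for the contended shared counter
        self.pending = 0         # local, not-yet-published delta
        self.shared_updates = 0  # how often the shared counter was touched

    def consume(self, nbytes):
        self._add(nbytes)

    def release(self, nbytes):
        self._add(-nbytes)

    def _add(self, delta):
        self.pending += delta
        # Publish only when the local drift is large enough to matter.
        if abs(self.pending) >= self.flush_threshold:
            self.shared_total += self.pending
            self.pending = 0
            self.shared_updates += 1
```

With churn that consumes and releases similar amounts, the pending delta stays near zero and the shared counter is rarely updated. The cost is that the shared total can be off by up to the threshold.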





[jira] [Updated] (KUDU-1466) C++ client errors misreported as GetTableLocations timeouts

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1466:
--
Target Version/s: 1.7.0  (was: 1.6.0)

> C++ client errors misreported as GetTableLocations timeouts
> ---
>
> Key: KUDU-1466
> URL: https://issues.apache.org/jira/browse/KUDU-1466
> Project: Kudu
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.8.0
>Reporter: Todd Lipcon
>Assignee: Alexey Serbin
>Priority: Critical
>
> client-test is currently very flaky due to this issue:
> - we are injecting some kind of failure on the tablet server (eg DNS 
> resolution failure)
> - when we fail to connect to the TS, we correctly re-trigger a lookup against 
> the master
> - depending how the backoffs and retries line up, we sometimes end up 
> triggering the lookup retry when the remaining operation budget is very short 
> (eg <10ms)
> -- this GetTabletLocations RPC times out since the master is unable to 
> respond within the ridiculously short timeout
> During the course of retrying some operation, we should probably not replace 
> the 'last_error' with a master error, so long as we have had at least one 
> successful master lookup (thus indicating that the master is not the problem)
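The proposed rule can be sketched as follows. This is a minimal illustration, not the client's actual retry code: `RetryState`, `OpError`, and `ErrorSource` are hypothetical names, and the only behavior modeled is "do not overwrite the stored error with a master error once a master lookup has succeeded".

```cpp
#include <string>
#include <utility>

// Hypothetical classification of where an error came from.
enum class ErrorSource { kTabletServer, kMaster };

struct OpError {
  ErrorSource source;
  std::string message;
};

// Tracks the error that will eventually be reported for one logical operation.
class RetryState {
 public:
  void RecordMasterLookupSuccess() { master_lookup_succeeded_ = true; }

  // Keeps the existing error if the new one is a master error and the master
  // has already answered at least one lookup during this operation.
  void RecordError(OpError err) {
    const bool suppress = has_error_ &&
                          err.source == ErrorSource::kMaster &&
                          master_lookup_succeeded_;
    if (!suppress) {
      last_error_ = std::move(err);
      has_error_ = true;
    }
  }

  bool has_error() const { return has_error_; }
  const OpError& last_error() const { return last_error_; }

 private:
  bool master_lookup_succeeded_ = false;
  bool has_error_ = false;
  OpError last_error_{ErrorSource::kTabletServer, ""};
};
```

Under this rule, a late lookup timeout caused by a nearly exhausted operation budget would no longer mask the tablet server failure that triggered the retries.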





[jira] [Updated] (KUDU-1503) Add a WAL truncation tool

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1503:
--
Target Version/s:   (was: 1.5.0)

> Add a WAL truncation tool
> -
>
> Key: KUDU-1503
> URL: https://issues.apache.org/jira/browse/KUDU-1503
> Project: Kudu
>  Issue Type: New Feature
>  Components: consensus, ops-tooling
>Affects Versions: 0.9.0
>Reporter: Mike Percy
>Priority: Major
>
> It would be useful to build a tool that allows truncation of a consensus 
> write-ahead log. This could be necessary if something happened in production 
> that wrote a "bad" record to a WAL and we need to back it out because it's 
> causing a crash, for example.





[jira] [Updated] (KUDU-1514) A tablet that ends up under replicated will spam logs

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1514:
--
Component/s: supportability

> A tablet that ends up under replicated will spam logs
> -
>
> Key: KUDU-1514
> URL: https://issues.apache.org/jira/browse/KUDU-1514
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus, supportability
>Reporter: Jean-Daniel Cryans
>Priority: Major
>
> Trying to debug a tablet that got down to 1 replica is hard when these lines 
> are printed multiple times per second:
> {noformat}
> W0704 23:05:30.999037   312 transaction_tracker.cc:112] Transaction failed, 
> tablet 807ff8e42640482d8d947b693d56ce03 transaction memory consumption 
> (67107918) has exceeded its limit (67108864) or the limit of an ancestral 
> tracker [suppressed 140 similar messages]
> I0704 23:05:31.000737 24321 consensus_peers.cc:181] T 
> 807ff8e42640482d8d947b693d56ce03 P 9e59a4c24de44e3f9de219df865b4f3b -> Peer 
> 94051c9253f94dadbc1af38098b41077 (e1105.halxg.cloudera.com:7050): Could not 
> obtain request from queue for peer: 94051c9253f94dadbc1af38098b41077. Status: 
> Not found: Failed to read ops 2302557..2325361: Segment 1118 which contained 
> index 2302557 has been GCed
> I0704 23:05:31.000780 24452 raft_consensus.cc:629] T 
> 807ff8e42640482d8d947b693d56ce03 P 9e59a4c24de44e3f9de219df865b4f3b [term 29 
> LEADER]: Processing failure of peer 94051c9253f94dadbc1af38098b41077 in term 
> 29 (The logs necessary to catch up peer 94051c9253f94dadbc1af38098b41077 have 
> been garbage collected. The follower will never be able to catch up (Not 
> found: Failed to read ops 2302557..2325361: Segment 1118 which contained 
> index 2302557 has been GCed)): There is already a config change operation in 
> progress. Unable to evict follower until it completes. Doing nothing.
> I0704 23:05:31.138310   378 raft_consensus.cc:1603] T 
> 807ff8e42640482d8d947b693d56ce03 P 9e59a4c24de44e3f9de219df865b4f3b [term 29 
> LEADER]: Leader election vote request: Denying vote to candidate 
> 94051c9253f94dadbc1af38098b41077 for term 5380 because replica is either 
> leader or believes a valid leader to be alive.
> {noformat}





[jira] [Updated] (KUDU-1506) Add Consensus "follower lag" metrics

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1506:
--
Target Version/s:   (was: 1.5.0)

> Add Consensus "follower lag" metrics
> 
>
> Key: KUDU-1506
> URL: https://issues.apache.org/jira/browse/KUDU-1506
> Project: Kudu
>  Issue Type: New Feature
>  Components: consensus, metrics, supportability
>Affects Versions: 0.9.0
>Reporter: Mike Percy
>Priority: Major
>
> It would be useful to have metrics that measured the lag time between leader 
> WAL writes and follower WAL writes. Imagine if a node on a cluster had a very 
> slow disk or was extremely overloaded. That node may constantly be falling 
> behind and/or remote bootstrapping. It would help to be able to monitor for 
> nodes that were constantly very far behind the leader (high seconds or 
> minutes) so that administrators could take a look at these slow machines and 
> either remove them from the cluster or fix the underlying issues.
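A leader-side approximation of such a metric is sketched below, assuming the follower acknowledges an op only after writing it to its own WAL. The lag is then the age of the oldest op the leader has appended but the follower has not yet acknowledged. `FollowerLagTracker` and its methods are hypothetical names, not existing Kudu classes.

```cpp
#include <chrono>
#include <cstdint>
#include <map>

using Clock = std::chrono::steady_clock;

// Tracks, for a single follower, when the leader appended each unacked op.
class FollowerLagTracker {
 public:
  // Called when the leader writes op_index to its WAL.
  void OnLeaderAppend(int64_t op_index, Clock::time_point when) {
    append_times_[op_index] = when;
  }

  // Called when the follower acknowledges everything up to op_index.
  void OnFollowerAck(int64_t op_index) {
    append_times_.erase(append_times_.begin(),
                        append_times_.upper_bound(op_index));
  }

  // Age of the oldest unacknowledged op; zero if the follower is caught up.
  std::chrono::milliseconds Lag(Clock::time_point now) const {
    if (append_times_.empty()) return std::chrono::milliseconds(0);
    return std::chrono::duration_cast<std::chrono::milliseconds>(
        now - append_times_.begin()->second);
  }

 private:
  std::map<int64_t, Clock::time_point> append_times_;
};
```

Exported per peer, a value that stays in the high seconds or minutes would flag the slow nodes the description has in mind.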





[jira] [Updated] (KUDU-1506) Add Consensus "follower lag" metrics

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1506:
--
Component/s: supportability

> Add Consensus "follower lag" metrics
> 
>
> Key: KUDU-1506
> URL: https://issues.apache.org/jira/browse/KUDU-1506
> Project: Kudu
>  Issue Type: New Feature
>  Components: consensus, metrics, supportability
>Affects Versions: 0.9.0
>Reporter: Mike Percy
>Priority: Major
>
> It would be useful to have metrics that measured the lag time between leader 
> WAL writes and follower WAL writes. Imagine if a node on a cluster had a very 
> slow disk or was extremely overloaded. That node may constantly be falling 
> behind and/or remote bootstrapping. It would help to be able to monitor for 
> nodes that were constantly very far behind the leader (high seconds or 
> minutes) so that administrators could take a look at these slow machines and 
> either remove them from the cluster or fix the underlying issues.





[jira] [Updated] (KUDU-1505) Add a tool to mark a tablet as quarantined

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1505:
--
Target Version/s:   (was: 1.5.0)

> Add a tool to mark a tablet as quarantined
> --
>
> Key: KUDU-1505
> URL: https://issues.apache.org/jira/browse/KUDU-1505
> Project: Kudu
>  Issue Type: New Feature
>  Components: ops-tooling, tablet
>Affects Versions: 0.9.0
>Reporter: Mike Percy
>Priority: Major
>
> It would be useful to have a tool that could mark a tablet as "quarantined" 
> or manually "deleted", so that the TSTabletManager does not attempt to start 
> the tablet. This would be helpful in the case that there is a problem with 
> the tablet that is causing a TS crash on startup, for example.
> Quarantining could carry out the following steps:
> 1. Construct a tar ball consisting of the contents of the tablet, including 
> superblock, data blocks, write ahead logs, consensus metadata, and any other 
> relevant information. This could be saved for later debugging by a Kudu 
> developer.
> 2. Safely remove all of the contents of the tablet, including all data blocks.





[jira] [Updated] (KUDU-1514) A tablet that ends up under replicated will spam logs

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1514:
--
Target Version/s: 1.8.0  (was: 1.5.0)

> A tablet that ends up under replicated will spam logs
> -
>
> Key: KUDU-1514
> URL: https://issues.apache.org/jira/browse/KUDU-1514
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus
>Reporter: Jean-Daniel Cryans
>Priority: Major
>
> Trying to debug a tablet that got down to 1 replica is hard when these lines 
> are printed multiple times per second:
> {noformat}
> W0704 23:05:30.999037   312 transaction_tracker.cc:112] Transaction failed, 
> tablet 807ff8e42640482d8d947b693d56ce03 transaction memory consumption 
> (67107918) has exceeded its limit (67108864) or the limit of an ancestral 
> tracker [suppressed 140 similar messages]
> I0704 23:05:31.000737 24321 consensus_peers.cc:181] T 
> 807ff8e42640482d8d947b693d56ce03 P 9e59a4c24de44e3f9de219df865b4f3b -> Peer 
> 94051c9253f94dadbc1af38098b41077 (e1105.halxg.cloudera.com:7050): Could not 
> obtain request from queue for peer: 94051c9253f94dadbc1af38098b41077. Status: 
> Not found: Failed to read ops 2302557..2325361: Segment 1118 which contained 
> index 2302557 has been GCed
> I0704 23:05:31.000780 24452 raft_consensus.cc:629] T 
> 807ff8e42640482d8d947b693d56ce03 P 9e59a4c24de44e3f9de219df865b4f3b [term 29 
> LEADER]: Processing failure of peer 94051c9253f94dadbc1af38098b41077 in term 
> 29 (The logs necessary to catch up peer 94051c9253f94dadbc1af38098b41077 have 
> been garbage collected. The follower will never be able to catch up (Not 
> found: Failed to read ops 2302557..2325361: Segment 1118 which contained 
> index 2302557 has been GCed)): There is already a config change operation in 
> progress. Unable to evict follower until it completes. Doing nothing.
> I0704 23:05:31.138310   378 raft_consensus.cc:1603] T 
> 807ff8e42640482d8d947b693d56ce03 P 9e59a4c24de44e3f9de219df865b4f3b [term 29 
> LEADER]: Leader election vote request: Denying vote to candidate 
> 94051c9253f94dadbc1af38098b41077 for term 5380 because replica is either 
> leader or believes a valid leader to be alive.
> {noformat}





[jira] [Updated] (KUDU-1520) Possible race between alter schema lock release and tablet shutdown

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1520:
--
Target Version/s:   (was: 1.5.0)

> Possible race between alter schema lock release and tablet shutdown
> ---
>
> Key: KUDU-1520
> URL: https://issues.apache.org/jira/browse/KUDU-1520
> Project: Kudu
>  Issue Type: Bug
>  Components: tablet
>Affects Versions: 0.9.1
>Reporter: Adar Dembo
>Priority: Major
>
> I've been running a new stress test that hammers a cluster with concurrent alter 
> and delete table requests, and one of my test runs failed with the following:
> {noformat}
> F0707 18:59:34.311122   373 rw_semaphore.h:145] Check failed: 
> base::subtle::NoBarrier_Load(_) == kWriteFlag (0 vs. 2147483648) 
> *** Check failure stack trace: ***
> @ 0x7f86cd37df5d  google::LogMessage::Fail() at ??:0
> @ 0x7f86cd37fe5d  google::LogMessage::SendToLog() at ??:0
> @ 0x7f86cd37da99  google::LogMessage::Flush() at ??:0
> @ 0x7f86cd3808ff  google::LogMessageFatal::~LogMessageFatal() at ??:0
> @ 0x7f86d4f77c78  kudu::rw_semaphore::unlock() at ??:0
> @ 0x7f86d3728de0  std::unique_lock<>::unlock() at ??:0
> @ 0x7f86d3727192  std::unique_lock<>::~unique_lock() at ??:0
> @ 0x7f86d3725582  
> kudu::tablet::AlterSchemaTransactionState::~AlterSchemaTransactionState() at 
> ??:0
> @ 0x7f86d37255be  
> kudu::tablet::AlterSchemaTransactionState::~AlterSchemaTransactionState() at 
> ??:0
> @ 0x7f86d4f68dce  std::default_delete<>::operator()() at ??:0
> @ 0x7f86d4f670b9  std::unique_ptr<>::~unique_ptr() at ??:0
> @ 0x7f86d374510e  
> kudu::tablet::AlterSchemaTransaction::~AlterSchemaTransaction() at ??:0
> @ 0x7f86d374514a  
> kudu::tablet::AlterSchemaTransaction::~AlterSchemaTransaction() at ??:0
> @ 0x7f86d373f532  kudu::DefaultDeleter<>::operator()() at ??:0
> @ 0x7f86d373df4a  
> kudu::internal::gscoped_ptr_impl<>::~gscoped_ptr_impl() at ??:0
> @ 0x7f86d373d552  gscoped_ptr<>::~gscoped_ptr() at ??:0
> @ 0x7f86d373d580  
> kudu::tablet::TransactionDriver::~TransactionDriver() at ??:0
> @ 0x7f86d3740ab4  kudu::RefCountedThreadSafe<>::DeleteInternal() at 
> ??:0
> @ 0x7f86d3740405  
> kudu::DefaultRefCountedThreadSafeTraits<>::Destruct() at ??:0
> @ 0x7f86d373f928  kudu::RefCountedThreadSafe<>::Release() at ??:0
> @ 0x7f86d373e769  scoped_refptr<>::~scoped_refptr() at ??:0
> @ 0x7f86d37397cd  kudu::tablet::TabletPeer::SubmitAlterSchema() at 
> ??:0
> @ 0x7f86d4f4e070  
> kudu::tserver::TabletServiceAdminImpl::AlterSchema() at ??:0
> @ 0x7f86d27a4e92  
> _ZZN4kudu7tserver26TabletServerAdminServiceIfC4ERK13scoped_refptrINS_12MetricEntityEEENKUlPKN6google8protobuf7MessageEPS9_PNS_3rpc10RpcContextEE1_clESB_SC_SF_
>  at ??:0
> @ 0x7f86d27a5d96  
> _ZNSt17_Function_handlerIFvPKN6google8protobuf7MessageEPS2_PN4kudu3rpc10RpcContextEEZNS6_7tserver26TabletServerAdminServiceIfC4ERK13scoped_refptrINS6_12MetricEntityEEEUlS4_S5_S9_E1_E9_M_invokeERKSt9_Any_dataS4_S5_S9_
>  at ??:0
> @ 0x7f86d22ce6e4  std::function<>::operator()() at ??:0
> @ 0x7f86d22ce19b  kudu::rpc::GeneratedServiceIf::Handle() at ??:0
> @ 0x7f86d22d0a97  kudu::rpc::ServicePool::RunThread() at ??:0
> @ 0x7f86d22d1d45  boost::_mfi::mf0<>::operator()() at ??:0
> @ 0x7f86d22d1b6c  boost::_bi::list1<>::operator()<>() at ??:0
> @ 0x7f86d22d1a61  boost::_bi::bind_t<>::operator()() at ??:0
> @ 0x7f86d22d1998  
> boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
> {noformat}
> After looking through the code a bit, I suspect this happened because, in the 
> event of failure, the AlterSchema transaction releases the tablet's schema 
> lock implicitly (i.e. when AlterSchemaTransactionState is destroyed) _after_ 
> the transaction itself is removed from the driver's TransactionTracker. Thus, 
> the WaitForAllToFinish() performed during the tablet shutdown process thinks 
> all the transactions are done and proceeds to free tablet state. Later, the 
> last reference to the transaction is released (in 
> TabletPeer::SubmitAlterSchema), the transaction is destroyed, and we try to 
> unlock a lock whose memory has already been freed.
> If this analysis is correct, the broken invariant is: once the transaction 
> has been released from the tracker, it may no longer access any tablet state.
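If the analysis holds, the invariant could be enforced by releasing the schema lock explicitly before the transaction leaves the tracker, rather than relying on the destructor. The sketch below models only that ordering; `Tracker` and `AlterTxn` are simplified, hypothetical stand-ins for the TransactionTracker and the AlterSchema transaction, with a plain `std::mutex` in place of the tablet's rw_semaphore.

```cpp
#include <mutex>
#include <set>

// Hypothetical stand-in for the tracker that tablet shutdown waits on.
class Tracker {
 public:
  void Add(const void* txn) {
    std::lock_guard<std::mutex> l(mu_);
    in_flight_.insert(txn);
  }
  void Release(const void* txn) {
    std::lock_guard<std::mutex> l(mu_);
    in_flight_.erase(txn);
  }
  bool AllFinished() {
    std::lock_guard<std::mutex> l(mu_);
    return in_flight_.empty();
  }

 private:
  std::mutex mu_;
  std::set<const void*> in_flight_;
};

// Holds the (tablet-owned) schema lock for the lifetime of the transaction.
class AlterTxn {
 public:
  AlterTxn(Tracker* tracker, std::mutex* schema_lock)
      : tracker_(tracker), schema_lock_(*schema_lock) {
    tracker_->Add(this);
  }
  ~AlterTxn() { Finish(); }

  // Drops every reference to tablet state first, then leaves the tracker, so
  // shutdown cannot free the lock while this transaction can still touch it.
  void Finish() {
    if (finished_) return;
    if (schema_lock_.owns_lock()) schema_lock_.unlock();
    tracker_->Release(this);
    finished_ = true;
  }

  bool holds_schema_lock() const { return schema_lock_.owns_lock(); }

 private:
  Tracker* tracker_;
  std::unique_lock<std::mutex> schema_lock_;
  bool finished_ = false;
};
```

Once `Finish()` has run, the destructor is a no-op with respect to tablet state, so it no longer matters when the last reference to the transaction is dropped.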





[jira] [Updated] (KUDU-1525) Create metrics for errors

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1525:
--
Target Version/s: 1.8.0  (was: 1.5.0)

> Create metrics for errors
> -
>
> Key: KUDU-1525
> URL: https://issues.apache.org/jira/browse/KUDU-1525
> Project: Kudu
>  Issue Type: Improvement
>  Components: supportability
>Reporter: Jean-Daniel Cryans
>Priority: Major
>
> There's a class of issue that can be hard to debug, namely when things fail 
> semi-silently on the client-side. We currently have glog_warning_messages and 
> glog_error_messages, but it could be good to have more granular metrics. A 
> few I have in mind:
>  - rpc errors, basically any "recv error"
>  - server-level errors, like when it says TOO BUSY.
>  - any kind of insert rejection, right now we have row key duplicates and 
> memory pressure, but we're missing things like txn_tracker rejections, "not a 
> leader".
>  - raft errors like dropping a follower because we don't have the WALs around 
> and it's lagging too much.
> There's probably more but the above would be a good start.
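A minimal sketch of what more granular counters might look like, with one counter per error category from the list above. The enum values and the `ErrorCounters` class are hypothetical and are not wired into Kudu's metrics framework; a real implementation would presumably register each counter with the existing metric registry.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical error categories, taken from the list in the description.
enum class ErrorKind : std::size_t {
  kRpcRecvError = 0,
  kServerTooBusy,
  kTxnTrackerRejection,
  kNotLeader,
  kFollowerDroppedMissingWal,
  kNumKinds  // must stay last
};

// One relaxed atomic counter per category; cheap enough for hot paths.
class ErrorCounters {
 public:
  void Increment(ErrorKind kind) {
    counts_[static_cast<std::size_t>(kind)].fetch_add(
        1, std::memory_order_relaxed);
  }
  uint64_t Get(ErrorKind kind) const {
    return counts_[static_cast<std::size_t>(kind)].load(
        std::memory_order_relaxed);
  }

 private:
  std::array<std::atomic<uint64_t>,
             static_cast<std::size_t>(ErrorKind::kNumKinds)>
      counts_{};
};
```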





[jira] [Updated] (KUDU-1554) Tombstoned replicas remain on TS even after table is deleted

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1554:
--
Target Version/s:   (was: 1.5.0)

> Tombstoned replicas remain on TS even after table is deleted
> 
>
> Key: KUDU-1554
> URL: https://issues.apache.org/jira/browse/KUDU-1554
> Project: Kudu
>  Issue Type: Bug
>  Components: master, tserver
>Affects Versions: 1.2.0
>Reporter: Todd Lipcon
>Priority: Minor
>
> If a replica is deleted on a live table, a tombstone replica is left with 
> TABLET_DATA_TOMBSTONED state. If the table is then deleted, those tombstones 
> aren't cleaned up, and will remain on the tserver until the next time the 
> tserver restarts.
> Not a big deal, but it may be confusing to users to see these tombstones 
> sticking around.





[jira] [Updated] (KUDU-1575) Backup and restore procedures

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1575:
--
Issue Type: Improvement  (was: Bug)

> Backup and restore procedures
> -
>
> Key: KUDU-1575
> URL: https://issues.apache.org/jira/browse/KUDU-1575
> Project: Kudu
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Mike Percy
>Priority: Major
>
> Kudu needs backup and restore procedures, both for data and for metadata.





[jira] [Updated] (KUDU-1584) follower memory throttling results in error log messages on the leader

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1584:
--
Target Version/s:   (was: 1.5.0)

> follower memory throttling results in error log messages on the leader
> --
>
> Key: KUDU-1584
> URL: https://issues.apache.org/jira/browse/KUDU-1584
> Project: Kudu
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Priority: Major
>
> W0828 22:07:42.156687 49842 consensus_peers.cc:333] T 
> c3810e04cd5f4ce8aa8cef40bcf15e33 P f92dc14d005d45e08ab52cf8142ea5b1 -> Peer 
> 83e1da1e50ac4fbb9efa3310d58bb8ef (e1216.halxg.cloudera.com:7050): Couldn't 
> send request to peer 83e1da1e50ac4fbb9efa3310d58bb8ef for tablet 
> c3810e04cd5f4ce8aa8cef40bcf15e33. Status: Runtime error: (unknown error 
> code). Retrying in the next heartbeat period. Already tried 8 times.





[jira] [Updated] (KUDU-1575) Backup and restore procedures

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1575:
--
Target Version/s: 1.8.0  (was: 1.5.0)

> Backup and restore procedures
> -
>
> Key: KUDU-1575
> URL: https://issues.apache.org/jira/browse/KUDU-1575
> Project: Kudu
>  Issue Type: Bug
>  Components: master, tserver
>Reporter: Mike Percy
>Priority: Major
>
> Kudu needs backup and restore procedures, both for data and for metadata.





[jira] [Updated] (KUDU-1587) Memory-based backpressure is insufficient on seek-bound workloads

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1587:
--
Target Version/s: 1.7.0  (was: 1.6.0)

> Memory-based backpressure is insufficient on seek-bound workloads
> -
>
> Key: KUDU-1587
> URL: https://issues.apache.org/jira/browse/KUDU-1587
> Project: Kudu
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: 0.10.0
>Reporter: Todd Lipcon
>Priority: Critical
> Attachments: graph.png, queue-time.png
>
>
> I pushed a uniform random insert workload from a bunch of clients to the 
> point that the vast majority of bloom filters no longer fit in buffer cache, 
> and the compaction had fallen way behind. Thus, every inserted row turns into 
> 40+ seeks (due to non-compact data) and takes 400-500ms. In this kind of 
> workload, the current backpressure (based on memory usage) is insufficient to 
> prevent ridiculously long queues.





[jira] [Updated] (KUDU-1592) Documentation that mentions file block manager should sound more ominous

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1592:
--
Target Version/s: 1.7.0  (was: 1.5.0)

> Documentation that mentions file block manager should sound more ominous
> 
>
> Key: KUDU-1592
> URL: https://issues.apache.org/jira/browse/KUDU-1592
> Project: Kudu
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Todd Lipcon
>Priority: Major
>
> In troubleshooting.adoc, as well as in the error message when we fail to hole 
> punch, we suggest using the file block manager as a workaround. It says 
> something vague about "at the cost of some scalability and efficiency" but 
> should be something a lot more ominous -- users are quickly running out of 
> file descriptors if they try the FBM.





[jira] [Commented] (KUDU-1683) Kudu client support for pushing runtime min/max filters

2018-02-16 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367372#comment-16367372
 ] 

Grant Henke commented on KUDU-1683:
---

Is this Jira complete? I see 
[IMPALA-4252|https://issues.cloudera.org/browse/IMPALA-4252] is resolved. cc 
[~twmarshall] [~tlipcon]

> Kudu client support for pushing runtime min/max filters
> ---
>
> Key: KUDU-1683
> URL: https://issues.apache.org/jira/browse/KUDU-1683
> Project: Kudu
>  Issue Type: New Feature
>  Components: client, perf
>Affects Versions: 1.0.0
>Reporter: Matthew Jacobs
>Priority: Major
>  Labels: impala
>
> Impala would like to generate runtime min/max filters to be pushed to Kudu, 
> at least for scan tokens that haven't been opened yet.
> https://issues.cloudera.org/browse/IMPALA-4252





[jira] [Updated] (KUDU-1702) Document/Implement read-your-writes for Impala/Spark etc.

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1702:
--
Target Version/s:   (was: 1.5.0)

> Document/Implement read-your-writes for Impala/Spark etc.
> -
>
> Key: KUDU-1702
> URL: https://issues.apache.org/jira/browse/KUDU-1702
> Project: Kudu
>  Issue Type: Sub-task
>  Components: client, tablet, tserver
>Affects Versions: 1.1.0
>Reporter: David Alves
>Assignee: David Alves
>Priority: Major
>
> Engines like Impala/Spark use many independent client instances, so we should 
> provide a way to have read-your-writes across many independent client 
> instances, which translates to providing a way to get linearizable behavior. 
> At first this can be done using the APIs that are already available. For 
> instance, if the objective is to be sure to have the results of a write in a 
> following scan, the following steps can be taken:
> - After a write the engine should collect the last observed timestamps from 
> kudu clients
> - The engine's coordinator then takes the max of those timestamps, adds 1 and 
> uses that as a snapshot scan timestamp.
> One important pre-requisite of the behavior above is that scans be done in 
> READ_AT_SNAPSHOT mode. Also the steps above currently don't actually 
> guarantee the expected behavior, but should once the current anomalies are 
> taken care of (as part of KUDU-430).
> In the immediate future we'll add APIs to the Kudu client so as to make the 
> inner workings of getting this behavior opaque to the engine. The steps 
> will still be the same, i.e. timestamps or timestamp tokens will still be 
> passed around, but the kudu client will encapsulate the choice of the 
> timestamp for the scan.
> Later we will add a way to obtain this behavior without timestamp 
> propagation, either by doing a write-side commit-wait, where clients wait out 
> the clock error after/during the last write thus making sure any future 
> operation will have a higher timestamp; or by doing a read-side commit-wait, 
> where we provide an api on the kudu client for the engine to perform a 
> similar call before the scan call to obtain a scan timestamp.
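The two coordinator steps described above (take the max of the collected timestamps, add 1) can be sketched as a small helper. `ChooseSnapshotTimestamp` is a hypothetical name. In the C++ client the inputs would presumably come from `KuduClient::GetLatestObservedTimestamp()` and the result would be handed to a scanner in READ_AT_SNAPSHOT mode via `KuduScanner::SetSnapshotRaw()`, but the sketch deliberately does not depend on the client library.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Computes the snapshot scan timestamp from the last observed timestamps
// collected from every client instance that took part in the write.
// Returns false (leaving *snapshot_ts untouched) if nothing was collected.
bool ChooseSnapshotTimestamp(const std::vector<uint64_t>& observed,
                             uint64_t* snapshot_ts) {
  if (observed.empty()) return false;
  // max + 1, so the snapshot is strictly after every write that was observed.
  *snapshot_ts = *std::max_element(observed.begin(), observed.end()) + 1;
  return true;
}
```

As the description notes, this only yields read-your-writes if the scan is done in READ_AT_SNAPSHOT mode, and the guarantee depends on the anomalies tracked in KUDU-430 being resolved.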





[jira] [Updated] (KUDU-1736) kudu crash in debug build: unordered undo delta

2018-02-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1736:
--
Target Version/s: 1.7.0  (was: 1.6.0)

> kudu crash in debug build: unordered undo delta
> ---
>
> Key: KUDU-1736
> URL: https://issues.apache.org/jira/browse/KUDU-1736
> Project: Kudu
>  Issue Type: Bug
>  Components: tablet
>Reporter: zhangsong
>Priority: Critical
> Attachments: mt-tablet-test-20171123.txt.xz, mt-tablet-test.txt, 
> mt-tablet-test.txt.gz
>
>
> In the jd cluster we hit a kudu-tserver crash with the following fatal 
> message:
> Check failed: last_key_.CompareTo(key) <= 0 must insert undo deltas in 
> sorted order (ascending key, then descending ts): got key (row 
> 1422@tx6052042821982183424) after (row 1422@tx6052042821953155072)
> This is a DCHECK, which should not fail.
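To make the failed check concrete, here is a simplified model of the ordering it enforces: ascending row, then descending timestamp. `DeltaKey` below is a hypothetical two-field stand-in, not Kudu's actual class, and `InUndoOrder` mirrors the `last_key_.CompareTo(key) <= 0` condition from the message.

```cpp
#include <cstdint>

// Simplified delta key: (row index, transaction timestamp).
struct DeltaKey {
  uint32_t row_idx;
  uint64_t timestamp;

  // Negative if *this sorts before other, 0 if equal, positive if after.
  // Order: ascending row index, then descending timestamp.
  int CompareTo(const DeltaKey& other) const {
    if (row_idx != other.row_idx) return row_idx < other.row_idx ? -1 : 1;
    if (timestamp != other.timestamp) {
      return timestamp > other.timestamp ? -1 : 1;
    }
    return 0;
  }
};

// True if `next` may legally be appended after `last`.
bool InUndoOrder(const DeltaKey& last, const DeltaKey& next) {
  return last.CompareTo(next) <= 0;
}
```

Plugging in the values from the log, both keys are for row 1422 and the key being inserted has the larger timestamp (6052042821982183424 versus 6052042821953155072), so under this model it should have come first, which is exactly what the check rejects.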



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

