[jira] [Commented] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-13 Thread jiaqiyang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720962#comment-16720962
 ] 

jiaqiyang commented on KUDU-2638:
-

Yes. First, thank you very much; I know the log I provided is not enough.

Thank you for your attention!

I will post the full log from the tserver.

I am very interested in Kudu!

> kudu cluster restart very long time to reused
> -
>
> Key: KUDU-2638
> URL: https://issues.apache.org/jira/browse/KUDU-2638
> Project: Kudu
>  Issue Type: Improvement
>Reporter: jiaqiyang
>Priority: Major
> Fix For: n/a
>
>
> When I restart my kudu cluster, all tablets are unavailable.
> Running kudu cluster ksck shows:
> Table Summary
> Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
> -----+-------------+---------------+---------+------------------+------------
> t1   | HEALTHY     | 1             | 1       | 0                | 0
> t2   | UNAVAILABLE | 5             | 0       | 1                | 4
> t3   | UNAVAILABLE | 6             | 2       | 0                | 4
> t3   | UNAVAILABLE | 3             | 0       | 0                | 3



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-2639) How do I clear kudu memory or how do I ensure that kudu memory does not exceed the limit

2018-12-13 Thread Adar Dembo (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720932#comment-16720932
 ] 

Adar Dembo commented on KUDU-2639:
--

If I'm understanding you correctly, you're asking what will happen if you try 
to insert 10G of data when your limit is configured to 8G? Will the limit be 
exceeded?

It's a difficult question to answer definitively. As Kudu approaches the memory 
limit, there will be more and more backpressure on incoming writes. When the 
memory limit is reached, pretty much all writes will fail. In theory this 
allows Kudu to flush to disk without accumulating additional data and to bring 
the memory consumption back down. I say "in theory" because it's not an 
airtight system. As I mentioned before, scans aren't accounted for in memory 
consumption, and there's no scan-side backpressure when memory consumption is 
high. So it's possible for consumption to be at the limit and to accept new 
scans that further increase the consumption.

But, in most cases, your 10G insert workload will receive backpressure and will 
slow down so that the server can flush enough data to stay under its 8G memory 
limit.
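To make the backpressure shape concrete, here is a minimal, self-contained sketch. This is not the Kudu client API; all names and numbers are hypothetical. A toy "server" rejects writes once consumption reaches an 8G limit, and the client treats the rejection as recoverable and retries while flushes drain memory:

```java
// Toy model of memory-limit backpressure: a "server" whose memory fills as
// writes arrive and drains as background flushes run. Writes are rejected
// (a recoverable error) once consumption reaches the soft limit, and the
// client retries, letting flushes catch up. Hypothetical names throughout;
// this is an illustration, not Kudu's actual implementation.
class BackpressureDemo {
    static final long LIMIT = 8;   // "8G" soft memory limit
    static long consumed = 0;      // current tracked consumption, in "G"

    // Server side: accept a 1G write unless we're at the limit.
    static boolean tryWrite() {
        if (consumed >= LIMIT) return false;  // backpressure: reject the write
        consumed++;
        return true;
    }

    // Server side: a flush releases memory back under the limit.
    static void flush() {
        consumed = Math.max(0, consumed - 2);
    }

    // Client side: retry rejected writes; stands in for the backoff-and-retry
    // a client performs while the server flushes.
    static int insertAll(int gigabytes) {
        int retries = 0;
        for (int g = 0; g < gigabytes; ) {
            if (tryWrite()) {
                g++;
            } else {
                retries++;
                flush();  // stand-in for "wait while the server flushes"
            }
        }
        return retries;
    }

    public static void main(String[] args) {
        int retries = insertAll(10);  // 10G of inserts against an 8G limit
        System.out.println("completed with " + retries + " retried writes, "
            + "final consumption " + consumed + "G");
    }
}
```

The point is only the shape of the interaction: rejected writes are recoverable, and as long as flushes keep draining memory, the 10G workload eventually completes without the limit being exceeded.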

 

> How do I clear kudu memory or how do I ensure that kudu memory does not 
> exceed the limit
> 
>
> Key: KUDU-2639
> URL: https://issues.apache.org/jira/browse/KUDU-2639
> Project: Kudu
>  Issue Type: Bug
>  Components: master, metrics, server
>Affects Versions: 1.7.0
>Reporter: wangkang
>Priority: Major
> Fix For: n/a
>
> Attachments: 1544690968288.jpg, 1544691002343.jpg
>
>
> When I insert 1.2 gigabytes of data, the server value keeps increasing, 
> reaching a peak of 3.2 gigabytes and the memory utilization reaches 48%. So 
> if I want to insert more, is it possible to cause memory usage limit? How to 
> avoid this situation? Can the memory used by this server be cleared manually?





[jira] [Commented] (KUDU-2639) How do I clear kudu memory or how do I ensure that kudu memory does not exceed the limit

2018-12-13 Thread wangkang (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720852#comment-16720852
 ] 

wangkang commented on KUDU-2639:


I understand that the difference between the peak and the total consumption was 
memory used by in-flight write data, which was released afterwards, bringing 
consumption back down. What I want to ask is: if I insert a large amount of 
data, say 10G, the peak at that point will be bigger. Could consumption then 
exceed the memory hard limit and make the service unavailable?

> How do I clear kudu memory or how do I ensure that kudu memory does not 
> exceed the limit
> 
>
> Key: KUDU-2639
> URL: https://issues.apache.org/jira/browse/KUDU-2639
> Project: Kudu
>  Issue Type: Bug
>  Components: master, metrics, server
>Affects Versions: 1.7.0
>Reporter: wangkang
>Priority: Major
> Fix For: n/a
>
> Attachments: 1544690968288.jpg, 1544691002343.jpg
>
>
> When I insert 1.2 gigabytes of data, the server value keeps increasing, 
> reaching a peak of 3.2 gigabytes and the memory utilization reaches 48%. So 
> if I want to insert more, is it possible to cause memory usage limit? How to 
> avoid this situation? Can the memory used by this server be cleared manually?





[jira] [Commented] (KUDU-2387) exportAuthenticationCredentials does not retry connectToCluster

2018-12-13 Thread Adar Dembo (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720732#comment-16720732
 ] 

Adar Dembo commented on KUDU-2387:
--

I spent some time independently looking into this before realizing there was 
already a JIRA filed. Below are my notes, stored for posterity.

When {{ConnectToCluster}} fails to find a leader master, it'll invoke its 
deferred's callback with an exception of type {{RecoverableException}}. As far 
as I can tell, whether or not the request is retried depends on what errbacks 
have been chained to that deferred. If we reached {{ConnectToCluster}} via 
{{sendRpcToTablet()}}, it ensures that a {{RetryRpcErrback}} is chained to the 
deferred, and when invoked, that errback looks at the exception type, and if 
it's recoverable, it schedules a retry of the RPC. So the important observation 
here is that invoking a deferred with a {{RecoverableException}} isn't enough 
for an RPC to be retried; we must ensure that your call chain includes 
{{RetryRpcErrback}} (usually done by calling {{sendRpcToTablet()}}).

So what's going on? Well, {{exportAuthenticationCredentials()}} uses 
{{ConnectToCluster}} via {{getMasterTableLocationsPB()}}, but there's no 
wrapping in {{sendRpcToTablet()}}. So I don't see who on the call chain is 
going to retry the {{RecoverableException}}, and it makes sense that the 
exception just propagates up the stack and fails the application. BTW, 
{{getHiveMetastoreConfig()}} is equally vulnerable to this.
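The chaining behavior described above can be sketched with a toy deferred. This is an illustration of the mechanism only, not Kudu's actual async code; the class names are borrowed from the comment but the implementation is hypothetical:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Function;

// Toy model of the Deferred/errback mechanics: firing a deferred with a
// RecoverableException only leads to a retry if something like
// RetryRpcErrback was chained first. Otherwise the exception propagates
// unchanged to the application, as happens on the
// exportAuthenticationCredentials() path.
class DeferredDemo {
    static class RecoverableException extends RuntimeException {}

    static class Deferred {
        private final Deque<Function<Throwable, Throwable>> errbacks =
            new ArrayDeque<>();

        void addErrback(Function<Throwable, Throwable> eb) { errbacks.add(eb); }

        // Fire the error path; returns null if some errback consumed it.
        Throwable fireError(Throwable t) {
            for (Function<Throwable, Throwable> eb : errbacks) {
                t = eb.apply(t);
                if (t == null) return null;  // handled: retry was scheduled
            }
            return t;  // unhandled: propagates to the application
        }
    }

    static int retriesScheduled = 0;

    // Consumes recoverable errors by "scheduling a retry"; passes others on.
    static Throwable retryRpcErrback(Throwable t) {
        if (t instanceof RecoverableException) {
            retriesScheduled++;
            return null;
        }
        return t;
    }

    public static void main(String[] args) {
        // Path via sendRpcToTablet(): RetryRpcErrback is chained, so a
        // recoverable failure is retried rather than surfaced.
        Deferred viaSendRpc = new Deferred();
        viaSendRpc.addErrback(DeferredDemo::retryRpcErrback);
        Throwable r1 = viaSendRpc.fireError(new RecoverableException());

        // Path via exportAuthenticationCredentials(): nothing consumes the
        // RecoverableException, so it reaches the caller unchanged.
        Deferred viaExport = new Deferred();
        Throwable r2 = viaExport.fireError(new RecoverableException());

        System.out.println("retried=" + (r1 == null)
            + " propagated=" + (r2 != null)
            + " retriesScheduled=" + retriesScheduled);
    }
}
```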

Here's the log that helped me make sense of this. It's a failure in 
DefaultSourceTest with the cluster logging stripped away:
{noformat}
20:09:07.846 [INFO - Test worker] (RandomUtils.java:46) Using random seed: 
1543896547846
20:09:07.847 [INFO - Test worker] (KuduTestHarness.java:137) Creating a new 
MiniKuduCluster...
20:09:07.847 [INFO - Test worker] (MiniKuduCluster.java:146) Using the temp 
directory defined by TEST_TMPDIR: 
/data/somelongdirectorytoavoidrpathissues/src/kudutest
20:09:07.847 [INFO - Test worker] (KuduBinaryLocator.java:52) Using Kudu binary 
directory specified by system property 'kuduBinDir': 
/data/somelongdirectorytoavoidrpathissues/src/kudu/java/../build/latest/bin
20:09:07.847 [INFO - Test worker] (MiniKuduCluster.java:197) Starting process: 
[/data/somelongdirectorytoavoidrpathissues/src/kudu/java/../build/latest/bin/kudu,
 test, mini_cluster, --serialization=pb]
20:09:08.266 [INFO - Test worker] (KuduTestHarness.java:139) Creating a new 
Kudu client...
20:09:08.337 [WARN - New I/O worker #587] (ConnectToCluster.java:278) None of 
the provided masters 
127.27.137.61:53849,127.27.137.60:37363,127.27.137.62:55005 is a leader; will 
retry
20:09:08.373 [ERROR - Test worker] (RetryRule.java:76) 
testUpsertRowsIgnoreNulls(org.apache.kudu.spark.kudu.DefaultSourceTest): failed 
run 1
java.security.PrivilegedActionException: 
org.apache.kudu.client.NoLeaderFoundException: Master config 
(127.27.137.61:53849,127.27.137.60:37363,127.27.137.62:55005) has no leader.
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.kudu.spark.kudu.KuduContext.(KuduContext.scala:122)
at org.apache.kudu.spark.kudu.KuduContext.(KuduContext.scala:65)
at 
org.apache.kudu.spark.kudu.KuduTestSuite$class.setUpBase(KuduTestSuite.scala:129)
at 
org.apache.kudu.spark.kudu.DefaultSourceTest.setUpBase(DefaultSourceTest.scala:38)
at sun.reflect.GeneratedMethodAccessor60.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at 
org.apache.kudu.test.junit.RetryRule$RetryStatement.evaluate(RetryRule.java:72)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at 

[jira] [Commented] (KUDU-2387) exportAuthenticationCredentials does not retry connectToCluster

2018-12-13 Thread Adar Dembo (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720726#comment-16720726
 ] 

Adar Dembo commented on KUDU-2387:
--

This issue causes flakiness in every test that calls 
exportAuthenticationCredentials() without using the same hack as TestSecurity. 
At the time of writing, this includes:
{noformat}
TestKuduClient.testGetAuthnToken
TestKuduClient.testCloseShortlyAfterOpen
TestKuduClient.testNoLogSpewOnConnectionRefused
{noformat}

Additionally, there are calls in KuduTableMapReduceUtil (kudu-mapreduce) and 
KuduContext (kudu-spark) that not only cause all associated tests to be flaky, 
but are also vulnerabilities in the product itself: if someone calls 
exportAuthenticationCredentials() on a fresh KuduClient during a master leader 
election, it's liable to fail and not retry.

Finally, getHiveMetastoreConfig() (from the new HMS integration code) is 
structured like exportAuthenticationCredentials(), so it (and its dependents) 
is equally vulnerable.


> exportAuthenticationCredentials does not retry connectToCluster
> ---
>
> Key: KUDU-2387
> URL: https://issues.apache.org/jira/browse/KUDU-2387
> Project: Kudu
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.0
>Reporter: Todd Lipcon
>Priority: Critical
>
> TestSecurity has the following TODO:
> {code}
> // TODO(todd): it seems that exportAuthenticationCredentials() doesn't 
> properly retry
> // in the case that there is no leader, even though 
> NoLeaderFoundException is a RecoverableException.
> // So, we have to use a hack of calling listTabletServers, which _does_ 
> properly retry,
> // in order to wait for the masters to elect a leader.
> {code}
> It seems like this causes occasional failures of tests like KuduRDDTest -- I 
> saw a case where the client failed to connect due to a negotiation timeout, 
> and then didn't retry at all. It's not clear why the 3-second negotiation 
> timeout was insufficient in this test case but likely just machine load or 
> somesuch.





[jira] [Resolved] (KUDU-2219) org.apache.kudu.client.TestKuduClient.testCloseShortlyAfterOpen flaky

2018-12-13 Thread Adar Dembo (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adar Dembo resolved KUDU-2219.
--
   Resolution: Duplicate
 Assignee: (was: Andrew Wong)
Fix Version/s: n/a

Hard to say whether Todd's original report was due to KUDU-2387, but Alexey's 
certainly was. I'm going to optimistically close this as a dupe; after fixing 
KUDU-2387, we can reopen this bug if the failure Todd reported resurfaces.

> org.apache.kudu.client.TestKuduClient.testCloseShortlyAfterOpen flaky
> -
>
> Key: KUDU-2219
> URL: https://issues.apache.org/jira/browse/KUDU-2219
> Project: Kudu
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.6.0
>Reporter: Todd Lipcon
>Priority: Major
> Fix For: n/a
>
> Attachments: 
> org.apache.kudu.client.TestKuduClient.testCloseShortlyAfterOpen.log.xz
>
>
> This test has an assertion that no exceptions get logged, but it seems to 
> fail sometimes with an IllegalStateException in the log:
> {code}
> ERROR - [peer master-127.62.82.1:64034] unexpected exception from downstream 
> on [id: 0xc4472f9d, /127.62.82.1:58372 :> /127.62.82.1:64034]
> java.lang.IllegalStateException
>   at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:429)
>   at 
> org.apache.kudu.client.Connection.messageReceived(Connection.java:264)
>   at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>   at org.apache.kudu.client.Connection.handleUpstream(Connection.java:236)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>   at 
> org.jboss.netty.handler.timeout.ReadTimeoutHandler.messageReceived(ReadTimeoutHandler.java:184)
>   at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>   at 
> org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:68)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>   at 
> org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:291)
>   at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>   at org.apache.kudu.client.Negotiator.finish(Negotiator.java:653)
>   at 
> org.apache.kudu.client.Negotiator.handleSuccessResponse(Negotiator.java:641)
>   at 
> org.apache.kudu.client.Negotiator.handleSaslMessage(Negotiator.java:278)
>   at org.apache.kudu.client.Negotiator.handleResponse(Negotiator.java:258)
>   at 
> org.apache.kudu.client.Negotiator.messageReceived(Negotiator.java:231)
>   at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>   at 
> org.jboss.netty.handler.timeout.ReadTimeoutHandler.messageReceived(ReadTimeoutHandler.java:184)
>   at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>   at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
>   at 
> org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70)
>   at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)

[jira] [Updated] (KUDU-2387) exportAuthenticationCredentials does not retry connectToCluster

2018-12-13 Thread Adar Dembo (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adar Dembo updated KUDU-2387:
-
Priority: Critical  (was: Major)

> exportAuthenticationCredentials does not retry connectToCluster
> ---
>
> Key: KUDU-2387
> URL: https://issues.apache.org/jira/browse/KUDU-2387
> Project: Kudu
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.0
>Reporter: Todd Lipcon
>Priority: Critical
>
> TestSecurity has the following TODO:
> {code}
> // TODO(todd): it seems that exportAuthenticationCredentials() doesn't 
> properly retry
> // in the case that there is no leader, even though 
> NoLeaderFoundException is a RecoverableException.
> // So, we have to use a hack of calling listTabletServers, which _does_ 
> properly retry,
> // in order to wait for the masters to elect a leader.
> {code}
> It seems like this causes occasional failures of tests like KuduRDDTest -- I 
> saw a case where the client failed to connect due to a negotiation timeout, 
> and then didn't retry at all. It's not clear why the 3-second negotiation 
> timeout was insufficient in this test case but likely just machine load or 
> somesuch.





[jira] [Resolved] (KUDU-2639) How do I clear kudu memory or how do I ensure that kudu memory does not exceed the limit

2018-12-13 Thread Adar Dembo (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adar Dembo resolved KUDU-2639.
--
   Resolution: Information Provided
Fix Version/s: n/a

Your screenshots show a tserver whose memory peaked at 3.2G (or more), but then 
the consumption dropped following what I'm guessing were MemRowSet flushes. Is 
your concern that, despite the current server consumption being only ~300M, the 
"total consumption" is still at ~1.7G? Discrepancies like these can arise:
 # tcmalloc will release memory back to the OS slowly. Does the total 
consumption drop if the server is left idle?
 # The server consumption only includes Kudu objects whose memory is explicitly 
tracked. There are many other objects that aren't tracked, and depending on 
your workload, the total consumption of those untracked objects may be higher 
or lower. The biggest offenders are scans; all memory consumption from scans 
will be included in "total consumption" but not in "server consumption".
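A tiny model of the second point, with hypothetical names: the gap between "server consumption" and "total consumption" is just the sum of allocations that never pass through an explicit tracker.

```java
// Toy illustration of the tracked vs. untracked gap: the "server consumption"
// metric only counts allocations routed through a tracker, while the
// allocator's "total consumption" sees every byte, including untracked
// scan-side memory. Hypothetical names; not Kudu's actual accounting code.
class MemoryTrackingDemo {
    static long total = 0;    // what the allocator (e.g. tcmalloc) sees
    static long tracked = 0;  // what shows up as "server consumption"

    static void allocTracked(long mb)   { total += mb; tracked += mb; }
    static void allocUntracked(long mb) { total += mb; }  // e.g. scans

    public static void main(String[] args) {
        allocTracked(300);     // ~300M of explicitly tracked Kudu objects
        allocUntracked(1400);  // ~1.4G of untracked scan-side memory
        System.out.println("tracked=" + tracked + "M total=" + total + "M");
        // A ~1.7G "total consumption" can sit far above a ~300M "server
        // consumption" without anything being wrong.
    }
}
```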

I'm closing this Jira as there's nothing actionable here. Let's continue this 
discussion either in the Kudu user mailing list or Slack channels.

> How do I clear kudu memory or how do I ensure that kudu memory does not 
> exceed the limit
> 
>
> Key: KUDU-2639
> URL: https://issues.apache.org/jira/browse/KUDU-2639
> Project: Kudu
>  Issue Type: Bug
>  Components: master, metrics, server
>Affects Versions: 1.7.0
>Reporter: wangkang
>Priority: Major
> Fix For: n/a
>
> Attachments: 1544690968288.jpg, 1544691002343.jpg
>
>
> When I insert 1.2 gigabytes of data, the server value keeps increasing, 
> reaching a peak of 3.2 gigabytes and the memory utilization reaches 48%. So 
> if I want to insert more, is it possible to cause memory usage limit? How to 
> avoid this situation? Can the memory used by this server be cleared manually?





[jira] [Resolved] (KUDU-2638) kudu cluster restart very long time to reused

2018-12-13 Thread Adar Dembo (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adar Dembo resolved KUDU-2638.
--
   Resolution: Information Provided
Fix Version/s: n/a

I spent some time looking at your log; here are my observations:
 * The bulk of the time appears to be spent loading tablet metadata. How many 
tablets are on this node? What kind of hardware is being used for the tserver's 
metadata directory?
 * The actual time spent compacting this tablet is minimal: ~8 seconds for all 
of the compaction operations to run, vs. ~2m to bootstrap the tablet.
 * This tablet only has one other peer (besides the local replica), which means 
both replicas need to be running before you'll be able to write to the tablet. 
Was the tablet's table created with a replication factor of 2?
 * By removing all but the references to tablet 
5aae5dc9e6f4468aaf00c060152d4fed it's much more difficult to understand what's 
going on. For example, I don't know how many data directories you have (which 
affects compaction speed). I don't know how many tablets you have, or which 
block manager you're using (both of which affect startup time). A full tserver 
log (with e.g. hostnames redacted) would yield better results.

Anyway, I'm not seeing anything actionable here, so I'm going to close this 
JIRA. General queries like these (i.e. "help me understand why my cluster is 
restarting slowly") should be directed towards the Kudu user mailing list or 
Slack channels.

> kudu cluster restart very long time to reused
> -
>
> Key: KUDU-2638
> URL: https://issues.apache.org/jira/browse/KUDU-2638
> Project: Kudu
>  Issue Type: Improvement
>Reporter: jiaqiyang
>Priority: Major
> Fix For: n/a
>
>
> When I restart my kudu cluster, all tablets are unavailable.
> Running kudu cluster ksck shows:
> Table Summary
> Name | Status      | Total Tablets | Healthy | Under-replicated | Unavailable
> -----+-------------+---------------+---------+------------------+------------
> t1   | HEALTHY     | 1             | 1       | 0                | 0
> t2   | UNAVAILABLE | 5             | 0       | 1                | 4
> t3   | UNAVAILABLE | 6             | 2       | 0                | 4
> t3   | UNAVAILABLE | 3             | 0       | 0                | 3





[jira] [Commented] (KUDU-1575) Backup and restore procedures

2018-12-13 Thread Mike Percy (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720484#comment-16720484
 ] 

Mike Percy commented on KUDU-1575:
--

Hey Tim, the latest progress on this is that we have some of the low-level work 
done, but we're still working on finishing up the ability to do diff scans, 
which are the basis for incremental backups. Once we finish that, there is 
quite a bit of work left to implement restore of incremental backups, plus a 
lot of testing to ensure perf / scale / stability are all acceptable. No 
commitment on a timeline, but I am hoping a basic version of backup makes it 
out in the next release or two of Kudu.

> Backup and restore procedures
> -
>
> Key: KUDU-1575
> URL: https://issues.apache.org/jira/browse/KUDU-1575
> Project: Kudu
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Mike Percy
>Assignee: Mike Percy
>Priority: Major
>
> Kudu needs backup and restore procedures, both for data and for metadata.





[jira] [Commented] (KUDU-2597) Add CLI tool to parse metrics from diagnostics log

2018-12-13 Thread Andrew Wong (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720480#comment-16720480
 ] 

Andrew Wong commented on KUDU-2597:
---

Haven't gotten around to completing this, but there's a should-be-functional 
tool that spits out a tsv here: https://github.com/andrwng/kudu/commits/metrics

> Add CLI tool to parse metrics from diagnostics log
> --
>
> Key: KUDU-2597
> URL: https://issues.apache.org/jira/browse/KUDU-2597
> Project: Kudu
>  Issue Type: Sub-task
>  Components: CLI, metrics, ops-tooling
>Reporter: Andrew Wong
>Assignee: Andrew Wong
>Priority: Major
>
> We have a somewhat-crufty 'parse_metrics_log.py' script that isn't 
> particularly great. It'd be nice if metrics parsing were baked into the CLI 
> to provide something more fit for human consumption: a tsv, a summary of 
> different perf metrics, etc.





[jira] [Commented] (KUDU-1563) Add support for INSERT IGNORE

2018-12-13 Thread Brock Noland (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720459#comment-16720459
 ] 

Brock Noland commented on KUDU-1563:


Thanks [~r1pp3rj4ck] for picking this up. While I'd like to contribute it, I am 
more concerned with getting access to the feature. Is this something you have 
bandwidth to work on now?

bq. I agree that operation level is more intuitive and more flexible, though I 
don't really see a use case for that added flexibility. Can you articulate one?

I don't have a use case; it just feels slightly odd to change a session-level 
parameter to define how an operation behaves. I am fine with the suggested 
approach; it'll just take more work to implement.

> Add support for INSERT IGNORE
> -
>
> Key: KUDU-1563
> URL: https://issues.apache.org/jira/browse/KUDU-1563
> Project: Kudu
>  Issue Type: New Feature
>Reporter: Dan Burkert
>Assignee: Attila Bukor
>Priority: Major
>  Labels: newbie
>
> The Java client currently has an [option to ignore duplicate row key errors| 
> https://kudu.apache.org/apidocs/org/kududb/client/AsyncKuduSession.html#setIgnoreAllDuplicateRows-boolean-],
>  which is implemented by filtering the errors on the client side.  If we are 
> going to continue to support this feature (and the consensus seems to be that 
> we probably should), we should promote it to a first class operation type 
> that is handled on the server side.  This would have a modest perf. 
> improvement since less errors are returned, and it would allow INSERT IGNORE 
> ops to be mixed in the same batch as other INSERT, DELETE, UPSERT, etc. ops.
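A sketch of the contrast being discussed, with illustrative names (not the actual client or server code): client-side filtering still receives and discards duplicate-key errors, while a first-class INSERT IGNORE op would never generate them on the server in the first place.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the two designs: today the Java client applies plain INSERTs and
// filters duplicate-key errors on the client side; a first-class INSERT IGNORE
// would let the server drop the duplicate before any error is generated or
// returned. Hypothetical names; not the Kudu client or server implementation.
class InsertIgnoreDemo {
    enum Op { INSERT, INSERT_IGNORE }

    static Set<Integer> table = new HashSet<>();

    // "Server side": returns an error string, or null on success.
    static String apply(Op op, int key) {
        if (!table.add(key)) {                    // duplicate row key
            return op == Op.INSERT_IGNORE ? null  // server swallows it
                                          : "ALREADY_PRESENT: " + key;
        }
        return null;
    }

    // Client-side filtering, analogous to setIgnoreAllDuplicateRows: the
    // errors still travel back from the server, then get dropped here.
    static List<String> insertFiltering(int[] keys) {
        List<String> surfaced = new ArrayList<>();
        for (int k : keys) {
            String err = apply(Op.INSERT, k);
            if (err != null && err.startsWith("ALREADY_PRESENT")) continue;
            if (err != null) surfaced.add(err);
        }
        return surfaced;
    }

    public static void main(String[] args) {
        List<String> errs = insertFiltering(new int[] {1, 2, 2, 3});
        System.out.println("surfaced=" + errs.size() + " rows=" + table.size());

        // With a first-class op, the duplicate never produces an error, so
        // INSERT IGNORE can be batched alongside other INSERT/DELETE/UPSERT
        // ops without any client-side error bookkeeping.
        String err = apply(Op.INSERT_IGNORE, 3);
        System.out.println("insertIgnoreError=" + err);
    }
}
```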





[jira] [Created] (KUDU-2640) Add a KuduSink for Spark Structured Streaming

2018-12-13 Thread Grant Henke (JIRA)
Grant Henke created KUDU-2640:
-

 Summary: Add a KuduSink for Spark Structured Streaming
 Key: KUDU-2640
 URL: https://issues.apache.org/jira/browse/KUDU-2640
 Project: Kudu
  Issue Type: New Feature
Affects Versions: 1.8.0
Reporter: Grant Henke
Assignee: Grant Henke


Today, writing to Kudu from Spark takes some clever usage of the KuduContext. 
This Jira tracks adding a fully configurable KuduSink so that direct usage of 
the KuduContext is not required.





[jira] [Assigned] (KUDU-1563) Add support for INSERT IGNORE

2018-12-13 Thread Attila Bukor (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Bukor reassigned KUDU-1563:
--

Assignee: Attila Bukor  (was: Brock Noland)

> Add support for INSERT IGNORE
> -
>
> Key: KUDU-1563
> URL: https://issues.apache.org/jira/browse/KUDU-1563
> Project: Kudu
>  Issue Type: New Feature
>Reporter: Dan Burkert
>Assignee: Attila Bukor
>Priority: Major
>  Labels: newbie
>
> The Java client currently has an [option to ignore duplicate row key errors| 
> https://kudu.apache.org/apidocs/org/kududb/client/AsyncKuduSession.html#setIgnoreAllDuplicateRows-boolean-],
>  which is implemented by filtering the errors on the client side.  If we are 
> going to continue to support this feature (and the consensus seems to be that 
> we probably should), we should promote it to a first class operation type 
> that is handled on the server side.  This would have a modest perf. 
> improvement since less errors are returned, and it would allow INSERT IGNORE 
> ops to be mixed in the same batch as other INSERT, DELETE, UPSERT, etc. ops.





[jira] [Updated] (KUDU-2639) How do I clear kudu memory or how do I ensure that kudu memory does not exceed the limit

2018-12-13 Thread wangkang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangkang updated KUDU-2639:
---
Summary: How do I clear kudu memory or how do I ensure that kudu memory 
does not exceed the limit  (was, in Chinese: How do I clear kudu memory, or how do I ensure that kudu memory does not exceed the limit?)

> How do I clear kudu memory or how do I ensure that kudu memory does not 
> exceed the limit
> 
>
> Key: KUDU-2639
> URL: https://issues.apache.org/jira/browse/KUDU-2639
> Project: Kudu
>  Issue Type: Bug
>  Components: master, metrics, server
>Affects Versions: 1.7.0
>Reporter: wangkang
>Priority: Major
> Attachments: 1544690968288.jpg, 1544691002343.jpg
>
>
> When I insert 1.2 gigabytes of data, the server value keeps increasing, 
> reaching a peak of 3.2 gigabytes and the memory utilization reaches 48%. So 
> if I want to insert more, is it possible to cause memory usage limit? How to 
> avoid this situation? Can the memory used by this server be cleared manually?





[jira] [Updated] (KUDU-2639) How do I clear kudu memory or how do I ensure that kudu memory does not exceed the limit?

2018-12-13 Thread wangkang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KUDU-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangkang updated KUDU-2639:
---
Description: When I insert 1.2 gigabytes of data, the server value keeps 
increasing, reaching a peak of 3.2 gigabytes and the memory utilization reaches 
48%. So if I want to insert more, is it possible to cause memory usage limit? 
How to avoid this situation? Can the memory used by this server be cleared 
manually?  (was, in Chinese: When I insert 1.2G of data, the server value keeps 
increasing, peaking at 3.2G with memory utilization reaching 48%. If I insert 
more, could that hit the memory usage limit? How can this be avoided, and can 
the memory used by this server be cleared manually?)

> How do I clear kudu memory or how do I ensure that kudu memory does not 
> exceed the limit?
> --
>
> Key: KUDU-2639
> URL: https://issues.apache.org/jira/browse/KUDU-2639
> Project: Kudu
>  Issue Type: Bug
>  Components: master, metrics, server
>Affects Versions: 1.7.0
>Reporter: wangkang
>Priority: Major
> Attachments: 1544690968288.jpg, 1544691002343.jpg
>
>
> When I insert 1.2 gigabytes of data, the server value keeps increasing, 
> reaching a peak of 3.2 gigabytes and the memory utilization reaches 48%. So 
> if I want to insert more, is it possible to cause memory usage limit? How to 
> avoid this situation? Can the memory used by this server be cleared manually?





[jira] [Created] (KUDU-2639) How do I clear kudu memory or how do I ensure that kudu memory does not exceed the limit?

2018-12-13 Thread wangkang (JIRA)
wangkang created KUDU-2639:
--

 Summary: How do I clear kudu memory or how do I ensure that kudu memory does not exceed the limit?
 Key: KUDU-2639
 URL: https://issues.apache.org/jira/browse/KUDU-2639
 Project: Kudu
  Issue Type: Bug
  Components: master, metrics, server
Affects Versions: 1.7.0
Reporter: wangkang
 Attachments: 1544690968288.jpg, 1544691002343.jpg

When I insert 1.2G of data, the server value keeps increasing, peaking at 3.2G 
with memory utilization reaching 48%. If I insert more, could that hit the 
memory usage limit? How can this be avoided, and can the memory used by this 
server be cleared manually?


