Re: [ANNOUNCE] Welcoming Márton Greber as Kudu committer and PMC member

2023-11-14 Thread Alexey Serbin
Congratulations, Márton!

On Tue, Nov 14, 2023 at 9:37 AM Andrew Wong  wrote:

> Hi Kudu community,
>
> I'm happy to announce that the Kudu PMC has voted to add Márton Greber as a
> new committer and PMC member.
>
> Some of Márton's contributions include:
> - Getting Kudu to build and run on Apple silicon
> - Improving feature parity of the Python client with a number of features
> - Various bug fixes around the codebase
>
> Please join me in congratulating Márton!
>


Re: Does Kudu connecor has Kerberos auth support?

2023-09-15 Thread Alexey Serbin
Hi Melih,

Yes, Kudu clients do support Kerberos authentication.  I'm not sure what
exactly you referred to
as "Kudu connector", but both C++ and Java clients can authenticate to a
secure Kudu cluster
using Kerberos.

https://kudu.apache.org/docs/security.html#_client_authentication_to_secure_kudu_clusters

I hope this helps.


Kind regards,

Alexey

On Fri, Sep 15, 2023 at 5:49 AM Melih Taşdizen  wrote:

> Hi folks,
>
> I spent some time browsing docs but couldn't find any info on Kerberos
> support for the Kudu connector.
>
> I would appreciate it if someone could point out some docs or send some
> example configurations.
>
> Bests,
>
> Melih
>


Re: [ANNOUNCE] Welcoming Abhishek Chennaka as Kudu committer and PMC member

2023-02-27 Thread Alexey Serbin
Congrats, Abhishek!

I'm happy to know you've accepted the invitation, and I look forward
to your continued contributions to the project.


Kind regards,

Alexey

On Mon, Feb 27, 2023 at 9:18 AM Abhishek Chennaka 
wrote:

> Thank you all a ton for your appreciation. I'll try to keep contributing
> more and more.
>
> Regards,
> Abhishek
>
> On Mon, Feb 27, 2023 at 5:27 AM Zoltan Chovan 
> wrote:
>
>> Congrats Abhishek!
>>
>> On Mon, Feb 27, 2023 at 1:57 PM Attila Bukor  wrote:
>>
>> > Congrats Abhishek, well deserved!
>> >
>>
>


Re: Kudu cluster sizing questions

2021-09-23 Thread Alexey Serbin
Hi Chetan,

Thank you for taking a look at Kudu!  Apache Kudu is designed to perform
well in OLAP workloads.

You can scale a Kudu cluster horizontally pretty well, at least up to a few
hundred nodes. Here you can find more information on recommended
data sizes per node, scaling limitations, and more:
https://kudu.apache.org/docs/known_issues.html#_scale

In the past, scans could become slower if data ingestion followed
the 'trickling inserts' pattern (see
https://issues.apache.org/jira/browse/KUDU-1400), but that has been
addressed, and versions 1.10 and newer don't have the issue.

There isn't a limit on how many large tables you can host in a Kudu
cluster, assuming you partition those large tables appropriately (see
https://kudu.apache.org/docs/schema_design.html#schema_design) and scale
the cluster as needed, especially if some of those tables contain 'cold'
data.

Random reads and updates are supported regardless of the scale of a Kudu
cluster.  Even more: starting with Kudu 1.15, there is experimental support
for multi-row transactions.  At this point it supports only
INSERT/INSERT_IGNORE operations, and it targets the 'bulk ingest' use case
rather than OLTP patterns with many small transactions.

The important points for running many parallel workloads against a single
Kudu cluster are: (a) choose the table schema properly; (b) partition tables
accordingly; (c) use multiple data directories backed by separate HDDs/SSDs
per node; (d) use SSD or NVMe devices for the WAL; (e) allocate enough memory
for the block cache.  I'd recommend building a POC to get some real numbers,
because workloads vary and it's hard to provide exact numbers without
knowing more of the details.
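To make the numbers in the question concrete, here's a back-of-the-envelope sizing sketch. The bytes-per-row and replication values are assumptions for illustration (roughly 100 bytes per row before Kudu's columnar encoding and compression, and the default 3x replication); neither figure comes from the thread.

```java
// Rough sizing sketch for the workload quoted below: ~2 billion unique
// rows/day with a 30-day TTL. The bytes-per-row and replication values
// are assumptions for illustration, not measurements.
public class KuduSizingEstimate {
    public static void main(String[] args) {
        long rowsPerDay = 2_000_000_000L;  // ~2B unique records/day (from the question)
        int ttlDays = 30;                  // records retained 30 days (from the question)
        long bytesPerRow = 100;            // assumed average row size, pre-encoding
        int replication = 3;               // Kudu's default replication factor

        long logicalBytes = rowsPerDay * ttlDays * bytesPerRow;
        long rawBytes = logicalBytes * replication;
        System.out.println("logical TB: " + logicalBytes / 1_000_000_000_000L);
        System.out.println("raw TB (3x): " + rawBytes / 1_000_000_000_000L);
    }
}
```

Encoding and compression usually shrink this considerably, which is exactly why a POC with real data is the way to get trustworthy numbers.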

As for related articles/blogs about using Kudu, I can recommend taking a
look at the following relatively recent posts:
  https://boristyukin.com/building-near-real-time-big-data-lake-part-i/
  https://boristyukin.com/building-near-real-time-big-data-lake-part-2/

Probably, other people could chime in to provide more insights based on
their own experience running Kudu with their workloads.


Kind regards,

Alexey

On Tue, Sep 21, 2021 at 11:29 PM Chetan Rautela 
wrote:

> Hi team ,
>
> I am looking for some storage solution that can give high ingestion/update
> rates and able to run OLAP queries, Apache Kudu looks one promising
> solution,
> Please help me to check if Apache Kudu is correct fit
>
> Use Case:
>  .
> I am receiving 40K records per sec. record size is less, 5 fields
> max. 2 string 2 timestamp 1 number.
> With primary key I will be getting ~ 2 billion unique records per
> day and rest will be updates.
> With Apache Spark aggregation we can reduce 20% of updates.
> TTL of each record will be 30 days.
>
> How much data can we store in kudu per node ?
> With large updates , will get/scan request become slow over time ?
> How much large tables can we create in Kudu ?
> Will random read and update be supported at this scale ?
> How many parallel ingestion jobs can we in a Kudu, for different tables ?
>
>
> Please suggest some articles related to kudu sizing and performance.
>
> Regards,
> Chetan Rautela


Re: Failure to find org.apache.kudu:kudu-binary:jar:linux-aarch_64:1.9.0 in Maven Central

2021-05-11 Thread Alexey Serbin
Hi,

Indeed, with the work performed in the context of
https://issues.apache.org/jira/browse/KUDU-3007, it became possible to
build and run Kudu on ARM/aarch64 in the 1.13 release.  However, it seems
the ARM/aarch64 build is now broken in 1.14 and on the main trunk [1].

I don't think upgrading the kudu-binary artifact in Gora would fix the
issue, since that artifact hasn't been published for the ARM/aarch64
architecture.  Per Kudu's release procedure [2], kudu-binary is built
and published only for the x86_64 architecture on Linux and macOS; for
ARM/aarch64 the artifact isn't built or published as of now.

I think the major obstacle here is that ARM/aarch64 boxes are not
accessible for the majority of developers in the community. At least, I
work on Kudu using hardware and cloud resources provided and managed by
Cloudera.  These are all x86_64 machines and I don't have access to an
ARM/aarch64 box as of now.

BTW, there are other requests to publish kudu-binary for ARM/aarch64 [3].
I guess it would be great to have the ARM/aarch64 build fixed for the
upcoming 1.15 release and to have the kudu-binary artifact published for
ARM/aarch64 Linux correspondingly.  However, I don't know what the
steps are to make this happen.  Any help in this context is appreciated!


Kind regards,

Alexey

[1] https://issues.apache.org/jira/browse/KUDU-3263
[2] https://github.com/apache/kudu/blob/master/RELEASING.adoc
[3] https://issues.apache.org/jira/browse/KUDU-3264

On Tue, May 11, 2021 at 6:16 PM Lewis John McGibbney 
wrote:

> Hi user@,
> A Gora community member recently opened
> https://issues.apache.org/jira/browse/GORA-672 which indicates that Gora
> master branch cannot be built on ARM/aarch64 platform.
> I did come across https://issues.apache.org/jira/browse/KUDU-3007 and see
> that this may have been implemented for some time, since the kudu 1.13.0 release.
> Would upgrading the kudu-binary artifact in Gora fix this issue?
> Thank you for any insight.
> lewismc
>
>


Re: Cache error running KuduTestHarness

2020-10-12 Thread Alexey Serbin
Hi Steve,

Thank you for the update.  Yes, --block_cache_capacity_mb should be in
MiBytes (and yes, I messed things up with --block_cache_type, of course --
it seems I copy-pasted the wrong flag name in there).

That's a good call: I don't know of any valid reason not to include the
javadocs, at least for KuduTestHarness, in the API docs published on the
kudu.apache.org site.  I think adding them there would be a great
improvement.  Are you interested in making a contribution and posting a
patch for that?  You can find the guidelines for contributions at
https://kudu.apache.org/docs/contributing.html

Patches are always welcome! :)


Thanks,

Alexey

On Wed, Oct 7, 2020 at 10:19 AM  wrote:

> Thanks Alexey,
>
>
>
> I solved this in the end by reading the source code. My mistake was to
> assume the test harness managed the builder as a singleton and the get
> method was an accessor. Instead it creates a fresh one every time and the
> get method is factory. You have to inject your customised builder when you
> call the harness rule. I also corrected the fact I was providing a value in
> bytes instead of MB to the flag. This is the solution.
>
>
>
> public static MiniKuduClusterBuilder builder;
>
>
>
> @BeforeClass
>
> public static void classInit(){
>
> builder=KuduTestHarness.getBaseClusterBuilder()
>
> .addMasterServerFlag("--block_cache_capacity_mb=475")
>
> .addTabletServerFlag("--block_cache_capacity_mb=475");
>
> }
>
>
>
> @Rule
>
> public KuduTestHarness harness=new KuduTestHarness(builder);
>
>
>
> Is there a reason why the javadocs for the test cluster classes are not
> available on the main kudu.apache.org site?
>
>
>
> *Cheers*
>
> *Steve*
>
>
>
> *From:* Alexey Serbin [mailto:aser...@cloudera.com]
> *Sent:* 07 October 2020 18:01
> *To:* user@kudu.apache.org
> *Subject:* Re: Cache error running KuduTestHarness
>
>
>
> Hi,
>
>
>
> I haven't looked at the issue with the builder ignoring the settings you
> added, but as a working example of adding custom flags to Kudu master and
> tablet servers you can take a look at:
> https://github.com/apache/kudu/blob/90aa4fa7d1527f376803440a4642668e3d798748/java/kudu-client/src/test/java/org/apache/kudu/client/TestNegotiation.java#L48-L52
> and
> https://github.com/apache/kudu/blob/90aa4fa7d1527f376803440a4642668e3d798748/java/kudu-client/src/test/java/org/apache/kudu/client/TestTimeouts.java#L94
> (I guess in this context a custom MiniKuduClusterBuilder isn't necessary).
>
>
>
> It seems in case of running tests at machines with less memory it's worth
> setting --block_cache_type to something low for both master and tablet
> servers (like 128MB: --block_cache_type=134217728) because master and
> tserver will eat up some memory once started.  An alternative option is to
> add --force_block_cache_capacity flag to both master's and tserver's flags.
>
>
>
>
>
> HTH,
>
>
>
> Alexey
>
>
>
> On Wed, Oct 7, 2020 at 4:14 AM  wrote:
>
> I am trying to follow the guide to using the KuduTestHarness in the
> Getting Started guide. I have created the following simple test case.
>
> ===
> import org.apache.kudu.test.KuduTestHarness;
>
> import static org.junit.Assert.assertTrue;
> import org.junit.Rule;
> import org.junit.Test;
>
> public class DemoTest{
> @Rule
> public KuduTestHarness harness=new KuduTestHarness();
>
> @Test
> public void testDemo(){
> assertTrue(true);
> }
> }
> ===
>
> But I get the following errors in the console log.
>
> ===
> 2020-10-07 11:50:01,060 [cluster stderr printer] INFO
> org.apache.kudu.test.cluster.MiniKuduCluster - E1007 11:50:01.059237 17257
> block_cache.cc:99] Block cache capacity exceeds the memory pressure
> threshold (53687

Re: Cache error running KuduTestHarness

2020-10-07 Thread Alexey Serbin
Hi,

I haven't looked at the issue with the builder ignoring the settings you
added, but as a working example of adding custom flags to Kudu master and
tablet servers you can take a look at:
https://github.com/apache/kudu/blob/90aa4fa7d1527f376803440a4642668e3d798748/java/kudu-client/src/test/java/org/apache/kudu/client/TestNegotiation.java#L48-L52
and
https://github.com/apache/kudu/blob/90aa4fa7d1527f376803440a4642668e3d798748/java/kudu-client/src/test/java/org/apache/kudu/client/TestTimeouts.java#L94
(I guess in this context a custom MiniKuduClusterBuilder isn't necessary).

It seems that when running tests on machines with less memory, it's worth
setting --block_cache_type to something low for both masters and tablet
servers (like 128MB: --block_cache_type=134217728), because the master and
tserver will eat up some memory once started.  An alternative option is to
add the --force_block_cache_capacity flag to both the master's and tserver's flags.
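For reference, the startup error quoted below compares the default 512 MiB block cache against the computed memory-pressure threshold (536870912 vs. 498776800 bytes). A self-contained arithmetic sketch shows how to turn the logged threshold into the largest --block_cache_capacity_mb value that passes the check (this matches the 475 used elsewhere in this thread):

```java
// Converts the memory-pressure threshold from the quoted log into the
// largest --block_cache_capacity_mb value that passes the startup check.
public class BlockCacheMath {
    public static void main(String[] args) {
        long defaultCacheBytes = 512L * 1024 * 1024; // default cache: 536870912 bytes
        long thresholdBytes = 498776800L;            // from the quoted error message
        long maxCacheMib = thresholdBytes / (1024 * 1024); // integer floor
        System.out.println("default cache bytes: " + defaultCacheBytes);
        System.out.println("max safe capacity mb: " + maxCacheMib);
    }
}
```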


HTH,

Alexey

On Wed, Oct 7, 2020 at 4:14 AM  wrote:

> I am trying to follow the guide to using the KuduTestHarness in the
> Getting Started guide. I have created the following simple test case.
>
> ===
> import org.apache.kudu.test.KuduTestHarness;
>
> import static org.junit.Assert.assertTrue;
> import org.junit.Rule;
> import org.junit.Test;
>
> public class DemoTest{
> @Rule
> public KuduTestHarness harness=new KuduTestHarness();
>
> @Test
> public void testDemo(){
> assertTrue(true);
> }
> }
> ===
>
> But I get the following errors in the console log.
>
> ===
> 2020-10-07 11:50:01,060 [cluster stderr printer] INFO
> org.apache.kudu.test.cluster.MiniKuduCluster - E1007 11:50:01.059237 17257
> block_cache.cc:99] Block cache capacity exceeds the memory pressure
> threshold (536870912 bytes vs. 498776800 bytes). This will cause
> instability and harmful flushing behavior. Lower --block_cache_capacity_mb
> or raise --memory_limit_hard_bytes.
>
> 2020-10-07 11:50:01,060 [cluster stderr printer] INFO
> org.apache.kudu.test.cluster.MiniKuduCluster - E1007 11:50:01.059262 17257
> flags.cc:441] Detected inconsistency in command-line flags; exiting
>
> 2020-10-07 11:50:01,100 [main] DEBUG
> org.apache.kudu.test.cluster.MiniKuduCluster - Response: error {
>   code: RUNTIME_ERROR
>   message: "failed to start masters: Unable to start Master at index 0:
> /tmp/kudu-binary-jar1893943400146501302/kudu-binary-1.13.0-linux-x86_64/bin/kudu-master:
> process exited with non-zero status 1"
> }
> ===
>
> I tried adding a flag to the base builder, but it does not have any
> affect. The new flag does not show up in the list of flags in the logs.
>
> ===
> import org.apache.kudu.test.cluster.MiniKuduCluster.MiniKuduClusterBuilder;
>
> ...
>
> static{
> MiniKuduClusterBuilder
> builder=KuduTestHarness.getBaseClusterBuilder();
> builder.addMasterServerFlag("--block_cache_capacity_mb=498776800");
> }
> ...
> ===
>
> Can someone point me in the right direction for solving this problem.
>
> Thanks
> Steve Hindmarch
>


Re: Kudu - Azure Integration Script

2020-09-03 Thread Alexey Serbin
Hi,

I'm not aware of such a thing as "Azure integration for Kudu" at this point
(what would it entail, BTW?).  Maybe somebody else can chime in if they
have some sort of Azure-specific content that they find useful.

But as for scripts to automatically start Kudu servers after rebooting
a node, that can be done via the standard rc.d/systemd subsystem.  The
upstream Kudu repository doesn't contain such scripts because the Apache
Kudu project focuses on source-only releases and doesn't distribute
packages specific to particular Linux OS versions and distributions.

The good news is that you can find startup scripts elsewhere, from other
vendors.  Examples of rc.d scripts for older Kudu releases on RHEL/CentOS
7 distros can be found in the RPM package at
https://archive.cloudera.com/cdh6/6.3.2/redhat7/yum/RPMS/x86_64/kudu-tserver-1.10.0+cdh6.3.2-1605554.el7.x86_64.rpm.
You can also find systemd scripts for RPM-based Linux distros at
https://github.com/MartinWeindel/kudu-rpm.  The idea is that you can build
your own RPM/Debian packages out of binaries built from the released
source code and install them via the packaging system of your particular
Linux distribution.  In that case, there are no VM specifics to worry
about, and it will work on any VM or hardware where your Linux OS is running.
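As a concrete illustration of the rc.d/systemd approach, here is a minimal hypothetical systemd unit; the binary path, flagfile location, and service user are assumptions for a from-source install, not an official Kudu artifact:

```ini
# /etc/systemd/system/kudu-tserver.service -- hypothetical example;
# adjust paths and user to your own build/install layout.
[Unit]
Description=Apache Kudu Tablet Server
After=network.target

[Service]
User=kudu
ExecStart=/opt/kudu/sbin/kudu-tserver --flagfile=/etc/kudu/tserver.gflagfile
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After `systemctl enable kudu-tserver`, the server starts on every reboot, which addresses the original question without anything Azure-specific.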

However, if you have some Azure-specific automation scripts, please
remember that contributions are always welcome!  :)  The guidelines for
posting patches can be found at
https://kudu.apache.org/docs/contributing.html

Thank you!


Kind regards,

Alexey

On Thu, Sep 3, 2020 at 8:40 AM Shubham Jain (IN) 
wrote:

> Hi Team,
>
> I was working on Kudu for a long time by installing kudu on a Azure VM
> cluster as a single node.
>
> However, on the official link  https://kudu.apache.org/ , I was unable to
> find the Azure integration.
>
> I would request you to if possible could you please share the JSON script
> which I can add in the VM script so that Kudu server is automatically up
> and running while rebooting the Azure VM.
>
> Please confirm if you can provide any such script, or please let me know
> if you need any further information from my side.
>
> Regards,
> Shubham Jain
>


Re: setFaultTolerant ordering guarantees

2020-07-25 Thread Alexey Serbin
I meant that, in addition to the API exposed via KuduScanner::OrderMode (
https://kudu.apache.org/cpp-client-api/classkudu_1_1client_1_1KuduScanner.html#a3d6c79325c9da9741d0accf1b43bf7f9
for the Kudu C++ client API, and the corresponding Java counterpart), I don't
think there are other documented ways in the API to have the returned tablet
rows ordered.


Thanks,

Alexey

On Sat, Jul 25, 2020 at 2:53 PM Alexey Serbin  wrote:

> Hi Petar,
>
> Yes, you are right: fault-tolerant scans sort their results in primary key
> order (note: within a tablet only; this sort is not global).  I'm not sure
> there are other explicit guarantees exposed in the API in that regard.
>
>
> Kind regards,
>
> Alexey
>
> On Thu, Jul 23, 2020 at 4:07 PM Petar Nikolov 
> wrote:
>
>> Hi all,
>>
>> I have a use case where I'd like to scan a single tablet and get back
>> data sorted by primary key.
>> My understanding is that Kudu scans read DiskRowSets, which are ordered,
>> but if PK range spans more than 1 (not yet compacted) rowset, ordering
>> would not be guaranteed.
>>
>> I saw a "setFaultTolerant" option in Kudu Java API. This adds the benefit
>> of having a resumable scan and (implicitly?) sorts by PK?
>> It is not explicitly mentioned in the Javadoc, so I'd like to know if I
>> can rely on this API for ordering guarantees or is there another more
>> suitable API?
>>
>> Best,
>> Petar
>>
>


Re: setFaultTolerant ordering guarantees

2020-07-25 Thread Alexey Serbin
Hi Petar,

Yes, you are right: fault-tolerant scans sort their results in primary key
order (note: within a tablet only; this sort is not global).  I'm not sure
there are other explicit guarantees exposed in the API in that regard.
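Petar's intuition about rowsets can be illustrated with a toy model (plain Java, not Kudu code): each DiskRowSet is sorted internally, but concatenating overlapping rowsets does not yield primary-key order, while an ordered (fault-tolerant) scan effectively merge-sorts them within the tablet.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Toy model: two individually-sorted "rowsets" whose key ranges overlap.
// Concatenation is not globally sorted; a merge (as an ordered scan
// performs within a tablet) restores key order.
public class RowsetMergeDemo {
    public static void main(String[] args) {
        int[] rowset1 = {1, 4, 9};   // sorted on its own
        int[] rowset2 = {2, 3, 10};  // a later, not-yet-compacted rowset

        int[] concat = {1, 4, 9, 2, 3, 10}; // naive scan order
        System.out.println("concatenated: " + Arrays.toString(concat));

        PriorityQueue<Integer> pq = new PriorityQueue<>();
        for (int v : rowset1) pq.add(v);
        for (int v : rowset2) pq.add(v);
        StringBuilder merged = new StringBuilder();
        while (!pq.isEmpty()) merged.append(pq.poll()).append(' ');
        System.out.println("merged: " + merged.toString().trim());
    }
}
```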


Kind regards,

Alexey

On Thu, Jul 23, 2020 at 4:07 PM Petar Nikolov 
wrote:

> Hi all,
>
> I have a use case where I'd like to scan a single tablet and get back data
> sorted by primary key.
> My understanding is that Kudu scans read DiskRowSets, which are ordered,
> but if PK range spans more than 1 (not yet compacted) rowset, ordering
> would not be guaranteed.
>
> I saw a "setFaultTolerant" option in Kudu Java API. This adds the benefit
> of having a resumable scan and (implicitly?) sorts by PK?
> It is not explicitly mentioned in the Javadoc, so I'd like to know if I
> can rely on this API for ordering guarantees or is there another more
> suitable API?
>
> Best,
> Petar
>


Re: Why is it slow to write Kudu with 100+ threads?

2020-06-09 Thread Alexey Serbin
Hi,

Thank you for the stats.

I guess one crucial point is using the proper flush mode for Kudu sessions.
Make sure it's AUTO_FLUSH_BACKGROUND, not AUTO_FLUSH_SYNC.
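The impact of the flush mode can be sketched with rough arithmetic (the round-trip time and batch size below are assumptions, not measurements): AUTO_FLUSH_SYNC pays one client-to-tserver round trip per row, while AUTO_FLUSH_BACKGROUND amortizes each round trip over a batch.

```java
// Back-of-the-envelope comparison of per-row vs. batched flushing.
public class FlushModeMath {
    public static void main(String[] args) {
        long rows = 1_000_000;   // rows to insert
        double rttMs = 0.5;      // assumed round-trip time per RPC
        int batchSize = 1000;    // assumed rows per background-flushed batch

        double syncSeconds = rows * rttMs / 1000.0;                  // one RPC per row
        double batchedSeconds = (rows / batchSize) * rttMs / 1000.0; // one RPC per batch
        System.out.println("sync seconds: " + syncSeconds);
        System.out.println("batched seconds: " + batchedSeconds);
    }
}
```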

Another important point is the number of RPC workers: by default it's 20,
but given that your server has 28 cores (I guess twice that if counting
hyperthreading, right?), you could try increasing
--rpc_num_service_threads up to 30 or even 40.  More RPC workers can
clear the RPC queue faster if there are enough hardware resources
available.  I guess with too many writer threads some requests in the RPC
queue eventually time out and there might be RPC queue overflows; the Kudu
client automatically retries failed write requests, but as a result the
overall write performance degrades compared with the case of no retries.

In addition, make sure the cluster is well balanced, so there is much less
chance of hot-spotting.  I'd run 'kudu cluster rebalance' prior to running
the benchmarks.

Also, I'd take a look at the I/O statistics reported by iostat (e.g., run
`iostat -dx 1` for some time and look at the write/read I/O stats and,
most importantly, the I/O bandwidth utilization).  If you see iostat
reporting bandwidth utilization close to 100%, consider adding a
separate SSD drive for the tablet write-ahead logs (WAL).

A very good starting point for troubleshooting is looking into the tablet
servers' logs and /metrics pages -- warning messages might give you some
insight into what's going wrong, if anything.


Best regards,

Alexey

On Tue, Jun 9, 2020 at 8:47 AM Ray Liu (rayliu)  wrote:

> We also observed when we have less threads for writing, the speed is not
> that bad(about a few thousand records/ second).
>
> This sentence may not be correct.
>
>
>
> We just ran another test to reduce the thread number from 10 to 2.
>
> The speed is much slower.
>
>
>
> We’re using Flink Kudu sink to write to Kudu cluster.
>
> https://github.com/apache/bahir-flink/blob/8723e6b01dd5568f318204abdf7f7a07b32ff70d/flink-connector-kudu/src/main/java/org/apache/flink/connectors/kudu/connector/writer/KuduWriter.java#L89
>
> Basically, it calls KuduSession.apply with each element needs to be
> inserted.
>
>
>
>
>
>
>
> *From: *"Ray Liu (rayliu)" 
> *Reply-To: *"user@kudu.apache.org" 
> *Date: *Tuesday, June 9, 2020 at 23:31
> *To: *"user@kudu.apache.org" 
> *Subject: *Why is it slow to write Kudu with 100+ threads?
>
>
>
> We have a Kudu cluster with 5 tablet servers each has 28 CPU cores, 160GB
> RAM and 2TB SSD.
>
>
>
> The RPC queue length we set is 500.
>
>
>
> We now write to 10 tables at the same time.
>
>
>
> We’re using 10 threads each to write(simply insert) to 8 out of these 10
> tables.
>
>
>
> We have 5 task (each task with 10 threads)  to upsert corresponding fields
> for the rest two tables.
>
>
>
> For example, for one of these two tables we have 5 fields(a,b,c,d,e) with
> `key` fields as primary key .
>
>
>
> 1 task(10 thread) is  running upsert (key, a)
>
> 1 task(10 thread) is  running upsert (key, b)
>
> 1 task(10 thread) is  running upsert (key, c)
>
> 1 task(10 thread) is  running upsert (key, d)
>
> 1 task(10 thread) is  running upsert (key, e)
>
>
>
> Now we observed that writes are very slow(less than 1000 thousand
> records/second).
>
> We also observed when we have less threads for writing, the speed is not
> that bad(about a few thousand records/ second).
>
>
>
> Here’s the CPU utilization report for Kudu threads.
>
>
>
> Threads: 724 total,  15 running, 709 sleeping,   0 stopped,   0 zombie
>
> %Cpu(s): 18.5 us,  8.3 sy,  0.0 ni, 67.6 id,  4.9 wa,  0.0 hi,  0.7 si,
> 0.0 st
>
> KiB Mem : 1648+total,  1776956 free, 12737900 used, 15037401+buff/cache
>
> KiB Swap:  3145724 total,  3004924 free,   140800 used. 15048467+avail Mem
>
>
>
>   PID USER  PR  NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND
>
> 14000 kudu  20   0   70.0g  12.7g   1.9g R 72.8  8.1  26083:49
> MaintenanceMgr
>
> 13992 kudu  20   0   70.0g  12.7g   1.9g R 53.2  8.1   4067:40 rpc
> reactor-139
>
> 13995 kudu  20   0   70.0g  12.7g   1.9g R 42.9  8.1   3996:32 rpc
> reactor-139
>
> 13993 kudu  20   0   70.0g  12.7g   1.9g R 39.9  8.1   3167:19 rpc
> reactor-139
>
> 14231 kudu  20   0   70.0g  12.7g   1.9g S 11.3  8.1 142:14.00 rpc
> worker-1423
>
> 14242 kudu  20   0   70.0g  12.7g   1.9g S 11.3  8.1 107:12.97 rpc
> worker-1424
>
> 14226 kudu  20   0   70.0g  12.7g   1.9g S 10.6  8.1 109:21.71 rpc
> worker-1422
>
> 14274 kudu  20   0   70.0g  12.7g   1.9g S 10.6  8.1  95:54.12 rpc
> worker-1427
>
> 14216 kudu  20   0   70.0g  12.7g   1.9g S 10.0  8.1 136:26.72 rpc
> worker-1421
>
> 14221 kudu  20   0   70.0g  12.7g   1.9g S 10.0  8.1 129:04.78 rpc
> worker-1422
>
> 14253 kudu  20   0   70.0g  12.7g   1.9g S  9.3  8.1 104:26.75 rpc
> worker-1425
>
> 14250 kudu  20   0   70.0g  12.7g   1.9g S  8.6  8.1 145:44.18 rpc
> worker-1425
>
> 14224 kudu  20   0   70.0g  12.7g   1.9g S  7.3  8.1 

Re: Why does partition keys have to be in the primary keys?

2020-05-06 Thread Alexey Serbin
Hi,

The restriction that the partition key be composed of primary key
columns significantly simplifies the design and implementation.

However, I'm not sure I understand why the rules of partitioning come
into play here.  To me it looks like the main question is about the schema
for the table, i.e. what the primary key should be.  If different pipelines
use different values for the 'day' field but one resulting record is
expected, does that imply the pipelines need to update already existing
records?  If so, then maybe use UPSERT instead of INSERT in those pipelines?

I would start with trying to understand what the primary key for the
table should be to satisfy the requirements.  Once that's clear, I'd think
about the partitioning rules for the table.
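The UPSERT suggestion can be modeled in-memory (plain Java, not the Kudu API): two pipelines writing disjoint column sets under the same key merge into one complete record, which is what the thread is after, provided 'day' is not part of that key (if it is, differing 'day' values necessarily create distinct rows).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model of upsert semantics: later writes merge columns into an
// existing row with the same key instead of creating a second row.
public class UpsertDemo {
    static void upsert(Map<String, Map<String, String>> table,
                       String key, Map<String, String> cols) {
        table.computeIfAbsent(key, k -> new HashMap<>()).putAll(cols);
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> table = new HashMap<>();
        upsert(table, "key1", Map.of("a", "1", "b", "2"));  // pipeline 1's columns
        upsert(table, "key1", Map.of("c", "3", "d", "4"));  // pipeline 2's columns

        System.out.println("rows: " + table.size());
        System.out.println("row: " + new TreeMap<>(table.get("key1")));
    }
}
```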


Thanks,

Alexey

On Wed, May 6, 2020 at 4:54 AM Ray Liu (rayliu)  wrote:

> We have two pipelines writing to the same table, and that table is ranged
> partitioned by “day” field.
>
>
> Each pipeline fills some of the fields in the table with the same key.
>
>
>
> But the “day” field in these two pipelines may be different.
>
>
>
> Because range partition keys must exist in primary keys, so there will be
> two records in the result table.
>
>
>
> What we want is one complete record.
>
>
>
> So my question is why does partition keys have to be in the primary keys?
>
>
>
> Is there any workaround for this?
>


Re: Implications/downside of increasing rpc_service_queue_length

2020-05-01 Thread Alexey Serbin
I guess the point about low-latency requests was that long RPC queues
might add extra latency to request handling, and that latency might be
unpredictably long.  E.g., if the queue is almost full and a new RPC
request is added, the request will be dispatched to one of the available
service threads only after the already-enqueued ones, and the
number of service threads in the service thread pool is limited.
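The tail-latency point can be quantified with a rough model (the per-request service time is an assumption, not a measurement): a request arriving at the back of a deep queue waits for everything ahead of it to drain through the fixed-size worker pool.

```java
// Rough queueing estimate: added wait = queued requests / workers * service time.
public class QueueLatencyMath {
    public static void main(String[] args) {
        int queuedAhead = 250;   // requests already in the queue (hypothetical)
        int workers = 20;        // service threads draining the queue (assumed default)
        double serviceMs = 5.0;  // assumed per-request handling time

        double extraWaitMs = queuedAhead / (double) workers * serviceMs;
        System.out.println("added queue wait ms: " + extraWaitMs);
    }
}
```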


Thanks,

Alexey

On Thu, Apr 30, 2020 at 11:17 AM Mauricio Aristizabal 
wrote:

> Thanks Todd. Better late than never indeed, appreciate it very much.
>
> Yes, precisely, we are dealing with very spikey ingest.
>
> Immediate issue has been addressed though: we extended the spark
> KuduContext so we could build our own AsyncKuduClient and
> increase defaultOperationTimeoutMs from default 30s to 120s and that has
> eliminated the client timeouts.
>
> One followup question: not sure I understand your comment re/ low-latency
> requests - if data was ingested it is already in MemStore and therefore
> available to clients, so whether queued or not, it should not make a
> difference on data availability right? except maybe slow down scans/queries
> a bit since they have to read more data from MemStore and uncompacted
> RowStores?
>
> thanks again,
>
> -m
>
> On Mon, Apr 20, 2020 at 9:38 AM Todd Lipcon  wrote:
>
>> Hi Mauricio,
>>
>> Sorry for the late reply on this one. Hope "better late than never" is
>> the case here :)
>>
>> As you implied in your email, the main issue with increasing queue length
>> to deal with queue overflows is that it only helps with momentary spikes.
>> According to queueing theory (and intuition) if the rate of arrival of
>> entries into a queue is faster than the rate of processing items in that
>> queue, then the queue length will grow. If this is a transient phenomenon
>> (eg a quick burst of requests) then having a larger queue capacity will
>> prevent overflows, but if this is a persistent phenomenon, then there is no
>> length of queue that is sufficient to prevent overflows. The one exception
>> here is that if the number of potential concurrent queue entries is itself
>> bounded (eg because there is a bounded number of clients).
>>
>> According to the above theory, the philosophy behind the default short
>> queue is that longer queues aren't a real solution if the cluster is
>> overloaded. That said, if you think that the issues are just transient
>> spikes rather than a capacity overload, it's possible that bumping the
>> queue length (eg to 100) can help here.
>>
>> In terms of things to be aware of: having a longer queue means that the
>> amount of memory taken by entries in the queue is increased proportionally.
>> Currenlty, that memory is not tracked as part of Kudu's Memtracker
>> infrastructure, but it does get accounted for in the global heap and can
>> push the serve into "memory pressure" mode where requests will start
>> getting rejected, rowsets will get flushed, etc. I would recommend that if
>> you increase your queues you make sure that you have a relatively larger
>> memory limit allocated to your tablet servers and watch out for log
>> messages and metrics indicating persistent memory pressure (particularly in
>> the 80%+ range where things start getting dropped a lot).
>>
>> Long queues are also potentially an issue in terms of low-latency
>> requests. The longer the queue (in terms of items) the longer the latency
>> of elements waiting in that queue. If you have some element of latency
>> SLAs, you should monitor them closely as you change queue length
>> configuration.
>>
>> Hope that helps
>>
>> -Todd
>>
>>
>
> --
> Mauricio Aristizabal
> Architect - Data Pipeline
> mauri...@impact.com | 323 309 4260
> https://impact.com


Re: [ANNOUNCE] Welcoming Bankim Bhavsar as Kudu committer and PMC member

2020-04-21 Thread Alexey Serbin
Congratulations Bankim!  Great to see these valuable contributions, keep it
up!


Best regards,

Alexey

On Sat, Apr 18, 2020 at 11:04 PM Hao Hao  wrote:

> Congrats Bankim! Well deserved!
>
> Best,
> Hao
>
> On Sat, Apr 18, 2020 at 5:45 PM Andrew Wong  wrote:
>
>> Congratulations Bankim! Keep up the great work 
>>
>> On Sat, Apr 18, 2020 at 3:28 PM Adar Dembo  wrote:
>>
>>> Hi Kudu community,
>>>
>>> I'm happy to announce that the Kudu PMC has voted to add Bankim
>>> Bhavsar as a new committer and PMC member.
>>>
>>> Bankim has been actively writing Kudu code for the last six months or
>>> so. Aside from various bug fixes, his major contribution has been to
>>> replace the existing Bloom filter predicate code with a more
>>> full-featured implementation that should also be more robust and
>>> efficient. One of the challenges here has been integration with Apache
>>> Impala, and providing a common abstraction that can be used by both
>>> codebases. This work is still ongoing but is drawing to a close pretty
>>> soon.
>>>
>>> Please join me in congratulating Bankim!
>>>
>>


Re: some troubles about kudu cluster

2020-04-15 Thread Alexey Serbin
Hi,

Those messages from the Kudu Java client say that something is wrong with
the specified server (UUID 0178e667f8a8474caace936b7539e746).  I would take
a look into the kudu-tserver logs at the node where the server was running.


/Alexey

On Tue, Apr 14, 2020 at 2:19 AM evan <564740...@qq.com> wrote:

>
>
> Hello,everyone,our kudu cluster frequently occur errors like the
> following,and then the data can’t  insert into kudu table either,some one
> can help me,give me some idea to solve these troubles
>
> Some logs as following:
>
> New I/O worker #44] INFO org.apache.kudu.client.AsyncKuduClient - Removing
> server 0178e667f8a8474caace936b7539e746 from this tablet's cache
> 447417206ba54484984ec2fd9eff428c
>
> [New I/O worker #52] INFO org.apache.kudu.client.AsyncKuduClient -
> Removing server 0178e667f8a8474caace936b7539e746 from this tablet's cache
> 447417206ba54484984ec2fd9eff428c
>
> [New I/O worker #25] INFO org.apache.kudu.client.AsyncKuduClient -
> Removing server 0178e667f8a8474caace936b7539e746 from this tablet's cache
> 447417206ba54484984ec2fd9eff428c
>
> [New I/O worker #52] INFO org.apache.kudu.client.AsyncKuduClient -
> Removing server 0178e667f8a8474caace936b7539e746 from this tablet's cache
> 447417206ba54484984ec2fd9eff428c
>
> [New I/O worker #44] INFO org.apache.kudu.client.AsyncKuduClient -
> Removing server 0178e667f8a8474caace936b7539e746 from this tablet's cache
> 447417206ba54484984ec2fd9eff428c
>
> [New I/O worker #52] INFO org.apache.kudu.client.AsyncKuduClient -
> Removing server 0178e667f8a8474caace936b7539e746 from this tablet's cache
> 447417206ba54484984ec2fd9eff428c
>
> [New I/O worker #25] INFO org.apache.kudu.client.AsyncKuduClient -
> Removing server 0178e667f8a8474caace936b7539e746 from this tablet's cache
> 447417206ba54484984ec2fd9eff428c
>
> [New I/O worker #52] INFO org.apache.kudu.client.AsyncKuduClient -
> Removing server
>
>
>
>
>
>
>
> Sent from Mail for Windows 10
>
>
>


Re: Will multiple transactions write to Kudu concurrently cause deadlock in Kudu?

2020-04-13 Thread Alexey Serbin
At this point, Kudu doesn't support multi-row transactions, so I'm not sure
how deadlock is possible.

On Mon, Apr 13, 2020 at 3:29 AM Ray Liu (rayliu)  wrote:

> Just found this ticket which answers my question.
>
> https://issues.apache.org/jira/browse/KUDU-47
>
>
>
> I’ll try it out anyways.
>
>
>
> *From: *"Ray Liu (rayliu)" 
> *Reply-To: *"user@kudu.apache.org" 
> *Date: *Monday, April 13, 2020 at 18:22
> *To: *"user@kudu.apache.org" 
> *Subject: *Will multiple transactions write to Kudu concurrently cause
> deadlock in Kudu?
>
>
>
> I’ve read the Kudu documentation about transaction semantics, but I still
> have one question.
>
>
> https://kudu.apache.org/docs/transaction_semantics.html#_single_tablet_write_operations
>
>
>
>
>
> For example, I have a use case described like this document.
>
>
> https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-DEADLOCKS
>
>
>
>
>
> Will I encounter deadlock in Kudu?
>


Re: hash and range partition uneven distribution for one tablet server

2020-03-26 Thread Alexey Serbin
Hi,

Do you mean you still see uneven distribution of leader replicas?


Thanks,

Alexey

On Thu, Mar 26, 2020 at 7:56 PM Fisk Xia  wrote:

> Hi,
>
> Thanks for you time and attention.
>
> To further elaborate the situation, we are having replication factor = 1.
> We have tried running Kudu Rebalancer Tool, we still seeing uneven
> distribution as mentioned.
>
> Please advice us if there is any alternative to improve the situation
> other than running the Kudu Rebalancer Tool.
>
> Thank you.
>
>
> On 2020/03/25 04:43:13, Adar Lieber-Dembo  wrote:
>
> What you're seeing sort of makes sense given that partition assignment
> uses a "power of 2" selection process: two servers are chosen at random,
> and the one with the fewer partitions is selected as the recipient of
> the new partition. Given enough partitions, this algorithm should
> result in an even distribution of partitions across servers. But since
> you're only assigning 5 (or 15, if the replication factor is 3)
> partitions to 5 servers, there may be some skew.
>
> Have you tried running the Kudu rebalancer tool? That's "kudu cluster
> rebalance". It'll redistribute your partitions to minimize skew across
> tservers.
>
> All that said, we currently don't have a mechanism to distribute
> tablet leaders evenly across the cluster, so you may still see
> hotspotting on writes if one server happens to host more leaders than
> the others and if those leaders are servicing a high write load.
>
> On Tue, Mar 24, 2020 at 9:09 PM 夏天松  wrote:
>
>
> I have a hot data insert problem when using Kudu. If I use both hash and
> range partitioning, all buckets will be unevenly distributed.
> Kudu cluster distribution: 3 masters and 5 tablet servers
>
> My create table SQL:
> CREATE TABLE tmp.sales_by_year (
>   device_id STRING NOT NULL,
>   update_date STRING NOT NULL,
>   update_time STRING NOT NULL,
>   object_name STRING NOT NULL,
>   attribute_name STRING NOT NULL,
>   present_value STRING NULL,
>   PRIMARY KEY (device_id, update_date, update_time, object_name,
> attribute_name)
> )
> PARTITION BY HASH (device_id) PARTITIONS 5, RANGE (update_date) (
>   PARTITION '2020-03-21' <= VALUES < '2020-03-22'
> )
> STORED AS KUDU;
>
> Then I hope that when update_date = '2020-03-21', every tablet server has
> one partition, but the real distribution is not like this. The real
> distribution is that some machines have no partitions, and some have 2 or 3
> partitions. This situation leads to high CPU usage on some machines when
> writing large amounts of time series data.
>
> Please help me: how can I solve this problem?
>
>
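The "power of 2" selection process Adar describes is easy to simulate. The sketch below is plain Java, not Kudu code, and illustrates why only 5 partitions spread over 5 servers often come out skewed while thousands of placements even out:

```java
import java.util.Arrays;
import java.util.Random;

// Simulates "power of 2 choices" placement: for each new partition, pick
// two servers at random and assign the partition to the less-loaded one.
public class PowerOfTwoChoices {
    static int[] place(int servers, int partitions, Random rng) {
        int[] load = new int[servers];
        for (int i = 0; i < partitions; i++) {
            int a = rng.nextInt(servers);
            int b = rng.nextInt(servers);
            load[load[a] <= load[b] ? a : b]++;  // less-loaded of the two wins
        }
        return load;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        // 5 partitions on 5 servers: some servers may get 0, others 2 or more.
        System.out.println(Arrays.toString(place(5, 5, rng)));
        // 5000 partitions on 5 servers: loads end up very close to 1000 each.
        System.out.println(Arrays.toString(place(5, 5000, rng)));
    }
}
```

With so few placements the law of large numbers hasn't kicked in yet, which matches the skew in the question above; `kudu cluster rebalance` fixes it after the fact.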


Re: Kudu/Spark LIMIT support

2020-01-23 Thread Alexey Serbin
Yes, your observations match what's in the code: the Kudu Spark bindings don't
support scanner row limits, but the Kudu Java, C++ and Python clients do
support that.  And indeed,
https://issues.apache.org/jira/browse/KUDU-16 contains
relevant information on the status of this missing feature.

To my knowledge, nobody is currently working on implementing scanner
limits for kudu-spark.  However, patches are always welcome!


Kind regards,

Alexey

On Mon, Jan 20, 2020 at 10:38 PM Pavel Martynov  wrote:

> Hi, folks!
>
> For testing purposes, I need to read a small chunk of rows of a big table
> (~12 blns rows) on my dev machine. So I started driver with "local[4]"
> executors and wrote a code like:
>
> sparkSession.sqlContext.read.options(Map(
>   "kudu.master" -> "master",
>   "kudu.table" -> "thebigtable",
>   "kudu.splitSizeBytes" -> SplitSize512Mb
> )).format("kudu").load
>   .limit(1000)
>   .select($"col1", $"col2", $"col3")
>
> My expectation: only 1000 rows should be actually read from Kudu in very
> fast manner.
>
> Actually observed: Spark started 4 parallel scanners for one of the
> tablets and looks like this scanning process scanning the whole tablet
> (which is ~2.4 blns rows) and scanning time is really big.
>
> Is this expected behavior?
>
> I found this closed ticket https://issues.apache.org/jira/browse/KUDU-16 with
> comments on Spark: "No support on the Spark side, but AFAICT, support for
> limits given our current Scala bindings is somewhat unnatural.".
>
> Kudu ver 1.11.1.
>
> --
> with best regards, Pavel Martynov
>


Re: [ANNOUNCE] Welcoming Yifan Zhang as Kudu committer and PMC member

2020-01-07 Thread Alexey Serbin
Congratulations Yifan and keep the great work going! :)


/Alexey

On Tue, Jan 7, 2020 at 10:59 AM Hao Hao  wrote:

> Congratulations!
>
> On Tue, Jan 7, 2020 at 10:02 AM Grant Henke  wrote:
>
>> Congratulations!
>>
>> On Tue, Jan 7, 2020 at 12:20 AM 赖迎春  wrote:
>>
>>> Congratulations Yifan!
>>>
>>> Yingchun Lai
>>>
>>>
>>> Andrew Wong wrote on Tue, Jan 7, 2020 at 2:15 PM:
>>>
>>> > Congratulations Yifan! Well done, and keep up the great work!
>>> >
>>> > On Mon, Jan 6, 2020 at 9:39 PM Adar Lieber-Dembo 
>>> > wrote:
>>> >
>>> >> Hi Kudu community,
>>> >>
>>> >> I'm happy to announce that the Kudu PMC has voted to add Yifan Zhang
>>> >> as a new committer and PMC member.
>>> >>
>>> >> Yifan's contributions have included:
>>> >> - Filtering tables and tablets in ksck.
>>> >> - Tooling to remove or alter columns.
>>> >> - Adding a mean gauge to the metrics subsystem.
>>> >> - The addition to the rebalancer tool to ignore the health of a tablet
>>> >> server, and subsequent improvements to move replicas from such tablet
>>> >> servers, which is a valuable building block for tserver
>>> >> decommissioning.
>>> >> - Most recently, deduplicating RPCs sent by Kudu masters to tablet
>>> >> servers.
>>> >>
>>> >> Please join me in congratulating Yifan!
>>> >>
>>> >
>>> >
>>> > --
>>> > Andrew Wong
>>> >
>>>
>>
>>
>> --
>> Grant Henke
>> Software Engineer | Cloudera
>> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>>
>


[ANNOUNCE] Apache Kudu 1.11.1 Released

2019-11-20 Thread Alexey Serbin
The Apache Kudu team is happy to announce the release of Kudu 1.11.1!

Kudu is an open source storage engine for structured data which
supports low-latency random access together with efficient analytical
access patterns. It supports many integrations with other data analytics
projects both inside and outside of the Apache Software Foundation.

Apache Kudu 1.11.1 is a bug fix release. Please see the release notes
for details:
  https://kudu.apache.org/releases/1.11.1/docs/release_notes.html

The Apache Kudu project only publishes source code releases. To build
Kudu 1.11.1, follow these steps:
  - Download the Kudu 1.11.1 source release:
  https://kudu.apache.org/releases/1.11.1
  - Follow the instructions in the documentation to build Kudu 1.11.1
from source:

https://kudu.apache.org/releases/1.11.1/docs/installation.html#build_from_source

For your convenience, binary JAR files for the Kudu Java client library,
Spark DataSource, Flume sink, and other Java integrations are published
to the ASF Maven repository and are now available:
https://search.maven.org/search?q=g:org.apache.kudu%20AND%20v:1.11.1

The Python client source is also available on PyPI:
  https://pypi.org/project/kudu-python/

Additionally, experimental Docker images are published to Docker Hub:
  https://hub.docker.com/r/apache/kudu

Regards,
The Apache Kudu team


[ANNOUNCE] Apache Kudu 1.10.1 Released

2019-11-20 Thread Alexey Serbin
The Apache Kudu team is happy to announce the release of Kudu 1.10.1!

Kudu is an open source storage engine for structured data which
supports low-latency random access together with efficient analytical
access patterns. It supports many integrations with other data analytics
projects both inside and outside of the Apache Software Foundation.

Apache Kudu 1.10.1 is a bug fix release. Please see the release notes
for details:
  https://kudu.apache.org/releases/1.10.1/docs/release_notes.html

The Apache Kudu project only publishes source code releases. To build
Kudu 1.10.1, follow these steps:
  - Download the Kudu 1.10.1 source release:
  https://kudu.apache.org/releases/1.10.1
  - Follow the instructions in the documentation to build Kudu 1.10.1
from source:

https://kudu.apache.org/releases/1.10.1/docs/installation.html#build_from_source

For your convenience, binary JAR files for the Kudu Java client library,
Spark DataSource, Flume sink, and other Java integrations are published
to the ASF Maven repository and are now available:
https://search.maven.org/search?q=g:org.apache.kudu%20AND%20v:1.10.1

The Python client source is also available on PyPI:
  https://pypi.org/project/kudu-python/

Regards,
The Apache Kudu team


[ANNOUNCE] Apache Kudu 1.11.0 Released

2019-11-01 Thread Alexey Serbin
The Apache Kudu team is happy to announce the release of Kudu 1.11.0!

Kudu is an open source storage engine for structured data which
supports low-latency random access together with efficient analytical
access patterns. It is designed within the context of the Apache Hadoop
ecosystem and supports many integrations with other data analytics
projects both inside and outside of the Apache Software Foundation.

Apache Kudu 1.11.0 is a minor release that offers several new features,
improvements, optimizations, and bug fixes. Please see the release notes
for details:
  https://kudu.apache.org/releases/1.11.0/docs/release_notes.html

The Apache Kudu project only publishes source code releases. To build
Kudu 1.11.0, follow these steps:
  - Download the Kudu 1.11.0 source release:
  https://kudu.apache.org/releases/1.11.0
  - Follow the instructions in the documentation to build Kudu 1.11.0
from source:

https://kudu.apache.org/releases/1.11.0/docs/installation.html#build_from_source

For your convenience, binary JAR files for the Kudu Java client library,
Spark DataSource, Flume sink, and other Java integrations are published
to the ASF Maven repository and are now available:
https://search.maven.org/search?q=g:org.apache.kudu%20AND%20v:1.11.0

The Python client source is also available on PyPI:
  https://pypi.org/project/kudu-python/

Additionally, experimental Docker images are published to Docker Hub:
  https://hub.docker.com/r/apache/kudu

NOTE: as it was found after the release artifacts had already been
  published, the kudu-binary JAR artifact in Kudu 1.11.0 doesn't
  comply with the ASF 3rd-party license policy [1] since it includes
  the libnuma dynamic library which is licensed under LGPL v2.1.
  As it turned out, the same is true for the kudu-binary JAR
  artifact released in July with Kudu 1.10.0. See [2] and [3] for
details.

  The inadvertent inclusion of an LGPL library will be addressed
  ASAP by releasing Kudu 1.10.1 and Kudu 1.11.1 patch releases
  adhering to the ASF 3rd party license policy.

References:
  [1] https://www.apache.org/legal/resolved.html
  [2] https://issues.apache.org/jira/browse/KUDU-2990
  [3] https://issues.apache.org/jira/browse/LEGAL-487


Regards,
The Apache Kudu team


Re: "Too many open files" error

2019-10-10 Thread Alexey Serbin
Hi,

I think you could try to set the limit for the number of open files to
unlimited and see how it goes when you start the tablet server.

I think the best way forward is to add tablet servers to the cluster.
Ideally, you want to have your data replicated: consider creating tables
with replication factor 3 and having at least 4 tablet servers in your
cluster.  Once you have added new tablet servers, don't forget to run the
rebalancer tool (kudu cluster rebalance ...).


HTH,

Alexey

On Mon, Oct 7, 2019 at 2:31 AM Faraz Mateen  wrote:

> Alexey,
>
> Thank you for the response. Having too many partitions is exactly what the
> problem is. When I restart the tserver, it tries to open files against each
> tablet and eventually crashes.
>
> Is there a way to get around this and recover my data? Is there any config
> I can change to run the tserver? Or can I add a new tablet server and
> migrate existing tablets?
>
> On Sat, Oct 5, 2019 at 10:05 PM Alexey Serbin 
> wrote:
>
>> Hi,
>>
>> Most likely the issue happened because of high number of tablet replicas
>> at the tablet server.  In case of high spike of in the input data rate,
>> higher compaction activity might require more than usual number of file
>> descriptors, since more files are opened.
>>
>> How many tablet replicas does that tablet server have?  It's not
>> recommended to have too many:
>> https://kudu.apache.org/docs/known_issues.html#_scale
>>
>> To understand what has happened, you need to take a look into the logs of
>> the tablet server.  This might be useful:
>> https://kudu.apache.org/docs/troubleshooting.html
>>
>> Overall, if there is only one (?) tablet server in the whole Kudu
>> cluster, why to have 39 partitions per table?  I guess that's some sort of
>> proof-of-concept/toy setup, but anyways.  Since all the tablet replicas end
>> up at the same single tablet server, I don't see benefits from partitioning
>> in that setup.  For the tablet server, it simply means x-times increased
>> number of open file descriptors and increased memory usage.
>>
>>
>> Kind regards,
>>
>> Alexey
>>
>> On Fri, Oct 4, 2019 at 4:21 AM Faraz Mateen  wrote:
>>
>>> Hi all,
>>>
>>> I am facing a problem with my kudu setup where tablet server crashes
>>> with "too many open files" error.
>>> The setup consists of a single master and a single tablet server. Tables
>>> created are such that there are 39 partitions per table. However not all
>>> partitions have data that corresponds to them.
>>> Yesterday my tserver crashed and when I am trying to restart the
>>> tserver, it fails with the error:
>>>
>>> I1004 03:50:39.896301  5669 ts_tablet_manager.cc:1173] T
>>> cab85f15f06748d0b59161d9f3da55f7 P ee14d248ac994d0eb60dbb0db4ab3b09:
>>> Registered tablet (data state: TABLET_DATA_READY)
>>> W1004 03:50:39.923184  5687 os-util.cc:165] could not read
>>> /proc/self/status: IO error: /proc/self/status: Too many open files (error
>>> 24)
>>> I1004 03:50:39.939460  5669 ts_tablet_manager.cc:1173] T
>>> d8d68ce6f6ea49479c00d29709869f1f P ee14d248ac994d0eb60dbb0db4ab3b09:
>>> Registered tablet (data state: TABLET_DATA_READY)
>>>
>>> I have already modified ulimit of the machine:
>>>
>>> root@vm-3:~# ulimit -a
>>> core file size  (blocks, -c) 0
>>> data seg size   (kbytes, -d) unlimited
>>> scheduling priority (-e) 0
>>> file size   (blocks, -f) unlimited
>>> pending signals (-i) 63923
>>> max locked memory   (kbytes, -l) 16384
>>> max memory size (kbytes, -m) unlimited
>>> open files  (-n) 65535
>>> pipe size(512 bytes, -p) 8
>>> POSIX message queues (bytes, -q) 819200
>>> real-time priority  (-r) 0
>>> stack size  (kbytes, -s) 8192
>>> cpu time   (seconds, -t) unlimited
>>> max user processes  (-u) 65535
>>> virtual memory  (kbytes, -v) unlimited
>>> file locks  (-x) unlimited
>>>
>>> *Set up Details:*
>>> Single master and tserver setup on a single VM.
>>> 4 cores, 550GB hard disk, 16GB RAM
>>> Kudu version 1.8 on ubuntu, installed through debian packages.
>>> Before crash, data was being inserted in kudu at a very high rate. RAM
>>> usage was around 87% and disk usage was around 84 percent.
>>>
>>> Here is what I have tried so far:
>>> 1- Set ulimit -n to 65535.
>>> 2- Reboot the vm to get rid of stale processes.
>>> 3- Set block_manager_max_open_files to 32000 in tserver flag file.
>>>
>>> What I want to know now is:
>>> 1- Why am I hitting this problem? Is this due to low resources on the VM
>>> or high number of tablets on a single tserver?
>>> 2- How can I get around this problem, recover my data and kudu services?
>>>
>>> Would really appreciate some help on this.
>>> --
>>> Faraz Mateen
>>>
>>
>
> --
> Faraz Mateen
>


Re: [ANNOUNCE] Welcoming Lifu He, Yao Xu, and Yao Zhang as Kudu committers and PMC members

2019-08-29 Thread Alexey Serbin
Congratulations and thank you guys!  It's great to see those awesome
contributions coming from the new members of the Kudu community.  Excellent
work, keep it up!

/Alexey

On Mon, Aug 26, 2019 at 5:06 PM 赖迎春  wrote:

> Congratulations!
>
> Grant Henke wrote on Tue, Aug 27, 2019 at 05:10:
>
>> Congratulations! And thank you all for your contributions. Looking forward
>> to continuing to work together.
>>
>>
>> On Mon, Aug 26, 2019 at 11:09 AM Raymond Blanc 
>> wrote:
>>
>> > Congratulations! Keep up the great work!
>> >
>> >
>> > > On Aug 26, 2019, at 12:42 PM, Hao Hao 
>> > wrote:
>> > >
>> > > Congratulations!
>> > >
>> > > On Mon, Aug 26, 2019 at 10:33 AM Andrew Wong
>> > > >
>> > > wrote:
>> > >
>> > >> Congratulations everyone! Keep up the great work!
>> > >>
>> > >> On Sun, Aug 25, 2019 at 9:40 PM Adar Lieber-Dembo > >
>> > >> wrote:
>> > >>
>> > >>> Hi Kudu community,
>> > >>>
>> > >>> I'm happy to announce that the Kudu PMC has voted to add Lifu He,
>> Yao
>> > >>> Xu, and Yao Zhang as new committers and PMC members.
>> > >>>
>> > >>> Lifu has worked on a variety of patches, from dead container
>> deletion
>> > >>> in the block manager, to metric aggregation in the master, to
>> several
>> > >>> performance improvements in the rowset interval tree and DMS. He's
>> > >>> currently working on a (very exciting and very ambitious) prototype
>> > >>> for secondary indexing via bitmap indexes. In short, Lifu's
>> > >>> contributions are varied, and has also been helping other users on
>> > >>> Slack and WeChat.
>> > >>>
>> > >>> Yao (Xu)'s largest contribution thus far has been the "split key
>> > >>> range" functionality which improves scan parallelism in Spark jobs.
>> > >>> He's also made a number of other contributions, such as the extra
>> > >>> configuration properties framework for tables, and a new tablet
>> > >>> placement policy based on user-defined dimension labels.
>> > >>>
>> > >>> Yao (Zhang)'s contributions have included server-side bloom filter
>> > >>> predicate support and, more recently, performance improvements for
>> > >>> updates.
>> > >>>
>> > >>> Lifu works at NetEase (one of the largest Internet and video game
>> > >>> companies in the world) where he helps operate their Kudu clusters.
>> > >>> Both Yao Xu and Yao Zhang work at Ant Financial (Alipay Inc.) where
>> > >>> they also help operate their very large Kudu deployments. All three
>> > >>> have been instrumental in growing Kudu's presence within China as
>> well
>> > >>> as helping new Chinese users come up to speed with Kudu.
>> > >>>
>> > >>> Please join me in congratulating Lifu, Yao, and Yao!
>> > >>>
>> > >>
>> > >>
>> > >> --
>> > >> Andrew Wong
>> > >>
>> >
>>
>>
>> --
>> Grant Henke
>> Software Engineer | Cloudera
>> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>>
> --
> Yingchun Lai
>


Re: impala with kudu write become very slow

2019-07-19 Thread Alexey Serbin
Hi,

It's hard to say what might be the problem without additional information.
Could you clarify on the following questions:

1.  What was the rate of write operations for the 270M rows you mentioned?
Was that regular 50K rows/sec or something else?
2.  Do you still observe the slowness or it's already gone?  (if that was
just a spike in ops rate, Kudu tablet servers might be rejecting writes due
to the memory pressure when not being able to flush the data on disk fast
enough).
3.  What was special about those 270M rows?  Maybe the order of keys
changed somehow?  Kudu ingests data much faster if row keys come in roughly
sequential order: so if the prior write batches had that property but the
recent 270M chunk didn't, that might be the cause.
4.  Was there any additional concurrent activity on the nodes where Kudu
tablet servers run?


Thanks,

Alexey

On Thu, Jul 18, 2019 at 7:48 PM Tim Armstrong 
wrote:

> Also including the Kudu list in case someone there recognises the problem.
>
> On Thu, Jul 18, 2019 at 8:05 AM lk_hadoop  wrote:
>
>> I0718 18:42:22.677520 51139 coordinator.cc:357] starting execution on 5
>> backends for query_id=2e4a3fbec0d7d721:2ec73c1c
>> I0718 18:42:22.679605 12873 impala-internal-service.cc:44]
>> ExecQueryFInstances(): query_id=2e4a3fbec0d7d721:2ec73c1c
>> I0718 18:42:22.679620 12873 query-exec-mgr.cc:46] StartQueryFInstances()
>> query_id=2e4a3fbec0d7d721:2ec73c1c
>> coord=realtimeanalysis-kudu-04-10-8-50-58:22000
>> I0718 18:42:22.679625 12873 query-state.cc:178] Buffer pool limit for
>> 2e4a3fbec0d7d721:2ec73c1c: 17179869184
>> I0718 18:42:22.679675 12873 initial-reservations.cc:60] Successfully
>> claimed initial reservations (4.00 MB) for query
>> 2e4a3fbec0d7d721:2ec73c1c
>> I0718 18:42:22.679769 51332 query-state.cc:309] StartFInstances():
>> query_id=2e4a3fbec0d7d721:2ec73c1c #instances=2
>> I0718 18:42:22.680196 51332 query-state.cc:322] descriptor table for
>> query=2e4a3fbec0d7d721:2ec73c1c
>> tuples:
>> Tuple(id=2 size=567 slots=[Slot(id=52 type=INT col_path=[] offset=464
>> null=(offset=563 mask=20) slot_idx=29 field_idx=-1), Slot(id=53 type=STRING
>> col_path=[] offset=0 null=(offset=560 mask=1) slot_idx=0 field_idx=-1),
>> Slot(id=54 type=STRING col_path=[] offset=16 null=(offset=560 mask=2)
>> slot_idx=1 field_idx=-1), Slot(id=55 type=STRING col_path=[] offset=32
>> null=(offset=560 mask=4) slot_idx=2 field_idx=-1), Slot(id=56 type=STRING
>> col_path=[] offset=48 null=(offset=560 mask=8) slot_idx=3 field_idx=-1),
>> Slot(id=57 type=STRING col_path=[] offset=64 null=(offset=560 mask=10)
>> slot_idx=4 field_idx=-1), Slot(id=58 type=STRING col_path=[] offset=80
>> null=(offset=560 mask=20) slot_idx=5 field_idx=-1), Slot(id=59 type=STRING
>> col_path=[] offset=96 null=(offset=560 mask=40) slot_idx=6 field_idx=-1),
>> Slot(id=60 type=STRING col_path=[] offset=112 null=(offset=560 mask=80)
>> slot_idx=7 field_idx=-1), Slot(id=61 type=STRING col_path=[] offset=128
>> null=(offset=561 mask=1) slot_idx=8 field_idx=-1), Slot(id=62 type=STRING
>> col_path=[] offset=144 null=(offset=561 mask=2) slot_idx=9 field_idx=-1),
>> Slot(id=63 type=STRING col_path=[] offset=160 null=(offset=561 mask=4)
>> slot_idx=10 field_idx=-1), Slot(id=64 type=INT col_path=[] offset=468
>> null=(offset=563 mask=40) slot_idx=30 field_idx=-1), Slot(id=65 type=INT
>> col_path=[] offset=472 null=(offset=563 mask=80) slot_idx=31 field_idx=-1),
>> Slot(id=66 type=INT col_path=[] offset=476 null=(offset=564 mask=1)
>> slot_idx=32 field_idx=-1), Slot(id=67 type=INT col_path=[] offset=480
>> null=(offset=564 mask=2) slot_idx=33 field_idx=-1), Slot(id=68 type=STRING
>> col_path=[] offset=176 null=(offset=561 mask=8) slot_idx=11 field_idx=-1),
>> Slot(id=69 type=STRING col_path=[] offset=192 null=(offset=561 mask=10)
>> slot_idx=12 field_idx=-1), Slot(id=70 type=STRING col_path=[] offset=208
>> null=(offset=561 mask=20) slot_idx=13 field_idx=-1), Slot(id=71 type=STRING
>> col_path=[] offset=224 null=(offset=561 mask=40) slot_idx=14 field_idx=-1),
>> Slot(id=72 type=STRING col_path=[] offset=240 null=(offset=561 mask=80)
>> slot_idx=15 field_idx=-1), Slot(id=73 type=STRING col_path=[] offset=256
>> null=(offset=562 mask=1) slot_idx=16 field_idx=-1), Slot(id=74 type=INT
>> col_path=[] offset=484 null=(offset=564 mask=4) slot_idx=34 field_idx=-1),
>> Slot(id=75 type=INT col_path=[] offset=488 null=(offset=564 mask=8)
>> slot_idx=35 field_idx=-1), Slot(id=76 type=INT col_path=[] offset=492
>> null=(offset=564 mask=10) slot_idx=36 field_idx=-1), Slot(id=77 type=INT
>> col_path=[] offset=496 null=(offset=564 mask=20) slot_idx=37 field_idx=-1),
>> Slot(id=78 type=INT col_path=[] offset=500 null=(offset=564 mask=40)
>> slot_idx=38 field_idx=-1), Slot(id=79 type=INT col_path=[] offset=504
>> null=(offset=564 mask=80) slot_idx=39 field_idx=-1), Slot(id=80 type=INT
>> col_path=[] offset=508 null=(offset=565 mask=1) slot_idx=40 field_idx=-1),
>> Slot(id=81 

Re: is this mean the disk read rate was too slow

2019-07-15 Thread Alexey Serbin
Hi,

What was the expectation for the scan operation's timing w.r.t. the size of
the result set?  Did you see it was much faster in the past?  I would start
by making sure the primary key of the table indeed has the columns used
in the predicate.  Also, if there have been 'trickle inserts' running
against the table for a long time, it might be
https://issues.apache.org/jira/browse/KUDU-1400


Probably, a good starting point would be running SUMMARY and EXPLAIN for
the query in impala-shell:

https://www.cloudera.com/documentation/enterprise/latest/topics/impala_explain_plan.html#perf_profile

If the times for SCAN KUDU are much higher than you expect, most likely
it's either too much data being read or KUDU-1400.  Also, check the logs of
the tserver at one of the machines where the scan is running: in case of slow
scan operations there should be traces for them; search for the 'Created
scanner ' pattern or the UUID of the scanner in the logs.


Kind regards,

Alexey


On Mon, Jul 15, 2019 at 2:45 AM lk_hadoop  wrote:

> hi,all:
>My Impala+Kudu cluster suddenly became slow. I suspect the
> disks are not working well. I saw some scan information from Kudu's web UI:
> xxx:8050/scans
>
>
> 7a2bb2a7e62d4614b423f26c3117b49e
> 
> 1934b20b8ab34ab98f1deb43c3eba4b2 Complete
>
> SELECT membership_card_id,
>        tbill_code,
>        goods_id,
>        goods_name,
>        paid_in_amt,
>        profit,
>        dates
>   FROM impala::TEST.SALE_BASE_FACT_WITH_MEMBERSHIP_20190626
>  WHERE PRIMARY KEY >= 
>    AND PRIMARY KEY < 
>
> {username='hive'} at 10.8.50.58:46682 19.3 s 37 min
> column              cells read  bytes read  blocks read
> membership_card_id  381.10k     3.85M       18
> tbill_code          372.82k     5.99M       27
> goods_id            426.24k     281.3K      8
> dates               426.24k     8.5K        8
> business_id         426.24k     2.7K        8
> goods_name          426.24k     291.1K      8
> paid_in_amt         376.93k     1019.6K     24
> profit              376.93k     1.14M       24
> total               3.21M       12.55M      125
>
> Does this mean the disk read rate was too slow?
>
>
> 2019-07-15
> --
> lk_hadoop
>
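For a rough sanity check on the figures in the table above — assuming the 12.55M total bytes were read over the 19.3 s shown for the scanner (an assumption; that value may include time the scanner sat idle) — the effective rate is well under 1 MB/s, which points away from raw disk throughput and toward per-rowset overhead (e.g. KUDU-1400) or CPU as the bottleneck:

```java
// Effective throughput implied by the /scans figures quoted above.
public class ScanRateSketch {
    public static void main(String[] args) {
        double megabytesRead = 12.55;  // "total bytes read" from the scans page
        double seconds = 19.3;         // elapsed time shown for the scanner
        System.out.printf("~%.2f MB/s effective%n", megabytesRead / seconds);  // ~0.65 MB/s
    }
}
```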


Re: Single value range partitions using the Java API

2019-02-20 Thread Alexey Serbin
Hi Nabeelah,

If you are looking for hints on how to deduce range partition bounds the
Impala way just from a single tuple, one starting point is
https://github.com/apache/impala/blob/b8a8edddcb727a28c2d15bdb3533a32454364ade/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java#L178

Let me know if you need any other help with that.


HTH,

Alexey

On Wed, Feb 20, 2019 at 5:28 AM Nabeelah Harris 
wrote:

> It doesn't seem that I am able to use  'incrementColumn()' since it's not
> public. I did however go ahead and add a suffix of 'Character.MAX_VALUE' to
> the string, and now the single value range partitions are being created
> perfectly.
>
> Thank you!
>
>
>
> On Wed, Feb 20, 2019 at 1:48 PM helifu  wrote:
>
>> Hi,
>>
>> It seems the range should be:
>>   [(123, 'abc'), (123, 'abc\0'))
>>        ^               ^
>>        |               |
>>   lower_bound     upper_bound
>>
>> And the function 'incrementColumn()' here may help:
>>
>> https://github.com/apache/kudu/blob/3e3bd1ccbc2b4b070c733b36b1971de63977428b/java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java#L1311
>>
>>
>> 何李夫
>> 2018-10-24 15:17:53
>>
>> -Original Message-
>> From: user-return-1611-hzhelifu=corp.netease@kudu.apache.org
>>  on behalf of Nabeelah
>> Harris
>> Sent: February 20, 2019 18:27
>> To: user@kudu.apache.org
>> Subject: Single value range partitions using the Java API
>>
>> Hi there
>>
>> Using Impala to interact with Kudu, one is able to add range partitions
>> with single values, i.e 'VALUE = (123, “abc”)'. How would I go about
>> creating the same type of range partition using the Java API?
>>
>> When adding a new range partition, the Java API for
>> 'AlterTableOptions.addRangePartition' expects lower and upper bound
>> ‘PartialRow’ objects, where the upper bound must explicitly be greater than
>> the lower bound.
>>
>> Nabeelah
>>
>>
>
> --
> Nabeelah Harris
> nabeelah.har...@impact.com |
> https://impact.com
>
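The '\0' suffix discussed in this thread works because, in lexicographic order, appending a NUL character produces the immediate successor of a string: no other string sorts strictly between v and v + '\0', so the half-open range [v, v + '\0') contains exactly the single value v. A plain-Java illustration (the helper name is made up — this is not Kudu API code):

```java
public class SingleValueBound {
    // Smallest string strictly greater than `s` in lexicographic order.
    static String exclusiveUpperBound(String s) {
        return s + '\0';
    }

    public static void main(String[] args) {
        String upper = exclusiveUpperBound("abc");
        System.out.println("abc".compareTo(upper) < 0);   // true: "abc" lies inside [lower, upper)
        System.out.println("abc ".compareTo(upper) > 0);  // true: "abc " sorts after "abc\0", so it is outside
        System.out.println("abd".compareTo(upper) > 0);   // true: the next distinct value is outside too
    }
}
```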


Re: strange behavior of getPendingErrors

2018-11-17 Thread Alexey Serbin
https://issues.apache.org/jira/browse/KUDU-2625 is the JIRA to track this
issue.  Feel free to add details, comments, etc.


Thanks,

Alexey

On Sat, Nov 17, 2018 at 7:13 AM Alexey Serbin  wrote:

> Hey Todd,
>
> Yes, that behavior is a bit strange especially given the fact that the
> behavior differs when it comes to duplicate rows and other errors that
> happen at later stages of 'applying' write operations at the server side.
>
> I'll open a JIRA item about this issue.  If anyone disagrees, we can
> resolve the JIRA item as needed.
>
>
> Thanks,
>
> Alexey
>
> On Sat, Nov 17, 2018 at 12:01 AM Todd Lipcon  wrote:
>
>> Hey Alexey,
>>
>> I think your explanation makes sense from an implementation perspective.
>> But, I think we should treat this behavior as a bug. From the user
>> perspective, such an error is a per-row data issue and should only affect
>> the row with the problem, not some arbitrary subset of rows in the batch
>> which happened to share a partition.
>>
>> Does anyone disagree?
>>
>> Todd
>>
>> On Fri, Nov 16, 2018, 9:28 PM Alexey Serbin wrote:
>>> Hi Boris,
>>>
>>> Kudu clients (both Java and C++ ones) send write operations to
>>> corresponding tablet servers in batches when using the AUTO_FLUSH_BACKGROUND
>>> and MANUAL_FLUSH modes.  When a tablet server receives a Write RPC
>>> (WriteRequestPB is the corresponding type of the parameter), it decodes the
>>> operations from the batch:
>>> https://github.com/apache/kudu/blob/master/src/kudu/tablet/local_tablet_writer.h#L97
>>>
>>> While decoding operations from a batch, various constraints are being
>>> checked.  One of those is checking for nulls in non-nullable columns.  If
>>> there is a row in the batch that violates the non-nullable constraint, the
>>> whole batch is rejected.
>>>
>>> That's exactly what happened in your example: a batch to one tablet
>>> consisted of 3 rows one of which had a row with violation of the
>>> non-nullable constraint for the dt_tm column, so the whole batch of 3
>>> operations was rejected.  You can play with different partition schemes:
>>> e.g., in case of 10 hashed partitions it might happen that only 2
>>> operations would be rejected, in case of 30 partitions -- just the single
>>> key==2 row could be rejected.
>>>
>>> BTW, that might also happen if using the MANUAL_FLUSH mode.  However,
>>> with the AUTO_FLUSH_SYNC mode, the client sends operations in batches of
>>> size 1.
>>>
>>>
>>> Kind regards,
>>>
>>> Alexey
>>>
>>> On Fri, Nov 16, 2018 at 7:24 PM Boris Tyukin 
>>> wrote:
>>>
>>>> Hi Todd,
>>>>
>>>> We are on Kudu 1.5 still and I used Kudu client 1.7
>>>>
>>>> Thanks,
>>>> Boris
>>>>
>>>>> On Fri, Nov 16, 2018, 17:07 Todd Lipcon wrote:
>>>>> Hi Boris,
>>>>>
>>>>> This is interesting. Just so we're looking at the same code, what
>>>>> version of the kudu-client dependency have you specified, and what version
>>>>> of the server?
>>>>>
>>>>> -Todd
>>>>>
>>>>> On Fri, Nov 16, 2018 at 1:12 PM Boris Tyukin 
>>>>> wrote:
>>>>>
>>>>>> Hey guys,
>>>>>>
>>>>>> I am playing with Kudu Java client (wow it is fast), using mostly
>>>>>> code from Kudu Java example.
>>>>>>
>>>>>> While learning about exceptions during rows inserts, I stumbled upon
>>>>>> something I could not explain.
>>>>>>
>>>>>> If I insert 10 rows into a brand new Kudu table
>>>>>> (AUTO_FLUSH_BACKGROUND mode) and I make one row to be "bad" intentionally
>>>>>> (one column cannot be NULL), I actually get 3 rows that cannot be 
>>>>>> inserted
>>>>>> into Kudu, not 1 as I was expected.
>>>>>>
>>>>>> But if I do session.flush() after every single insert, I get only one
>>>>>> error row (but this ruins the purpose of AUTO_FLUSH_BACKGROUND mode).
>>>>>>
>>>>>> Any ideas one? We cannot afford losing data and need to track all
>>>>>> rows which cannot be inserted.
>>>>>>
>>>>>> AUTO_FLUSH mode works much better and I do not have an issue like
>>>>>> above, 

Re: strange behavior of getPendingErrors

2018-11-17 Thread Alexey Serbin
Hey Todd,

Yes, that behavior is a bit strange especially given the fact that the
behavior differs when it comes to duplicate rows and other errors that
happen at later stages of 'applying' write operations at the server side.

I'll open a JIRA item about this issue.  If anyone disagrees, we can
resolve the JIRA item as needed.


Thanks,

Alexey

On Sat, Nov 17, 2018 at 12:01 AM Todd Lipcon  wrote:

> Hey Alexey,
>
> I think your explanation makes sense from an implementation perspective.
> But, I think we should treat this behavior as a bug. From the user
> perspective, such an error is a per-row data issue and should only affect
> the row with the problem, not some arbitrary subset of rows in the batch
> which happened to share a partition.
>
> Does anyone disagree?
>
> Todd
>
> On Fri, Nov 16, 2018, 9:28 PM Alexey Serbin wrote:
>> Hi Boris,
>>
>> Kudu clients (both Java and C++ ones) send write operations to
>> corresponding tablet servers in batches when using the AUTO_FLUSH_BACKGROUND
>> and MANUAL_FLUSH modes.  When a tablet server receives a Write RPC
>> (WriteRequestPB is the corresponding type of the parameter), it decodes the
>> operations from the batch:
>> https://github.com/apache/kudu/blob/master/src/kudu/tablet/local_tablet_writer.h#L97
>>
>> While decoding operations from a batch, various constraints are being
>> checked.  One of those is checking for nulls in non-nullable columns.  If
>> there is a row in the batch that violates the non-nullable constraint, the
>> whole batch is rejected.
>>
>> That's exactly what happened in your example: a batch to one tablet
>> consisted of 3 rows one of which had a row with violation of the
>> non-nullable constraint for the dt_tm column, so the whole batch of 3
>> operations was rejected.  You can play with different partition schemes:
>> e.g., in case of 10 hashed partitions it might happen that only 2
>> operations would be rejected, in case of 30 partitions -- just the single
>> key==2 row could be rejected.
>>
>> BTW, that might also happen if using the MANUAL_FLUSH mode.  However,
>> with the AUTO_FLUSH_SYNC mode, the client sends operations in batches of
>> size 1.
>>
>>
>> Kind regards,
>>
>> Alexey
>>
>> On Fri, Nov 16, 2018 at 7:24 PM Boris Tyukin 
>> wrote:
>>
>>> Hi Todd,
>>>
>>> We are on Kudu 1.5 still and I used Kudu client 1.7
>>>
>>> Thanks,
>>> Boris
>>>
>>> On Fri, Nov 16, 2018, 17:07 Todd Lipcon wrote:
>>>> Hi Boris,
>>>>
>>>> This is interesting. Just so we're looking at the same code, what
>>>> version of the kudu-client dependency have you specified, and what version
>>>> of the server?
>>>>
>>>> -Todd
>>>>
>>>> On Fri, Nov 16, 2018 at 1:12 PM Boris Tyukin 
>>>> wrote:
>>>>
>>>>> Hey guys,
>>>>>
>>>>> I am playing with Kudu Java client (wow it is fast), using mostly code
>>>>> from Kudu Java example.
>>>>>
>>>>> While learning about exceptions during rows inserts, I stumbled upon
>>>>> something I could not explain.
>>>>>
>>>>> If I insert 10 rows into a brand new Kudu table (AUTO_FLUSH_BACKGROUND
>>>>> mode) and I make one row to be "bad" intentionally (one column cannot be
>>>>> NULL), I actually get 3 rows that cannot be inserted into Kudu, not 1 as I
>>>>> was expected.
>>>>>
>>>>> But if I do session.flush() after every single insert, I get only one
>>>>> error row (but this ruins the purpose of AUTO_FLUSH_BACKGROUND mode).
>>>>>
>>>>> Any ideas one? We cannot afford losing data and need to track all rows
>>>>> which cannot be inserted.
>>>>>
>>>>> AUTO_FLUSH mode works much better and I do not have an issue like
>>>>> above, but then it is way slower than AUTO_FLUSH_BACKGROUND.
>>>>>
>>>>> My code is below. It is in Groovy, but I think you will get an idea :)
>>>>> https://gist.github.com/boristyukin/8703d2c6ec55d6787843aa133920bf01
>>>>>
>>>>> Here is output from my test code that hopefully illustrates my
>>>>> confusion - out of 10 rows inserted, 9 should be good and 1 bad, but it
>>>>> turns out Kudu flagged 3 as bad:
>>>>>
>>>>> Created table kudu_groovy_example
>>>>> Inserting 10 rows in AUTO_FLUSH_BACKGROUND flu
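
[Editor's note: the whole-batch rejection Alexey describes in this thread can be modeled in a few lines. This is a hypothetical, pure-Python simulation, not the Kudu client API; the 3-bucket modulo "hash" is illustrative, chosen to reproduce the "10 rows, 1 bad, 3 errors" outcome from the thread:]

```python
NUM_HASH_PARTITIONS = 3  # illustrative partition count

def flush_background(rows, num_partitions=NUM_HASH_PARTITIONS):
    """Group rows into per-tablet batches by hashing the key, then reject a
    whole batch if any row in it violates a constraint (here: a None value
    in a non-nullable column), mirroring the batch-level decode failure."""
    batches = {}
    for key, dt_tm in rows:
        batches.setdefault(key % num_partitions, []).append((key, dt_tm))
    inserted, errors = [], []
    for batch in batches.values():
        if any(dt_tm is None for _, dt_tm in batch):
            errors.extend(batch)   # the whole batch is rejected
        else:
            inserted.extend(batch)
    return inserted, errors

# 10 rows, one bad row (key == 2 has a NULL in the non-nullable column).
rows = [(k, None if k == 2 else "2018-11-16") for k in range(10)]
inserted, errors = flush_background(rows)
# Keys 2, 5 and 8 land in the same batch as the bad row, so all three error out.
assert sorted(k for k, _ in errors) == [2, 5, 8]
assert len(inserted) == 7
```

With one partition per row (analogous to flushing after every insert, or to AUTO_FLUSH_SYNC's batches of size 1), only the single bad row would be rejected.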

Re: Index question

2018-11-01 Thread Alexey Serbin
One more bit which might be relevant in this context: there is a
work-in-progress patch https://gerrit.cloudera.org/#/c/10983/ addressing
KUDU-1291.  That's not secondary indices per se, but that might help in
some cases where the prefix component of the primary key is of low
cardinality.

On Thu, Nov 1, 2018 at 10:35 AM Adar Lieber-Dembo  wrote:

> Secondary indexing is a feature request that comes up fairly often.
> KUDU-2613 is the tracking JIRA, but there's no real content in there.
> Better to look at KUDU-2038, for which there's a work in progress
> patch for bitmap indexing (https://gerrit.cloudera.org/c/11722/) that
> you can also follow. It's hard to anticipate how quickly that patch is
> moving or when it'll be merged.
>
> On Thu, Nov 1, 2018 at 8:22 AM Yariv Moshe 
> wrote:
> >
> > Hi,
> >
> > We doing POC for our system, based on kudu and have questions about
> indexing.
> >
> > As far as I know we can define only one compound index.
> >
> >
> >
> > We need the ability to define more than one index.
> >
> > Is it on your roadmap? If so, when?
> >
> >
> >
> > Btw, We using kudu over Cloudera.
> >
> >
> >
> > Tnx,
> >
> > Yariv
>


Re: Install kudu-1.6 in Ubuntu-14.04 via apt-get

2018-09-03 Thread Alexey Serbin

Hi,

I think there are Kudu 1.6.0 trusty deb packages at:
http://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh/pool/contrib/k/kudu

As per your question of installing Kudu without Cloudera Manager, you 
can always build Kudu from source:

  https://kudu.apache.org/docs/installation.html#build_from_source

Information about Apache Kudu releases can be found at
  https://kudu.apache.org/releases/

Also, why not upgrade the already installed nodes to version 1.7.0?
I think upgrading to 1.7.0 might be a better way of expanding your 
cluster: there have been many improvements in 1.7.0 since 1.6.0.



/Alexey

On 9/3/18 7:08 AM, Quanlong Huang wrote:

Hi all,

We have a kudu cluster in version 1.6.0-cdh5.14.2. It's not managed by 
Cloudera Manager. When we want to scale out the cluster (following the 
doc: https://kudu.apache.org/releases/1.6.0/docs/installation.html, 
install by apt-get), we found that the newly installed nodes are in 
version 1.7.0+cdh5.15.1. Looks like this is the only version available 
for Ubuntu:


$ apt-cache policy kudu
kudu:
Installed: 1.7.0+cdh5.15.1+0-1.cdh5.15.1.p0.4~trusty-cdh5.15.1
Candidate: 1.7.0+cdh5.15.1+0-1.cdh5.15.1.p0.4~trusty-cdh5.15.1
Version table:
 *** 1.7.0+cdh5.15.1+0-1.cdh5.15.1.p0.4~trusty-cdh5.15.1 0
501 http://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh/ 
trusty-cdh5/contrib amd64 Packages

100 /var/lib/dpkg/status

Is there any way to install kudu-1.6 without Cloudera Manager? The 
servers are in Ubuntu-14.04. We can only find available deb files at 
http://archive.cloudera.com/kudu/ubuntu/trusty/amd64/kudu/pool/contrib/k/kudu, 
but there's no kudu-1.6. Another possible way is to extract and 
install kudu-1.6 in the CDH parcel: 
http://archive.cloudera.com/cdh5/parcels/5.14.2/, but that's quite 
painful.


Any helps will be appreciated.

Thanks,
Quanlong




Re: Data inconsistency after restart

2017-12-07 Thread Alexey Serbin

Hi Petter,

Before going too deep in attempts to find the place where the data was 
lost, I just wanted to make sure we definitely know that the data was 
delivered from the client to the server side.


Did you verify that the client didn't report any errors during data 
ingestion?  Most likely you did, but I just wanted to make sure. BTW, 
what kind of client did you use for the data ingestion?


Thanks!


Kind regards,

Alexey


On 12/6/17 3:56 PM, Andrew Wong wrote:

Hi Petter,

Before we shut down we could only see the following in the logs.
I.e., no sign that ingestion was still ongoing.


Interesting. Just to be sure, was that seen on one tserver, or did you 
see them across all of them?


But if the maintenance_manager performs important jobs that are
required to ensure that all data is inserted then I can understand
why we ended up with inconsistent data.


The maintenance manager's role is somewhat orthogonal to writes: data 
is first written to the on-disk write-ahead log and also kept 
in-memory to be accessible by scans. The maintenance manager 
periodically shuttles this in-memory data to disk, among various other 
tasks like cleaning up WAL segments, compacting rowsets, etc. Given 
that, a lack of maintenance ops shouldn't cause incorrectness in data, 
even after restarting.


I would assume this means that it does not guarantee consistency
if new data is inserted but should give valid (and same) results
if no new data is inserted?


Right, if /all/ tservers are truly caught up and done processing the 
writes, with no tablet copies going on, and with no new data coming 
in, then the results should be consistent.



Hope this helped,
Andrew

On Wed, Dec 6, 2017 at 7:33 AM, Boris Tyukin wrote:


this is smart, we are doing the same thing but the best part that
attracts me to Kudu is replacing our main HDFS storage with Kudu
to enable near RT use cases and not to deal with HBase and a
Lambda architecture mess so reliability and scalability is a big
deal for us as we are looking to move most of our data to Kudu.

On Wed, Dec 6, 2017 at 9:58 AM, Petter von Dolwitz (Hem) wrote:

Hi Boris,

we do not have a Cloudera contract at the moment. Until we
gained more Kudu experience we keep our master data in parquet
format so that we can rebuild Kudu-tables upon errors. We are
still in the early learning phase.

Br,
Petter



2017-12-06 14:35 GMT+01:00 Boris Tyukin:

this is definitely concerning thread for us looking to use
Impala for storing mission-critical company data. Petter,
are you paid Cloudera customer btw? I wonder if you opened
support ticket as well

On Wed, Dec 6, 2017 at 7:26 AM, Petter von Dolwitz (Hem) wrote:

Thanks for your reply Andrew!

>How did you verify that all the data was inserted and
how did you find some data missing?
This was done using Impala. We counted the rows for
groups representing the chunks we inserted.

Following up on what I posted, take a look at
https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans.
It seems definitely possible that not all of the rows
had finished inserting when counting, or that the
scans were sent to a stale replica.
Before we shut down we could only see the following in
the logs. I.e., no sign that ingestion was still ongoing.


kudu-tserver.ip-xx-yyy-z-nnn.root.log.INFO.20171201-065232.90314:I1201 07:27:35.010694 90793 maintenance_manager.cc:383] P a38902afefca4a85a5469d149df9b4cb: we have exceeded our soft memory limit (current capacity is 67.52%).  However, there are no ops currently runnable which would free memory.

Also the (cloudera) metric
total_kudu_rows_inserted_rate_across_kudu_replicas
showed zero.

Still it seems like some data became inconsistent
after restart. But if the maintenance_manager performs
important jobs that are required to ensure that all
data is inserted then I can understand why we ended up
with inconsistent data. But, if I understand you
correct,  you are saying that these jobs are not
critical 

Re: Unable to access Kudu table created using Spark via Impala

2017-10-13 Thread Alexey Serbin

Hi Nitin,

Impala needs to know about Kudu tables which were created 'externally' 
(i.e. not via Impala).  Have you run that 'CREATE EXTERNAL TABLE ...' 
via Impala shell already?  If not, you need to do so.  More information 
on the topic can be found at:

http://kudu.apache.org/docs/kudu_impala_integration.html#_querying_an_existing_kudu_table_in_impala


Best regards,

Alexey


On 10/13/17 12:10 PM, Nitin Agarwal wrote:
Hi, I used Spark and KuduContext to create a table in Kudu. I can see 
the table in Kudu UI and can access it through Spark. However I am 
unable to access that table via Impala. I have issued invalidate 
metadata in Impala but I am still unable to list the table.


So what do I need to do to access this table via Impala?

I am using Kudu version kudu 1.4.0-cdh5.12.1 with Spark 1.6.x

This is what I see in Kudu UI for my table

CREATE EXTERNAL TABLE `nums.telephone_number` STORED AS KUDU
TBLPROPERTIES(
  'kudu.table_name' = 'nums.telephone_number',
  'kudu.master_addresses' = 'ff58-29.idc1.level3.com:7051,ff58-30.idc1.level3.com:7051,ff58-34.idc1.level3.com:7051');

Nitin




Re: The Error message

2017-09-29 Thread Alexey Serbin

Hi Khursheed,

I don't think there is a mistake from your side here, just some packages 
are missing on your machine and some intermittent failure from github 
HTTP server.


It seems the error from line 5 is about the absence of the 'git' command on your 
VM.  The last error about 'line 1: 404:' looks like an intermittent HTTP 
error 404 from the github server.


Also, it's worth addressing your e-mail messages to 
user@kudu.apache.org, so a broader audience would be able to help you 
and share their knowledge.



Best regards,

Alexey

On 9/29/17 4:04 AM, khan alam wrote:

Dear Alexey,

Please find the error message.Now tell me where is my mistake,even 
after following the step


Inline image 1

Regards
Khursheed




Re: impala + kudu

2017-09-26 Thread Alexey Serbin

That I don't know.  Apparently, if it worked for a different user,
there might be some issue with configuration of the target machine
or some intermittent network failure.  It doesn't seem to be an
issue with the Kudu quickstart VM, though.

I would suggest taking a look at troubleshooting/configuration guides 
specific to your local machine.

At least, make 'ping github.com' work at your machine for starters.


Best regards,

Alexey


On 9/26/17 9:49 AM, khan alam wrote:
Hi Alexey, the same machine, when downloading the same stuff as a 
different user, was fine. Why would this happen?


Regards
khursheed

On Sep 26, 2017 7:43 PM, "Alexey Serbin" <aser...@cloudera.com> wrote:


Hi Khursheed,

It seems the issue is with hostname resolution, at least. You need
to have Internet access
with DNS resolver properly configured at the machine where you run
those instructions.


Best regards,

Alexey


On 9/26/17 8:54 AM, khan alam wrote:

Dear Alexey,

Please find the input; the issue seems to be with github.

Regards
Khursheed





On Tue, Sep 26, 2017 at 7:31 AM, Alexey Serbin <aser...@cloudera.com> wrote:

Hi,

What instructions did you use to get quickstart Kudu VM?
It's recommended to use instruction at
https://kudu.apache.org/docs/quickstart.html

It's supposed the instructions will get you working VM up
and running.
At what step that failed and what was the error message?

It might be helpful to check the troubleshooting section
first:
https://kudu.apache.org/docs/quickstart.html#_footnote_1


Best regards,

Alexey


On 9/25/17 2:52 PM, khan alam wrote:

Dear Users,

Please help me with the basics of kudu installation. I am a
little lost and not sure how to proceed; I have the
virtual machine downloaded.

it does not have linux/ubuntu as mentioned in the kudu
apache
site.

Guidance from you all folks is much appreciated.

i believe there is a vm set up for impala with kudu
can some
one help me with that.

regards
Khursheed








Re: impala + kudu

2017-09-26 Thread Alexey Serbin

Hi Khursheed,

It seems the issue is with hostname resolution, at least.  You need to 
have Internet access
with DNS resolver properly configured at the machine where you run those 
instructions.



Best regards,

Alexey


On 9/26/17 8:54 AM, khan alam wrote:

Dear Alexey,

Please find the input; the issue seems to be with github.

Regards
Khursheed





On Tue, Sep 26, 2017 at 7:31 AM, Alexey Serbin <aser...@cloudera.com> wrote:


Hi,

What instructions did you use to get quickstart Kudu VM?
It's recommended to use instruction at
https://kudu.apache.org/docs/quickstart.html

It's supposed the instructions will get you working VM up and running.
At what step that failed and what was the error message?

It might be helpful to check the troubleshooting section first:
https://kudu.apache.org/docs/quickstart.html#_footnote_1


Best regards,

Alexey


On 9/25/17 2:52 PM, khan alam wrote:

Dear Users,

Please help me with the basics of kudu installation. I am a
little lost and not sure how to proceed; I have the
virtual machine downloaded.

it does not have linux/ubuntu as mentioned in the kudu apache
site.

Guidance from you all folks is much appreciated.

i believe there is a vm set up for impala with kudu can some
one help me with that.

regards
Khursheed







Re: impala + kudu

2017-09-25 Thread Alexey Serbin

Hi,

What instructions did you use to get quickstart Kudu VM?
It's recommended to use instruction at
  https://kudu.apache.org/docs/quickstart.html

It's supposed the instructions will get you working VM up and running.
At what step that failed and what was the error message?

It might be helpful to check the troubleshooting section first:
  https://kudu.apache.org/docs/quickstart.html#_footnote_1


Best regards,

Alexey

On 9/25/17 2:52 PM, khan alam wrote:

Dear Users,

Please help me with the basics of kudu installation. I am a little lost 
and not sure how to proceed; I have the virtual machine 
downloaded.


it does not have linux/ubuntu as mentioned in the kudu apache site.

Guidance from you all folks is much appreciated.

i believe there is a vm set up for impala with kudu can some one help 
me with that.


regards
Khursheed




Re: Configure Impala for Kudu on Separate Cluster

2017-08-15 Thread Alexey Serbin

Ben,

As Todd mentioned, it might be some network connectivity problem. I 
would suspect some issues with connectivity between the node where the 
Impala shell is running and the Kudu master node.


To start troubleshooting, I would verify that the node where you run the 
Impala shell (that's 172.35.120.191, right?) can establish a TCP 
connection to the master RPC end-point.  E.g., try to run from the 
command shell at 172.35.120.191:


  telnet 172.35.121.101 7051

Would it succeed?

Also, if running multi-master Kudu cluster, it might happen that masters 
cannot communicate with each other.  To troubleshoot that, I would try 
to establish a TCP connection to the RPC end-point of the master at one 
node from another master node.  E.g., if using telnet, from 
, in the command-line shell:


  telnet  7051
  (just substitute  and  with appropriate 
hostnames/IP addresses).




Best regards,

Alexey


On 8/15/17 9:53 PM, Benjamin Kim wrote:

Todd,

Caused by: org.apache.kudu.client.NoLeaderMasterFoundException: Master 
config (prod-dc1-datanode151.pdc1i.gradientx.com:7051) has no leader. 
Exceptions received: org.apache.kudu.client.RecoverableException: [Peer 
Kudu Master - prod-dc1-datanode151.pdc1i.gradientx.com:7051] Connection 
reset on [id: 0x6232f33f, /172.35.120.191:47848 :> /172.35.121.101:7051]


We got this error trying to use Kudu from within the cluster. Do you 
know what this means?


Cheers,
Ben


On Tue, Aug 15, 2017 at 12:40 AM Todd Lipcon wrote:


Is there a possibility that the remote node (prod-dc1-datanode151)
is firewalled off from whatever host you are submitting the query
to? The error message is admittedly pretty bad, but it basically
means it's getting "connection refused", indicating that either
there is no master running on that host or it has been blocked (eg
an iptables REJECT rule)

-Todd

On Mon, Aug 14, 2017 at 10:36 PM, Benjamin Kim wrote:

Hi Todd,

I tried to create a Kudu table using impala shell, and I got
this error.

create table my_first_table
(
  id bigint,
  name string,
  primary key(id)
)
partition by hash partitions 16
stored as kudu;
Query: create table my_first_table
(
  id bigint,
  name string,
  primary key(id)
)
partition by hash partitions 16
stored as kudu
ERROR: ImpalaRuntimeException: Error creating Kudu table
'impala::default.my_first_table'
CAUSED BY: NonRecoverableException: Too many attempts:
KuduRpc(method=ListTables, tablet=null, attempt=101,
DeadlineTracker(timeout=18, elapsed=178226), Traces: [0ms]
querying master, [1ms] Sub rpc: ConnectToMaster sending RPC to
server master-prod-dc1-datanode151.pdc1i.gradientx.com:7051
,
[2ms] Sub rpc: ConnectToMaster received from server
master-prod-dc1-datanode151.pdc1i.gradientx.com:7051

response Network error: [Peer
master-prod-dc1-datanode151.pdc1i.gradientx.com:7051
]
Connection closed, [5ms] delaying RPC due to Service
unavailable: Master config
(prod-dc1-datanode151.pdc1i.gradientx.com:7051
) has no
leader. Exceptions received:
org.apache.kudu.client.RecoverableException: [Peer
master-prod-dc1-datanode151.pdc1i.gradientx.com:7051
]
Connection closed, [21ms] querying master, [22ms] Sub rpc:
ConnectToMaster sending RPC to server
master-prod-dc1-datanode151.pdc1i.gradientx.com:7051
,
[24ms] Sub rpc: ConnectToMaster received from server
master-prod-dc1-datanode151.pdc1i.gradientx.com:7051

response Network error: [Peer
master-prod-dc1-datanode151.pdc1i.gradientx.com:7051
]
Connection closed, [26ms] delaying RPC due to Service
unavailable: Master config
(prod-dc1-datanode151.pdc1i.gradientx.com:7051
) has no
leader. Exceptions received:
org.apache.kudu.client.RecoverableException: [Peer

Re: tserver died by clock unsync.

2017-06-16 Thread Alexey Serbin

Hi Jason,

I think the workaround you mentioned (i.e. replacing LOG(FATAL) with 
LOG(WARNING) in the cited code snippet) is not safe at all.  If 
ntp_gettime() returns TIME_ERROR code, that means the 'now_usec' 
variable might be left uninitialized, and the code relying on the 
HybridClock::NowWithError() method would get some garbage instead of 
wall clock usec value.  That might lead to serious issues elsewhere up 
the chain, and it's hard to predict what would happen.  If you are 
lucky, a tserver will crash just later on, if not -- you'll get 
undefined behavior and data corruption which would be very hard to track 
and fix.


Instead of running your tservers with that unsafe change, I would 
recommend to track down the issue with the NTP in your cluster. Make 
sure there isn't other clock drives on your machines besides ntpd (e.g., 
make sure nobody runs ntpdate manually and ntpdate is not executed by a 
cron job, etc.).  If your local network experiences internet outages for 
long periods of time, one suggestion might be running NTP server on a 
stable machine (or two) within your local network.  Your local NTP 
servers would source time from 5-7 public NTP servers of stratum 2 or 3 
from the internet.  In their turn, the NTP servers at your Kudu nodes 
would use your internal NTP server(s) as a source. Also, it would make 
sense to take a look at some 'NTP best practice' guides you could find 
elsewhere on the Internet -- hopefully, you could find some ideas how to 
tailor those for you case.


Hope this helps.


Kind regards,

Alexey


On 6/16/17 1:59 AM, Jason Heo wrote:

Hi.

Congrat. Apache Kudu 1.4.0

To prevent the tserver from dying accidentally, I've changed LOG(FATAL) 
to LOG(WARNING).

I wanted to know whether it is safe to continue if ntp_gettime() in 
GetClockTime returns TIME_ERROR.


Could anyone can help me?

Regards,

Jason



2017-06-15 12:40 GMT+09:00 Jason Heo:


Hi,

I'm using Apache Kudu 1.4.0

Yesterday, 6 tservers die at the same time. Following message is
logged for each tserver.


F0614 14:58:32.868551 111454 hybrid_clock.cc:227] Couldn't get the 
current time: Clock unsynchronized. Status: Service unavailable: Error 
reading clock. Clock considered unsynchronized


We are already using ntpd, and in /var/log/messages, ntpd related
message is logged.

Jun 14 14:58:38 hostname ntpdate[10231]: step time server ip_addr
offset -0.000168 sec


We use our own ntp service. I don't know what's the exact reason,
but It's suspicious that our ntp service is malfunctioned or
network is not good temporarily.

The problem is that this could happen again and again.

So, I'm considering modifying source code of Kudu from LOG(FATAL)
to LOG(WARN) so that tserver does not exit on unsync.

  uint64_t now_usec;
  uint64_t error_usec;
  Status s = WalltimeWithError(&now_usec, &error_usec);
  if (PREDICT_FALSE(!s.ok())) {
    LOG(FATAL) << Substitute("Couldn't get the current time: Clock unsynchronized. "
                             "Status: $0", s.ToString());
  }



So, my question is: is it OK to modify LOG(FATAL) to LOG(WARN)
in the above code? And will this prevent the tserver from dying
when the clock is unsynced?

Thanks.

Jason,

Regard






Re: I got an "authentication token expired" error.

2017-06-14 Thread Alexey Serbin

Hi Jason,

It seems your Java Kudu client hit the authn token expiration issue.  As 
you mentioned, that's a well known issue and it is described in the 
docs.  Just FYI, the Kudu C++ client starting 1.4.0 automatically 
re-acquires authn token when needed, and I hope the Java client will do 
so as well in next release.  If you are interested in details, the issue 
is tracked by https://issues.apache.org/jira/browse/KUDU-2013 and is 
being actively worked on (there is a WIP patch for that).


As for a temporary workaround, you could try one the following:

  * Set authn token expiration time to some huge value, i.e. run the 
Kudu masters with custom value for the --auth_token_validity_seconds 
flag.  The default is 7 days (604800 seconds); you could try to set it 
to, say, 300 days: '--auth_token_validity_seconds=25920000'. That would 
be a good option if your use-case requires a secure Kudu cluster with 
authentication.  For this workaround, once Kudu masters are restarted 
with new flags, you also need to restart your Java clients to acquire a 
new token with longer TTL if you don't want them to hit the issue in one 
week.


  * Disable RPC authentication and encryption, i.e. run both Kudu 
masters and tablet servers with '--rpc_authentication=disabled 
--rpc_encryption=disabled' flags (you need to disable both authn and 
encryption).  That would be an option if your use-case does not require 
a secure Kudu cluster.  In this case you don't need to restart your Java 
clients once you restarted Kudu server-side components.
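
[Editor's note: as a concrete sketch, the two workarounds above as gflagfile fragments. The values are illustrative; 25,920,000 seconds is 300 days.]

```
# Workaround 1: master gflagfile -- extend the authn token TTL to ~300 days
--auth_token_validity_seconds=25920000

# Workaround 2: master AND tserver gflagfiles -- disable RPC authn/encryption
--rpc_authentication=disabled
--rpc_encryption=disabled
```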


Hope this helps.


Kind regards,

Alexey


On 6/14/17 2:44 AM, Jason Heo wrote:

Hi.

I'm using Apache Kudu 1.4.0.

And I have a long running Java Daemon which is a kudu client at the 
same time.


Today (7 days have passed since the Java Daemon was started) I 
suddenly got the following error.



W0614 15:29:11.934401 62459 negotiation.cc:310] Unauthorized 
connection attempt: Server connection negotiation failed: server 
connection from ip_addr:56604: authentication token expired


W0614 15:29:11.956845 62459 negotiation.cc:310] Unauthorized 
connection attempt: Server connection negotiation failed: server 
connection from ip_addr:56606: authentication token expired


...

...

...

W0614 17:47:18.347970 74099 negotiation.cc:310] Unauthorized 
connection attempt: Server connection negotiation failed: server 
connection from ip_addr:56172: authentication token expired


W0614 17:47:20.488306 74100 negotiation.cc:310] Unauthorized 
connection attempt: Server connection negotiation failed: server 
connection from ip_addr:56180: authentication token expired




Kudu is started with this options

--unlock_experimental_flags=true

...

--superuser_acl=user1,user2


The Java Daemon is started with user2 account.

How can I prevent this error from happening again?

I've read this manual. 
https://kudu.apache.org/docs/security.html#known-limitations


It says that "so long-lived clients in secure clusters are not 
supported". Then should I set `--rpc-authentication=disable`?


Thanks.

Regards,

Jason




Re: Help start kudu error: Bad status: Invalid argument: Tried to update clock beyond the max. error.

2017-05-02 Thread Alexey Serbin

Hi,

It seems the clock among the machines in the cluster is not synchronized 
as expected.  It might be because of NTP configuration issues.  There is 
some information to start troubleshooting with: 
http://kudu.apache.org/docs/troubleshooting.html#ntp


That error might appear during tablet bootstrap (so it might happen to 
both masters and tservers).


What is the output of the 'ntptime' command when run on the servers?  
Also, what is the output of 'ntpq -p localhost'?



Best regards,

Alexey
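
[Editor's note: a rough, hypothetical model of the check behind the "Tried to update clock beyond the max. error" / clock-unsynchronized failures. The function name and the 10-second default for --max_clock_sync_error_usec are assumptions for illustration, not Kudu's exact code.]

```python
MAX_CLOCK_SYNC_ERROR_USEC = 10_000_000  # assumed default: 10 seconds

class ClockUnsynchronizedError(RuntimeError):
    pass

def check_walltime(now_usec, error_usec,
                   max_error_usec=MAX_CLOCK_SYNC_ERROR_USEC):
    """Accept the NTP-reported time only if its error bound is tolerable."""
    if error_usec > max_error_usec:
        raise ClockUnsynchronizedError(
            "Service unavailable: clock considered unsynchronized "
            f"(error {error_usec} usec > max {max_error_usec} usec)")
    return now_usec

# An error bound of ~1.1 seconds (like "Current error: 1109553" in the
# master startup log quoted below) passes; a 15-second bound would fail.
assert check_walltime(1_500_000_000, 1_109_553) == 1_500_000_000
```

This also illustrates why raising --max_clock_sync_error_usec alone may not help: if NTP keeps reporting an error bound above the limit, startup keeps failing until the clocks are actually synchronized.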


On 5/2/17 12:12 AM,  wrote:
Since the kudu cluster machines were powered down, I need to restart 
kudu-master and kudu-tserver.
The cluster has three masters and three tservers; one of the masters and 
all three tservers fail to start with this error message: Bad status: 
Invalid argument: Tried to update clock beyond the max. error.
I tried setting max_clock_sync_error_usec larger, but I still get the 
same error.

I do not know what to do to solve it.
Kudu-master start log:

Log file created at: 2017/05/02 14:50:53
Running on machine: hadoopname01vl
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
I0502 14:50:53.479116  5474 master_main.cc:60] Master server 
non-default flags:

--fs_data_dirs=/app/kudu/master
--fs_wal_dir=/app/kudu/master
--master_addresses=hadoopname01vl:7051,hadoopdata04vl:7051,hadoopname02vl:7051
--max_clock_sync_error_usec=15
--heap_profile_path=/tmp/kudu-master.5474
--flagfile=/etc/kudu/conf/master.gflagfile
--fromenv=log_dir
--log_dir=/app/kudu/log
Master server version:
kudu 1.2.0-cdh5.10.0
revision 01748528baa06b78e04ce9a799cc60090a821162
build type RELEASE
built by jenkins at 23 Jan 2017 23:49:02 PST on 
kudu-centos66-17b9.vpc.cloudera.com

build id 2017-01-23_23-14-17
I0502 14:50:53.479230  5474 mem_tracker.cc:140] MemTracker: hard 
memory limit is 2.988239 GB
I0502 14:50:53.479236  5474 mem_tracker.cc:142] MemTracker: soft 
memory limit is 1.792943 GB
I0502 14:50:53.480358  5474 master_main.cc:67] Initializing master 
server...
I0502 14:50:53.480466  5474 hybrid_clock.cc:177] HybridClock 
initialized. Resolution in nanos?: 1 Wait times tolerance adjustment: 
1.0005 Current error: 1109553
I0502 14:50:53.481259  5474 env_posix.cc:1284] Not raising process 
file limit of 131072; it is already as high as it can go
I0502 14:50:53.481281  5474 file_cache.cc:401] Constructed file cache 
lbm with capacity 65536
I0502 14:50:53.482020  5474 log_block_manager.cc:1336] Data dir 
/app/kudu/master/data is on an ext4 filesystem vulnerable to KUDU-1508 
with block size 4096
I0502 14:50:53.482035  5474 log_block_manager.cc:1346] Limiting 
containers on data directory /app/kudu/master/data to 2721 blocks
I0502 14:50:53.484666  5474 fs_manager.cc:251] Opened local 
filesystem: /app/kudu/master

uuid: "4811dfb33ff444d2b3416d7bbe3c9a38"
format_stamp: "Formatted at 2017-02-20 07:35:54 on hadoopname01vl"
I0502 14:50:53.501610  5474 master_main.cc:70] Starting Master server...
I0502 14:50:53.505748  5474 rpc_server.cc:164] RPC server started. 
Bound to: 0.0.0.0:7051
I0502 14:50:53.505798  5474 webserver.cc:126] Starting webserver on 
0.0.0.0:8051
I0502 14:50:53.505807  5474 webserver.cc:131] Document root: 
/usr/lib/kudu/www
I0502 14:50:53.505928  5474 webserver.cc:221] Webserver started. Bound 
to: http://0.0.0.0:8051/
I0502 14:50:53.506609  5543 sys_catalog.cc:119] Verifying existing 
consensus state
I0502 14:50:53.507067  5543 tablet_bootstrap.cc:381] T 
 P 4811dfb33ff444d2b3416d7bbe3c9a38: 
Bootstrap starting.
I0502 14:50:53.507866  5543 tablet_bootstrap.cc:540] T 
 P 4811dfb33ff444d2b3416d7bbe3c9a38: 
Time spent opening tablet: real 0.001s  user 0.000s sys 0.000s
I0502 14:50:53.507894  5543 tablet_bootstrap.cc:560] T 
 P 4811dfb33ff444d2b3416d7bbe3c9a38: 
Previous recovery directory found at 
/app/kudu/master/wals/.recovery: 
Replaying log files from this location instead of 
/app/kudu/master/wals/
I0502 14:50:53.507917  5543 tablet_bootstrap.cc:567] T 
 P 4811dfb33ff444d2b3416d7bbe3c9a38: 
Deleting old log files from previous recovery attempt in 
/app/kudu/master/wals/
I0502 14:50:53.509835  5543 log_util.cc:316] Log segment 
/app/kudu/master/wals/.recovery/wal-1 
has no footer. This segment was likely being written when the server 
previously shut down.
I0502 14:50:53.509851  5543 log_reader.cc:160] Log segment 
/app/kudu/master/wals/.recovery/wal-1 
was likely left in-progress after a previous crash. Will try to 
rebuild footer by scanning data.
I0502 14:50:53.548249  5543 log_util.cc:570] Scanning 
/app/kudu/master/wals/.recovery/wal-1 
for valid entry headers following offset 7156830...

I0502 14:50:53.564885  5543 log_util.cc:607] Found no log 

Re: Security Roadmap

2017-03-18 Thread Alexey Serbin
You can get some information on security-related features in upcoming 
Kudu 1.3.0 release at 
https://github.com/apache/kudu/blob/master/docs/release_notes.adoc#rn_1.3.0_new_features


In the long run, there are plans to add fine-grained authorization (ACLs 
for table/column-level access, ACLs for different operations, etc.).  
Integration with Apache Sentry might be an option as well.  As usual, no 
promises on exactly when or in which release this is going to happen.



Best regards,

Alexey

On 3/18/17 10:15 AM, Benjamin Kim wrote:

I’m curious as to what security features we can expect coming in the near and 
far future for Kudu. If there is some documentation for this, please let me 
know.

Cheers,
Ben





Re: [Benchmarking]

2017-03-14 Thread Alexey Serbin
On Tue, Mar 14, 2017 at 11:25 AM, Alexey Serbin <aser...@cloudera.com>
wrote:

> Hi,
>
> It seems that sort of benchmark is not a trivial undertaking.  I'm sure
> there is a lot to consider while doing it.  Probably,
> more senior members of the Kudu team could suggest something else, but
> right away I can suggest the following:
>
> 1. Consider using real hardware machines while doing the benchmark, not
> VMs.  Make sure the databases store their data on the same media when doing
> the comparison.
>
> 2. Make sure your benchmark schema is supported by both Kudu and
> PostgreSQL.  Probably, to perform the benchmark you would need to tweak
> your existing schema a little bit.  Kudu supports a subset of the types
> available in PostgreSQL.  Also, pay attention to primary keys/indices and
> partitions if you're running read/scan comparisons. Overall, in this context it's worth
> reading this document first: https://kudu.apache.org/docs/
> schema_design.html
>
> 3. Kudu is supposed to shine when working with huge amounts of data spread
> across multiple machines in a cluster.  Are you about to use a clustered
> setup for PostgreSQL as well?  It may be worth trying a clustered
> setup for PostgreSQL.
>
> 4. While creating Kudu tables, use just a single replica -- additional
> replicas add some latency for write operations because the write operation
> is considered successful only when by majority of existing replicas.  Also,
> since I didn't see
>

Oops, something happened with those words.  I meant:

... only when acknowledged by the majority of existing replicas.  I'm
suggesting to use just a single replica since I didn't see anything
mentioned about replication for the PostgreSQL setup.
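To make the majority point concrete: with N replicas, a Raft write completes only after floor(N/2) + 1 of them have persisted it, so every extra replica adds work that can land on the write's critical path. A tiny illustration:

```python
def majority(num_replicas):
    """Smallest number of replica acks that completes a Raft write."""
    return num_replicas // 2 + 1


for n in (1, 3, 5):
    print(f"{n} replica(s): write waits for {majority(n)} ack(s)")
# 1 replica(s): write waits for 1 ack(s)
# 3 replica(s): write waits for 2 ack(s)
# 5 replica(s): write waits for 3 ack(s)
```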

> 5. Consider placing WAL for both Kudu and PostgreSQL on an SSD -- this
> lowers latencies for DML operations.  I know that's so at least for Kudu,
> and I would expect that's true for PostgreSQL as well.
>
> 6. Pay some attention to run-time resource limits in effect while running
> those benchmarks:
>   https://www.postgresql.org/docs/9.6/static/runtime-config-resource.html
>   https://kudu.apache.org/docs/configuration_reference.html (search for
> flags containing 'memory' and 'cache_size' in their names)
>
>
> As for inserting your existing data into Kudu, consider using Impala:
> https://kudu.apache.org/docs/kudu_impala_integration.html
>
>
> Best regards,
>
> Alexey
>
> On Tue, Mar 14, 2017 at 8:01 AM, paulo faria <ziko...@hotmail.com> wrote:
>
>> HI
>>
>>
>> I'm doing a benchmark of Kudu (and other time-series DBs) versus PostgreSQL
>> 9.6.
>> I've done your VM demo tutorial already.
>>
>>
>> But now I would like to compare those two. I already have the PostgreSQL
>> environment set up (with some tables + data (1 GB per table to test)) on a
>> remote server.
>> 1) What is your advice for a query (reads) performance comparison?
>> 2) Is there any way to convert (or migrate) the Postgres structure to Kudu?
>> I have my database on Hue/Impala, so I can query over there and also
>> download the data from there.
>>
>>
>> Any tips are appreciated.
>>
>> Best Regards
>>
>> Paulo Faria
>>
>>
>


Re: [Benchmarking]

2017-03-14 Thread Alexey Serbin
Hi,

It seems that sort of benchmark is not a trivial undertaking.  I'm sure
there is a lot to consider while doing it.  Probably,
more senior members of the Kudu team could suggest something else, but
right away I can suggest the following:

1. Consider using real hardware machines while doing the benchmark, not
VMs.  Make sure the databases store their data on the same media when doing
the comparison.

2. Make sure your benchmark schema is supported by both Kudu and
PostgreSQL.  Probably, to perform the benchmark you would need to tweak
your existing schema a little bit.  Kudu supports a subset of the types
available in PostgreSQL.  Also, pay attention to primary keys/indices and
partitions if you're running read/scan comparisons. Overall, in this context it's worth
reading this document first: https://kudu.apache.org/docs/schema_design.html

3. Kudu is supposed to shine when working with huge amounts of data spread
across multiple machines in a cluster.  Are you about to use a clustered
setup for PostgreSQL as well?  It may be worth trying a clustered
setup for PostgreSQL.

4. While creating Kudu tables, use just a single replica -- additional
replicas add some latency for write operations because the write operation
is considered successful only when by majority of existing replicas.  Also,
since I didn't see

5. Consider placing WAL for both Kudu and PostgreSQL on an SSD -- this
lowers latencies for DML operations.  I know that's so at least for Kudu,
and I would expect that's true for PostgreSQL as well.
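On the Kudu side this boils down to where the WAL and data directories point in each daemon's gflagfile; for example (the mount paths below are just placeholders for an SSD and HDD layout):

```
--fs_wal_dir=/mnt/ssd0/kudu/wal
--fs_data_dirs=/mnt/hdd0/kudu/data,/mnt/hdd1/kudu/data
```

Both masters and tablet servers accept these flags; they must be set before the first start, since the directories are formatted on first run.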

6. Pay some attention to run-time resource limits in effect while running
those benchmarks:
  https://www.postgresql.org/docs/9.6/static/runtime-config-resource.html
  https://kudu.apache.org/docs/configuration_reference.html (search for
flags containing 'memory' and 'cache_size' in their names)
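For instance, on the Kudu side the memory budget and block cache size can be pinned in the gflagfile so both systems get comparable resources (the values below are just placeholders):

```
--memory_limit_hard_bytes=4294967296
--block_cache_capacity_mb=512
```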


As for inserting your existing data into Kudu, consider using Impala:
https://kudu.apache.org/docs/kudu_impala_integration.html
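For instance, once the PostgreSQL data has been landed in an Impala-visible table (e.g. as Parquet), the migration can look roughly like this (the table and column names are placeholders, and the exact DDL syntax depends on your Impala version -- see the doc above):

```sql
-- Create a Kudu-backed table via Impala; PRIMARY KEY and PARTITION BY
-- are required for Kudu tables.
CREATE TABLE metrics_kudu (
  host STRING,
  ts BIGINT,
  value DOUBLE,
  PRIMARY KEY (host, ts)
)
PARTITION BY HASH (host) PARTITIONS 8
STORED AS KUDU;

-- Copy the existing data over.
INSERT INTO metrics_kudu
SELECT host, ts, value FROM metrics_parquet;
```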


Best regards,

Alexey

On Tue, Mar 14, 2017 at 8:01 AM, paulo faria  wrote:

> HI
>
>
> I'm doing a benchmark of Kudu (and other time-series DBs) versus PostgreSQL
> 9.6.
> I've done your VM demo tutorial already.
>
>
> But now I would like to compare those two. I already have the PostgreSQL
> environment set up (with some tables + data (1 GB per table to test)) on a
> remote server.
> 1) What is your advice for a query (reads) performance comparison?
> 2) Is there any way to convert (or migrate) the Postgres structure to Kudu?
> I have my database on Hue/Impala, so I can query over there and also
> download the data from there.
>
>
> Any tips are appreciated.
>
> Best Regards
>
> Paulo Faria
>
>


Re: What does RowSet Compaction Duration means?

2017-03-14 Thread Alexey Serbin
Hi Jason,

As I understand, that cryptic unit 'milliseconds / second' means 'number of
units per sampling (or averaging) interval'.

I.e., they capture the metric's reading (expressed in milliseconds) every
second, subtract the previous value from the current value, and declare the
result as the measurement at the current time.  If not capturing every
second, then it's measuring every X seconds, subtracting the previous
measurement from the current one, and dividing by X.
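In other words, the chart plots a discrete derivative of a cumulative counter. A tiny sketch of the computation (the sample readings below are made up):

```python
def rate_per_second(samples):
    """Turn cumulative (timestamp_s, total_ms) readings into
    'milliseconds / second' rates, the way the charts do."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rates.append((v1 - v0) / (t1 - t0))
    return rates


# Cumulative compact_rs_duration readings sampled every 10 seconds.
samples = [(0, 0), (10, 12000), (20, 61500)]
print(rate_per_second(samples))  # [1200.0, 4950.0]
```

A rate like 4950 ms/s is only possible because the metric is summed across several replicas, each contributing up to 1000 ms of compaction time per second.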

For a single tablet, the 'compact_rs_duration' metric stands for 'Time
spent compacting RowSets'.  As I understand,
'total_kudu_compact_rs_duration_sum_rate_across_kudu_replicas' is the
sum/accumulation of those measurements for all existing replicas of the
specified tablet across the Kudu cluster.

I suspect you have a replication factor of 5 for that tablet, and at some
point all replicas become busy with rowset compaction all the time.
Compactions on tables run in the background, and compactions on different
tables run independently.  So, if you have some other activity doing
inserts/updates on tableB, then it's natural to see compaction happen on
tableB as well.


Best regards,

Alexey

On Tue, Mar 14, 2017 at 12:50 AM, Jason Heo  wrote:

> Hi.
>
> I'm stuck with performance degradation when compaction happens.
>
> My Duration is "4956.71 milliseconds / second". What does this mean? I
> can't figure it out.
>
> Here is the captured image: http://imgur.com/WU9sRRq
>
> When I'm doing bulk indexing on tableA, sometimes compaction happens on
> tableB. Is this situation natural?
>
> Thanks.
>