Hi Todd,
Thanks for the reply! I used a single Kafka consumer to pull the data.
For Kudu, I was doing something very simple that basically just follows the
example here:
<https://github.com/cloudera/kudu-examples/blob/master/java/java-sample/src/main/java/org/kududb/examples/sample/Sample.java>
Specifically:
while (true) {  // one iteration per record pulled from Kafka
    Insert insert = kuduTable.newInsert();
    PartialRow row = insert.getRow();
    // fill the columns
    kuduSession.apply(insert);
}
I didn't specify the flush mode, so I assume it defaults to AUTO_FLUSH_SYNC?
Should I use MANUAL_FLUSH instead?
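To make the question concrete, here is a rough sketch of why the flush mode might matter. It is a toy model only and does not call Kudu: the class FlushRoundTrips and its methods are hypothetical names, and it assumes each flush costs one round trip whose latency does not grow with batch size, which is optimistic. In the real client, the mode would be set with KuduSession.setFlushMode(SessionConfiguration.FlushMode.MANUAL_FLUSH) and batches sent with KuduSession.flush().

```java
// Toy model only: counts client-to-server round trips; it does not use the Kudu API.
class FlushRoundTrips {
    // Round trips needed to write `records` rows when `batchSize` rows go out per flush.
    // batchSize == 1 models AUTO_FLUSH_SYNC, where every apply() waits for the server.
    static long roundTrips(long records, long batchSize) {
        if (records < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("records must be >= 0 and batchSize > 0");
        }
        return (records + batchSize - 1) / batchSize; // ceiling division
    }

    // Ceiling on rows/sec for one synchronous writer, given an assumed latency per round trip.
    // Ignores server-side work, so real throughput would be lower.
    static double maxRowsPerSecond(long batchSize, double roundTripMillis) {
        if (batchSize <= 0 || roundTripMillis <= 0) {
            throw new IllegalArgumentException("batchSize and roundTripMillis must be > 0");
        }
        return batchSize * 1000.0 / roundTripMillis;
    }

    public static void main(String[] args) {
        long records = 30_000; // roughly one second of the Kafka input rate
        System.out.println("round trips, 1 row per flush:    " + roundTrips(records, 1));
        System.out.println("round trips, 1000 rows per flush: " + roundTrips(records, 1000));
        // Assumption: 1500 rows/sec at one row per trip implies about 0.67 ms per trip.
        double assumedMillis = 1000.0 / 1500.0;
        System.out.println("ceiling rows/sec, 1 row per flush: "
                + maxRowsPerSecond(1, assumedMillis));
    }
}
```

If that model is roughly right, the 1.5K / sec I see would be explained by one round trip per row, and batching rows per flush should cut the number of round trips proportionally.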
Thanks,
Chao
On Mon, Oct 30, 2017 at 10:39 PM, Todd Lipcon <[email protected]> wrote:
> Hey Chao,
>
> Nice to hear you are checking out Kudu.
>
> What are you using to consume from Kafka and write to Kudu? Is it possible
> that it is Java code and you are using the SYNC flush mode? That would
> result in a separate round trip for each record and thus very low
> throughput.
>
> Todd
>
> On Oct 30, 2017 10:23 PM, "Chao Sun" <[email protected]> wrote:
>
> Hi,
>
> We are evaluating Kudu (version kudu 1.3.0-cdh5.11.1, revision
> af02f3ea6d9a1807dcac0ec75bfbca79a01a5cab) on an 8-node cluster.
> The data are coming from Kafka at a rate of around 30K / sec, and hash
> partitioned into 128 buckets. However, with default settings, Kudu can only
> consume the topics at a rate of around 1.5K / second. This is a direct
> ingest with no transformation on the data.
>
> Could this be because I was using the default configurations? Also, we are
> using Kudu on HDD - could that also be related?
>
> Any help would be appreciated. Thanks.
>
> Best,
> Chao
>
>
>