Hi Todd,

Thanks for the reply! I used a single Kafka consumer to pull the data.
For Kudu, I was doing something very simple that basically just follows the
example here
<https://github.com/cloudera/kudu-examples/blob/master/java/java-sample/src/main/java/org/kududb/examples/sample/Sample.java>
.
Specifically:

loop {
  Insert insert = kuduTable.newInsert();
  PartialRow row = insert.getRow();
  // fill the columns
  kuduSession.apply(insert);
}

I didn't specify the flush mode, so I assume it picks up AUTO_FLUSH_SYNC
as the default?
Should I use MANUAL_FLUSH instead?
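
A rough back-of-envelope sketch of why the flush mode could matter here,
assuming each flush costs one blocking round trip and that the round-trip
time stays the same for small batches (a simplification; larger flushes
will take longer). The class and method names are made up for illustration
and are not part of the Kudu client. The only numbers taken from this
thread are the observed 1.5K rows/sec and the 30K rows/sec input rate; the
batch sizes are hypothetical.

```java
public class FlushModeEstimate {

    /** Round-trip time implied by an observed rate when every row is flushed on its own. */
    public static double impliedRoundTripMillis(double observedRowsPerSecond) {
        return 1000.0 / observedRowsPerSecond;
    }

    /** Upper bound on rows/sec for one writer if each flush is a single blocking round trip. */
    public static double rowsPerSecond(double roundTripMillis, int rowsPerFlush) {
        return rowsPerFlush * 1000.0 / roundTripMillis;
    }

    public static void main(String[] args) {
        // 1.5K rows/sec is the ingest rate reported in this thread.
        double rtt = impliedRoundTripMillis(1500.0);
        System.out.printf("implied round trip: %.3f ms%n", rtt);

        // Hypothetical batch sizes, assuming the round trip does not grow with the batch.
        for (int batch : new int[] {1, 20, 100}) {
            System.out.printf("rows per flush = %d -> at most %.0f rows/sec%n",
                    batch, rowsPerSecond(rtt, batch));
        }
    }
}
```

Under these assumptions, flushing about 20 rows at a time would already
match the 30K / sec input rate. In the Kudu Java client the mode is set
with KuduSession.setFlushMode(...); with MANUAL_FLUSH the caller has to
call flush() explicitly, while AUTO_FLUSH_BACKGROUND batches and flushes in
the background.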

Thanks,
Chao

On Mon, Oct 30, 2017 at 10:39 PM, Todd Lipcon <[email protected]> wrote:

> Hey Chao,
>
> Nice to hear you are checking out Kudu.
>
> What are you using to consume from Kafka and write to Kudu? Is it possible
> that it is Java code and you are using the SYNC flush mode? That would
> result in a separate round trip for each record and thus very low
> throughput.
>
> Todd
>
> On Oct 30, 2017 10:23 PM, "Chao Sun" <[email protected]> wrote:
>
> Hi,
>
> We are evaluating Kudu (version kudu 1.3.0-cdh5.11.1, revision
> af02f3ea6d9a1807dcac0ec75bfbca79a01a5cab) on an 8-node cluster.
> The data are coming from Kafka at a rate of around 30K / sec, and hash
> partitioned into 128 buckets. However, with default settings, Kudu can only
> consume the topics at a rate of around 1.5K / second. This is a direct
> ingest with no transformation on the data.
>
> Could this be because I was using the default configurations? Also, we are
> using Kudu on HDD - could that also be related?
>
> Any help would be appreciated. Thanks.
>
> Best,
> Chao
>
>
>
