Arvindan,

When you had the MxN case with 100 Kafka consumers sending to 120 Parquet writers, what was the CPU utilization of the Parquet containers? Was it close to 100%, or did you have spare cycles? I am trying to determine whether it is an I/O bottleneck or a processing bottleneck.
Thanks

On Thu, Dec 22, 2016 at 10:50 AM, Arvindan Thulasinathan <[email protected]> wrote:

> Hi,
> We have an Apex application with a DAG structure like this:
>
> KafkaConsumer —> ParquetWriter
>
> The KafkaConsumer is running at a scale where we have 100 containers for
> the consumer, consuming from a Kafka cluster with an incoming rate of 300K
> msg/sec, where each message is about 1 KB (each message is a highly nested
> Avro message). We arrived at the 100-container number for the consumer in
> order to keep up with the incoming traffic. The ParquetWriter is a bit CPU
> intensive in our case, and we thought we might require about 120-130
> containers.
>
> We have 3 different observations here:
>
> 1. Using 100 KafkaConsumers and 120 ParquetWriters without partition
> parallelism:
> In this case, Apex automatically introduces a pass-through unifier. In
> this scenario, we have seen that the ParquetWriter invariably processes
> tuples at a lower rate than the KafkaConsumer's emit rate. That is, if the
> consumer emits tuples at a rate of 20 million/sec, the ParquetWriter will
> only write at a rate of 17 million/sec. It also invariably leads to
> backpressure and makes the consumer consume at a lower rate. I have tried
> going beyond 120 containers as well. I believe a possible reason could be
> that the unifier and the writer are in the same container and presumably
> share the same core, and hence are slower. Is this observation correct? I
> tried tuning by increasing the writer input port's QUEUE_SIZE to 10K. The
> queue is not even getting half full, but backpressure is still created on
> the consumer. Is there any additional tuning that I can do to:
> A. Make the writer process tuples at almost the same pace as the consumer,
> without backpressure on the consumer?
>
> 2. Using partition parallelism with 120 KafkaConsumers and 120
> ParquetWriters:
> In this scenario as well, we have seen that the ParquetWriter processes
> tuples at a lower rate than the KafkaConsumer's emit rate.
> That is, if the consumer emits tuples at a rate of 20 million/sec, the
> ParquetWriter will only write at a rate of 19 million/sec. This behavior
> holds even if we keep increasing the consumers and writers to 130 or 140
> containers. We believe this is because the writer is a bit CPU intensive.
>
> 3. Using a different DAG structure like this:
>
> KafkaConsumer —> ParquetWriter1
>       |
>       |——————> ParquetWriter2
>
> In this case, ParquetWriter1 and ParquetWriter2 are the same writer, but
> each has a partition-parallel relationship with the KafkaConsumer. Each
> group has 100 containers, so 300 containers in total. The KafkaConsumer
> sends one half of the messages to ParquetWriter1 and the other half to
> ParquetWriter2. That is, of the 20 million/sec emitted by the consumer, 10
> million/sec is handled by each writer group. This setup works fine for our
> traffic. But where we would otherwise have required only 20-30 additional
> writer containers, we are using 100, because partition parallelism seems
> to avoid all the slowness that we faced in approach [1].
>
> We wanted to know if there is a way to optimize approach [1] or [2] and
> hence use fewer containers than approach [3] requires.
>
> Thanks,
> Aravindan
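[Editor's note] For readers following along: the two knobs discussed above (the input port queue size and the partition-parallel relationship) are typically set as Apex port attributes in the application's properties.xml. A minimal sketch, assuming the operator is named ParquetWriter and its input port is named input (both names are assumptions here; in Apex's PortContext the buffer-size attribute is QUEUE_CAPACITY, which appears to be what "QUEUE_SIZE" refers to in the thread):

```xml
<!-- Sketch only: "ParquetWriter" and "input" are assumed names; substitute
     the operator and port names from your own DAG. -->

<!-- Raise the writer's input port buffer to 10K tuples -->
<property>
  <name>dt.operator.ParquetWriter.port.input.attr.QUEUE_CAPACITY</name>
  <value>10000</value>
</property>

<!-- Make the writer partition-parallel with its upstream operator, so each
     consumer partition feeds its own writer partition and no unifier is
     inserted (this is what approaches [2] and [3] rely on) -->
<property>
  <name>dt.operator.ParquetWriter.port.input.attr.PARTITION_PARALLEL</name>
  <value>true</value>
</property>
```

The same attributes can also be set programmatically when building the DAG; the properties.xml form just avoids a recompile while experimenting with sizes.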
