Hello Simon,

Yes, that is what it looks like, but I’ve set topology.flush.tuple.freq.millis 
to 10 and nothing changed. Moreover, if I send, say, 20000 lines, they still 
take a long time to be fully processed.




From: Simon Elliston Ball [mailto:[email protected]]
Sent: Thursday, May 16, 2019 11:20
To: [email protected]
Subject: Re: Very low throuput on topologies

If you’re sending low volumes to test, you may be waiting on the batch timeout: 
the batch never fills up enough to be flushed by volume, so it is only flushed 
when the timeout expires, which may explain your latency.
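
To illustrate the size-or-timeout flushing described above, here is a minimal toy model in Python. It is not Metron or Storm code: the `BatchingWriter` class, the `simulate` helper, the batch size of 10 and the 15 s timeout are all hypothetical values chosen only to show why a handful of test messages can sit in a partial batch until the timeout fires.

```python
from typing import List


class BatchingWriter:
    """Toy writer: flushes when the batch is full OR its oldest message is too old.

    Hypothetical model for illustration only; not the Metron writer.
    """

    def __init__(self, batch_size: int, batch_timeout_s: float) -> None:
        self.batch_size = batch_size
        self.batch_timeout_s = batch_timeout_s
        self._queued_at: List[float] = []  # arrival time of each pending message

    def write(self, now: float) -> List[float]:
        """Queue one message; return per-message latencies if the batch filled up."""
        self._queued_at.append(now)
        if len(self._queued_at) >= self.batch_size:
            return self._flush(now)
        return []

    def tick(self, now: float) -> List[float]:
        """Periodic timer: flush a partial batch once its oldest message timed out."""
        if self._queued_at and now - self._queued_at[0] >= self.batch_timeout_s:
            return self._flush(now)
        return []

    def _flush(self, now: float) -> List[float]:
        latencies = [now - queued_at for queued_at in self._queued_at]
        self._queued_at = []
        return latencies


def simulate(n_messages: int, batch_size: int, batch_timeout_s: float,
             tick_s: float = 1.0) -> List[float]:
    """Send n_messages at t=0, then tick until all are flushed; return latencies."""
    writer = BatchingWriter(batch_size, batch_timeout_s)
    latencies: List[float] = []
    for _ in range(n_messages):
        latencies += writer.write(0.0)
    now = 0.0
    while len(latencies) < n_messages:
        now += tick_s
        latencies += writer.tick(now)
    return latencies


# Two test lines never fill a batch of 10, so both wait for the timeout.
low_volume = simulate(n_messages=2, batch_size=10, batch_timeout_s=15.0)
# Twenty-five lines: two full batches leave at once, the last 5 wait for the timeout.
mixed_volume = simulate(n_messages=25, batch_size=10, batch_timeout_s=15.0)
```

In this model the latency of a low-volume test is set entirely by the timeout, while larger bursts leave in full batches with a slow tail for the remainder.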

Simon

On Thu, 16 May 2019 at 10:04, 
<[email protected]<mailto:[email protected]>> wrote:
Hello Michael,

So, using curl and the API, I’ve been able to collect some statistics. 
Currently, it is a test platform with almost no activity. I’ve set up a basic 
parser, with the following topology:

-          6 ackers (I have 6 Kafka partitions per topic)
-          Spout parallelism = 6
-          Spout # of tasks = 6
-          Parser parallelism = 24
-          Parser # of tasks = 24

I inject one line of logs with Nifi on my sensor topic. As a reminder, it needs 
roughly 10 s to be visible on the enrichments topic. Here are some statistics:

  "spouts": [
    {
      "emitted": 1160,
      "spoutId": "kafkaSpout",
      "requestedMemOnHeap": 128,
      "errorTime": null,
      "tasks": 6,
      "errorHost": "",
      "failed": 0,
      "completeLatency": "3963.078",
      "executors": 6,
      "encodedSpoutId": "kafkaSpout",
      "transferred": 1160,
      "errorPort": "",
      "requestedMemOffHeap": 0,
      "errorLapsedSecs": null,
      "acked": 1020,
      "requestedCpu": 10,
      "lastError": "",
      "errorWorkerLogLink": ""
    }

This completeLatency looks very high, doesn’t it?

And for bolt:
    {
      "emitted": 0,
      "requestedMemOnHeap": 128,
      "errorTime": null,
      "tasks": 12,
      "errorHost": "",
      "failed": 0,
      "boltId": "parserBolt",
      "executors": 12,
      "processLatency": "832.962",
      "executeLatency": "1.391",
      "transferred": 0,
      "errorPort": "",
      "requestedMemOffHeap": 0,
      "errorLapsedSecs": null,
      "acked": 3680,
      "requestedCpu": 10,
      "encodedBoltId": "parserBolt",
      "lastError": "",
      "executed": 3680,
      "capacity": "0.003",
      "errorWorkerLogLink": ""
    }
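
As a rough way to read the two statistics blocks above, the following Python sketch compares the latency figures with each other. The numbers are copied from the quoted JSON; the `summarize` helper and its key names are hypothetical, and the interpretation (most of the time is spent waiting for the ack, not executing) is an assumption based on the usual meaning of these Storm metrics, not a confirmed diagnosis.

```python
# Relevant fields copied from the quoted kafkaSpout and parserBolt statistics.
# Storm reports latencies as strings, in milliseconds.
SPOUT = {"spoutId": "kafkaSpout", "emitted": 1160, "acked": 1020,
         "completeLatency": "3963.078"}
BOLT = {"boltId": "parserBolt", "executed": 3680, "acked": 3680,
        "processLatency": "832.962", "executeLatency": "1.391",
        "capacity": "0.003"}


def summarize(spout: dict, bolt: dict) -> dict:
    """Derive a few comparison figures from one spout and one bolt entry."""
    complete_ms = float(spout["completeLatency"])
    process_ms = float(bolt["processLatency"])
    execute_ms = float(bolt["executeLatency"])
    return {
        # Tuples emitted by the spout but not yet acked in this window.
        "in_flight": spout["emitted"] - spout["acked"],
        # Time between the end of execute() and the ack, per tuple on average.
        "ack_wait_ms": process_ms - execute_ms,
        # Share of the bolt's process latency that is not execute() time.
        "ack_wait_share": (process_ms - execute_ms) / process_ms,
        # How many times larger the end-to-end latency is than the execute time.
        "complete_over_execute": complete_ms / execute_ms,
    }


summary = summarize(SPOUT, BOLT)
for name, value in summary.items():
    print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```

With a capacity of 0.003 and an execute latency close to 1 ms, the bolt itself looks almost idle; nearly all of the process latency is time spent between executing a tuple and acking it.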

So, my understanding is that it takes a lot of time to ack tuples, but I don’t 
know where to go from here. As said below, I’ve tried the tweaks mentioned here: 
https://github.com/apache/storm/blob/master/docs/Performance.md but nothing 
changed. It looks as if we are trying to fill a bucket, and the data is only 
sent after a given timeout if the bucket is not full. But I don’t see any 
timeout of around 10 or 20 seconds in the Storm configuration.
As a reminder, I have Kerberos enabled on my platform, but everything seems to 
work fine except Metron ingestion.

Thanks for your help,

Stéphane

From: Michael Miklavcic 
[mailto:[email protected]<mailto:[email protected]>]
Sent: Wednesday, May 15, 2019 16:03
To: [email protected]<mailto:[email protected]>
Subject: Re: Very low throuput on topologies

You could use curl from the cli. But if this is something you're testing out on 
your local machine, I'd probably start without Kerberos enabled and work the 
perf knobs there first. You should be able to see the "complete latency" from 
the Storm UI on each running topology.
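
For reference, the same figures can be read from the Storm UI REST API with a short Python script. This is a sketch under assumptions: the base URL is hypothetical, the helper names are made up for illustration, and the plain `urlopen` call only works against a UI without authentication. On a Kerberized cluster the request would need SPNEGO, for example curl with `--negotiate -u :`.

```python
import json
from typing import List, Optional, Tuple
from urllib.parse import quote
from urllib.request import urlopen


def topology_url(base_url: str, topology_id: Optional[str] = None) -> str:
    """Build the Storm UI REST URL for the topology list or for one topology."""
    base = base_url.rstrip("/") + "/api/v1/topology"
    if topology_id is None:
        return base + "/summary"
    return base + "/" + quote(topology_id, safe="")


def fetch_json(url: str, timeout_s: float = 10.0) -> dict:
    """GET a URL and decode the JSON body (no authentication handled here)."""
    with urlopen(url, timeout=timeout_s) as response:
        return json.load(response)


def latency_report(topology: dict) -> List[Tuple[str, str, float]]:
    """Extract (component, metric, milliseconds) rows from a topology response."""
    rows: List[Tuple[str, str, float]] = []
    for spout in topology.get("spouts", []):
        rows.append((spout["spoutId"], "completeLatency",
                     float(spout["completeLatency"])))
    for bolt in topology.get("bolts", []):
        for metric in ("processLatency", "executeLatency"):
            rows.append((bolt["boltId"], metric, float(bolt[metric])))
    return rows


# Example usage (hypothetical host and port; adjust to your Storm UI):
#   base = "http://storm-ui.example.com:8744"
#   for topo in fetch_json(topology_url(base))["topologies"]:
#       detail = fetch_json(topology_url(base, topo["id"]))
#       for component, metric, ms in latency_report(detail):
#           print(topo["name"], component, metric, ms)
```

Keeping the parsing in `latency_report` separate from the HTTP call means the same function can also be fed JSON saved from a curl command.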

On Wed, May 15, 2019 at 1:33 AM 
<[email protected]<mailto:[email protected]>> wrote:
Hello Nick,

Thanks for your answer. However, the problem already occurs before indexing, at 
the parser level. It takes a long time to go from the sensor topic to the 
“enrichments” topic, and again many seconds to go from the “enrichments” topic 
to the “indexing” topic.

I’ve tried the recommendations described here: 
https://github.com/apache/storm/blob/master/docs/Performance.md but nothing 
changed. The problem with Kerberos is that it is no longer possible to access 
the Storm UI without some tweaks that are blocked by the administrator on my 
computer.


From: Nick Allen [mailto:[email protected]<mailto:[email protected]>]
Sent: Tuesday, May 14, 2019 23:39
To: [email protected]<mailto:[email protected]>
Subject: Re: Very low throuput on topologies

Have you increased the indexing "batch_size"?  That is the first knob to start 
tuning.

https://github.com/apache/metron/tree/master/metron-platform/metron-indexing#sensor-indexing-configuration



On Tue, May 14, 2019 at 10:26 AM 
<[email protected]<mailto:[email protected]>> wrote:
Hello happy metron users,

I have a Metron cluster based on Hortonworks CP, and I’ve set up Kerberos on 
top of it, as you all probably have done since we deal with security ☺

It seems that everything is working fine (Kerberos, Ranger, …), but I’m facing 
an issue regarding the overall throughput.

I feed my cluster with Nifi. Here is what I do:

Test 1:
-          Send 2 lines of logs to the Kafka sensor topic with Nifi
-          Use the Kafka CLI consumer to read messages from the sensor topic: 
the response is immediate
-          Use the Kafka CLI consumer to read messages from the enrichment 
topic: messages arrive after nearly 20 s

Test 2:
-          Send 200 lines of logs to the Kafka sensor topic with Nifi
-          Use the Kafka CLI consumer to read messages from the sensor topic: 
the response is immediate
-          Use the Kafka CLI consumer to read messages from the enrichment 
topic: some messages arrive immediately, but they seem to come in groups of 
about 10, with many seconds between each group


It’s probably related to the Storm configuration, but I don’t know where to go 
from here. I’ve tried changing various parameters such as 
topology.max.spout.pending (currently set to 500), but saw no improvement.

Thanks for your help

Stéphane


_________________________________________________________________________________________________________________________



Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc

pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler

a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,

Orange decline toute responsabilite si ce message a ete altere, deforme ou 
falsifie. Merci.



This message and its attachments may contain confidential or privileged 
information that may be protected by law;

they should not be distributed, used or copied without authorisation.

If you have received this email in error, please notify the sender and delete 
this message and its attachments.

As emails may be altered, Orange is not liable for messages that have been 
modified, changed or falsified.

Thank you.

--
simon elliston ball
@sireb
