Yes, this is what I’m currently playing with to find the best batch size. Thanks 
for pointing me to the link.

Stéphane


From: Michael Miklavcic [mailto:[email protected]]
Sent: Tuesday, May 21, 2019 16:12
To: [email protected]
Subject: Re: Very low throuput on topologies

Also take a look at this - 
https://github.com/apache/metron/tree/master/metron-platform/metron-writer#bulk-message-writing

On Tue, May 21, 2019 at 7:47 AM <[email protected]> wrote:
Thanks Nick.



From: Nick Allen [mailto:[email protected]]
Sent: Tuesday, May 21, 2019 14:15
To: [email protected]
Subject: Re: Very low throuput on topologies

> In the link you mention below, it is said that if batchTimeout is not 
> set, it will fall back to a fraction of the topology.message.timeout.secs 
> Storm parameter. Do you know how we can get this fraction?

By default, the `batchTimeout` is set to 1/2 of the 
`topology.message.timeout.secs`.

https://github.com/apache/metron/blob/51d1c812c1e45f57da8c27fe37fd13797707884e/metron-platform/metron-writer/metron-writer-storm/src/main/java/org/apache/metron/writer/bolt/BatchTimeoutHelper.java#L118-L122
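
To make the arithmetic concrete, here is a minimal Python sketch of the rule as stated above. It assumes the default is exactly half of `topology.message.timeout.secs` and that an explicit, non-zero `batchTimeout` takes precedence. The linked `BatchTimeoutHelper` may apply further adjustments, so the function name and logic below are illustrative only. Storm's stock default for `topology.message.timeout.secs` is 30 seconds.

```python
from typing import Optional


def effective_batch_timeout(message_timeout_secs: int = 30,
                            batch_timeout_secs: Optional[int] = None) -> int:
    """Hypothetical helper, not Metron's actual code.

    Assumption: an unset (None) or zero batchTimeout falls back to half of
    topology.message.timeout.secs, with a floor of one second.
    """
    if batch_timeout_secs:
        # An explicit, non-zero batchTimeout is used as-is.
        return batch_timeout_secs
    return max(1, message_timeout_secs // 2)


print(effective_batch_timeout())        # default Storm message timeout
print(effective_batch_timeout(60))      # longer message timeout
print(effective_batch_timeout(30, 1))   # explicit batchTimeout of 1 second
```

Under these assumptions, an unset `batchTimeout` with the default Storm message timeout gives a flush delay of about 15 seconds, which is in the same range as the 10 to 20 second delays reported earlier in this thread.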




On Mon, May 20, 2019 at 11:03 AM <[email protected]> wrote:
Hello Nick,

You are right, it was related to the batchSize and batchTimeout settings, but I 
was confused about where they are set: I was tweaking the Indexing ones. Now I 
understand these settings a little better and I can see their effects.

By the way, I still have one question: in the link you mention below, it is 
said that if batchTimeout is not set, it will fall back to a fraction of the 
topology.message.timeout.secs Storm parameter. Do you know how we can get this 
fraction?

Thanks for your help ☺


From: Nick Allen [mailto:[email protected]]
Sent: Thursday, May 16, 2019 15:08
To: [email protected]
Subject: Re: Very low throuput on topologies

Ok, now I understand a little better.  You are sending low volumes of telemetry 
just for testing.

I think Simon is on the right track.  There is also a batchSize and 
batchTimeout setting for Parsers that you can find at the link below.

https://metron.apache.org/current-book/metron-platform/metron-parsers/index.html
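
As a rough illustration of where those parser settings would go, the Python sketch below adds `batchSize` and `batchTimeout` to a sensor parser configuration. It assumes the two keys live under `parserConfig`, as described in the parser documentation linked above; verify this against your Metron version. The helper name, the sample values, and the use of "ansi" as a sensor topic are hypothetical.

```python
import copy
import json


def with_batch_settings(parser_config: dict, batch_size: int,
                        batch_timeout: int) -> dict:
    """Return a copy of a sensor parser config with batch settings applied.

    Assumption: the writer batch settings belong under "parserConfig".
    """
    updated = copy.deepcopy(parser_config)
    settings = updated.setdefault("parserConfig", {})
    settings["batchSize"] = batch_size
    settings["batchTimeout"] = batch_timeout
    return updated


# Illustrative input only: "ansi" is the index name used elsewhere in this
# thread, reused here as a hypothetical sensor topic.
base = {"sensorTopic": "ansi", "parserConfig": {}}
tuned = with_batch_settings(base, batch_size=5, batch_timeout=2)
print(json.dumps(tuned, indent=2))
```

The original dictionary is left untouched, so the same base configuration can be reused to try several batch sizes.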






On Thu, May 16, 2019 at 8:12 AM <[email protected]> wrote:
Hello Simon,

If you mean this: 
https://github.com/apache/metron/tree/master/metron-platform/metron-indexing#sensor-indexing-configuration

My settings are (I’ve tried many changes here):
{
      "hdfs": {
            "batchSize": 10,
            "batchTimeout": 1,
            "enabled": true,
            "index": "ansi"
      },
      "elasticsearch": {
            "batchSize": 10,
            "batchTimeout": 1,
            "enabled": true,
            "index": "ansi"
      },
      "solr": {
            "batchSize": 10,
            "enabled": false,
            "index": "ansi"
      }
}

What is not clear to me is how these settings can influence the 
random_indexing and batch_indexing topologies. Once again, the first delay I 
see is between the source topic and the “enrichments” topic. Unless I’m wrong, 
data is moved between these 2 topics by the parsing topology, without any 
indexing action.

Stéphane


From: Simon Elliston Ball [mailto:[email protected]]
Sent: Thursday, May 16, 2019 11:40
To: [email protected]
Subject: Re: Very low throuput on topologies

Can you share your Metron batch size config and timeouts?

On Thu, 16 May 2019 at 11:31, <[email protected]> wrote:
Hello Simon,

That is what it looks like, yes, but I’ve set topology.flush.tuple.freq.millis 
to 10, with no change. Moreover, if I send, let’s say, 20000 lines, they still 
take a long time to be fully processed.




From: Simon Elliston Ball [mailto:[email protected]]
Sent: Thursday, May 16, 2019 11:20
To: [email protected]
Subject: Re: Very low throuput on topologies

If you’re sending low volumes to test, you may be waiting on the batch timeout, 
i.e. the batch is never flushed because it is full, only when the timeout 
expires, which may explain your latency.
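
The effect can be sketched with a small Python simulation. This is a simplified model, not Metron's writer code: it assumes a batch is flushed either when it reaches `batch_size` messages or `batch_timeout` seconds after its first message arrived, whichever comes first. Real timeouts are checked periodically rather than at the exact deadline, and all names here are hypothetical.

```python
from typing import List


def flush_latencies(arrivals: List[float], batch_size: int,
                    batch_timeout: float) -> List[float]:
    """Return, per message, the delay between arrival and batch flush."""
    latencies: List[float] = []
    batch: List[float] = []  # arrival times of messages waiting in the batch
    for t in sorted(arrivals):
        # If the open batch timed out before this message arrived, flush it.
        if batch and t >= batch[0] + batch_timeout:
            deadline = batch[0] + batch_timeout
            latencies.extend(deadline - a for a in batch)
            batch = []
        batch.append(t)
        # Flush by volume as soon as the batch is full.
        if len(batch) >= batch_size:
            latencies.extend(t - a for a in batch)
            batch = []
    # Whatever is left can only be flushed by the timeout.
    if batch:
        deadline = batch[0] + batch_timeout
        latencies.extend(deadline - a for a in batch)
    return latencies


# Low volume: 2 messages never fill a batch of 15, so both wait for the timeout.
low = flush_latencies([0.0, 0.0], batch_size=15, batch_timeout=15.0)
# Higher volume: 30 messages at once fill two batches and flush immediately.
high = flush_latencies([0.0] * 30, batch_size=15, batch_timeout=15.0)
print(max(low), max(high))
```

In this model the two-message test waits the full timeout, while the thirty-message test sees no batching delay at all.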

Simon

On Thu, 16 May 2019 at 10:04, <[email protected]> wrote:
Hello Michael,

So, using curl and the API, I’ve been able to collect some statistics. 
Currently, it is a test platform with nearly no activity. I’ve set up a basic 
parser, with the following topology:

- 6 ackers (I have 6 Kafka partitions per topic)
- Spout parallelism = 6
- Spout # of tasks = 6
- Parser parallelism = 24
- Parser # of tasks = 24

I inject one line of logs with NiFi into my sensor topic. As a reminder, it 
takes roughly 10 s to become visible on the enrichments topic. Here are some 
statistics:

  "spouts": [
    {
      "emitted": 1160,
      "spoutId": "kafkaSpout",
      "requestedMemOnHeap": 128,
      "errorTime": null,
      "tasks": 6,
      "errorHost": "",
      "failed": 0,
      "completeLatency": "3963.078",
      "executors": 6,
      "encodedSpoutId": "kafkaSpout",
      "transferred": 1160,
      "errorPort": "",
      "requestedMemOffHeap": 0,
      "errorLapsedSecs": null,
      "acked": 1020,
      "requestedCpu": 10,
      "lastError": "",
      "errorWorkerLogLink": ""
    }

This completeLatency looks very high, doesn’t it?

And for bolt:
    {
      "emitted": 0,
      "requestedMemOnHeap": 128,
      "errorTime": null,
      "tasks": 12,
      "errorHost": "",
      "failed": 0,
      "boltId": "parserBolt",
      "executors": 12,
      "processLatency": "832.962",
      "executeLatency": "1.391",
      "transferred": 0,
      "errorPort": "",
      "requestedMemOffHeap": 0,
      "errorLapsedSecs": null,
      "acked": 3680,
      "requestedCpu": 10,
      "encodedBoltId": "parserBolt",
      "lastError": "",
      "executed": 3680,
      "capacity": "0.003",
      "errorWorkerLogLink": ""
    }

So, my understanding is that it takes a long time to ack tuples, but I don’t 
know where to go from here. As said below, I’ve tried the tweaks mentioned here: 
https://github.com/apache/storm/blob/master/docs/Performance.md but saw no 
change. It looks as if we are trying to fill a bucket, and the data is sent 
after a given timeout if the bucket is not full. But I don’t see any timeout 
of around 10 or 20 seconds in the Storm configuration.
As a reminder, I have Kerberos enabled on my platform, but everything seems to 
work fine except Metron ingestion.
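
For comparing runs, the latency fields can be pulled out of the API response with a few lines of Python. This is a sketch that assumes the response has the `spouts` and `bolts` arrays of the Storm UI REST API topology endpoint and that latencies are strings in milliseconds, as in the output above. The function name is hypothetical and the sample is trimmed to the fields quoted in this message.

```python
import json


def latency_report(topology_stats: dict) -> dict:
    """Map each spout/bolt id to its latency figures, converted to floats."""
    report = {}
    for spout in topology_stats.get("spouts", []):
        report[spout["spoutId"]] = {
            "completeLatencyMs": float(spout["completeLatency"]),
        }
    for bolt in topology_stats.get("bolts", []):
        report[bolt["boltId"]] = {
            "processLatencyMs": float(bolt["processLatency"]),
            "executeLatencyMs": float(bolt["executeLatency"]),
        }
    return report


# Trimmed sample built from the statistics quoted above.
sample = json.loads("""
{
  "spouts": [{"spoutId": "kafkaSpout", "completeLatency": "3963.078"}],
  "bolts": [{"boltId": "parserBolt", "processLatency": "832.962",
             "executeLatency": "1.391"}]
}
""")
report = latency_report(sample)
print(json.dumps(report, indent=2))
```

The gap between a process latency of several hundred milliseconds and an execute latency of about one millisecond is consistent with tuples being held somewhere before they are acked, rather than with slow parsing.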

Thanks for your help,

Stéphane

From: Michael Miklavcic [mailto:[email protected]]
Sent: Wednesday, May 15, 2019 16:03
To: [email protected]
Subject: Re: Very low throuput on topologies

You could use curl from the cli. But if this is something you're testing out on 
your local machine, I'd probably start without Kerberos enabled and work the 
perf knobs there first. You should be able to see the "complete latency" from 
the Storm UI on each running topology.

On Wed, May 15, 2019 at 1:33 AM <[email protected]> wrote:
Hello Nick,

Thanks for your answer. By the way, the problem already happens before 
indexing, at the parser level. It takes a long time to go from the sensor topic 
to the “enrichments” topic, and again many seconds to go from the “enrichments” 
topic to the “indexing” topic.

I’ve tried the recommendations described here: 
https://github.com/apache/storm/blob/master/docs/Performance.md but saw no 
change. The problem with Kerberos is that the Storm UI can no longer be 
accessed without some tweaks that are blocked by the administrator on my 
computer.


From: Nick Allen [mailto:[email protected]]
Sent: Tuesday, May 14, 2019 23:39
To: [email protected]
Subject: Re: Very low throuput on topologies

Have you increased the indexing "batchSize"?  That is the first knob to start 
tuning.

https://github.com/apache/metron/tree/master/metron-platform/metron-indexing#sensor-indexing-configuration



On Tue, May 14, 2019 at 10:26 AM <[email protected]> wrote:
Hello happy Metron users,

I have a Metron cluster based on Hortonworks CP, and I’ve set up Kerberos on 
top of it, as you all probably have done since we deal with security ☺

Everything seems to be working fine (Kerberos, Ranger, …), but I’m facing an 
issue with the overall throughput.

I feed my cluster with NiFi. Here is what I do:

Test 1:

- Send 2 lines of logs to the Kafka sensor topic with NiFi
- Use the Kafka CLI consumer to read messages from the sensor topic: the 
  response is immediate
- Use the Kafka CLI consumer to read messages from the enrichment topic: 
  messages arrive after nearly 20 s

Test 2:

- Send 200 lines of logs to the Kafka sensor topic with NiFi
- Use the Kafka CLI consumer to read messages from the sensor topic: the 
  response is immediate
- Use the Kafka CLI consumer to read messages from the enrichment topic: some 
  messages arrive immediately, but they seem to come in groups of about 10, 
  with many seconds between each group


It’s probably related to the Storm configuration, but I don’t know where to go 
from here. I’ve tried changing various parameters such as 
topology.max.spout.pending (currently set to 500), but saw no improvement.

Thanks for your help

Stéphane


_________________________________________________________________________________________________________________________



Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc

pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler

a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,

Orange decline toute responsabilite si ce message a ete altere, deforme ou 
falsifie. Merci.



This message and its attachments may contain confidential or privileged 
information that may be protected by law;

they should not be distributed, used or copied without authorisation.

If you have received this email in error, please notify the sender and delete 
this message and its attachments.

As emails may be altered, Orange is not liable for messages that have been 
modified, changed or falsified.

Thank you.

--
--
simon elliston ball
@sireb

