> In the link you mention below, it is said that in case batchTimeout is
> not set, it will fall back to a fraction of the topology.message.timeout.secs
> Storm parameter. Do you know how we can get this fraction?

By default, the `batchTimeout` is set to 1/2 of the
`topology.message.timeout.secs`.

https://github.com/apache/metron/blob/51d1c812c1e45f57da8c27fe37fd13797707884e/metron-platform/metron-writer/metron-writer-storm/src/main/java/org/apache/metron/writer/bolt/BatchTimeoutHelper.java#L118-L122
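
As a rough illustration of that arithmetic, here is a small Python sketch. It models only the one-half rule stated above; the function name `default_batch_timeout_secs` is hypothetical, and the linked `BatchTimeoutHelper` may apply further adjustments (a safety margin, for example), so treat the source as authoritative. Storm's stock default for `topology.message.timeout.secs` is 30 seconds.

```python
def default_batch_timeout_secs(message_timeout_secs: int) -> int:
    """Approximate default batchTimeout: half of topology.message.timeout.secs.

    Assumption: only the 1/2 rule described in this thread is modelled;
    the real helper may subtract a margin or divide further.
    """
    if message_timeout_secs <= 0:
        raise ValueError("topology.message.timeout.secs must be positive")
    # Integer division, never returning less than 1 second.
    return max(1, message_timeout_secs // 2)


if __name__ == "__main__":
    for timeout in (30, 60, 120):
        print(timeout, "->", default_batch_timeout_secs(timeout))
```

With the Storm default of 30 seconds this gives roughly 15 seconds, which is in the same range as the 10 to 20 second delays reported further down the thread.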




On Mon, May 20, 2019 at 11:03 AM <[email protected]> wrote:

> Hello Nick,
>
>
>
> You are right, it was related to the batchSize and batchTimeout settings,
> but I was confused about where to set them: I was tweaking the Indexing
> ones. Now I understand these settings a little better and I can see their
> effects.
>
>
>
> By the way, I still have one question: in the link you mention below, it
> is said that in case batchTimeout is not set, it will fall back to a
> fraction of the topology.message.timeout.secs Storm parameter. Do you know
> how we can get this fraction?
>
> Thanks for your help :)
>
>
>
>
>
> *From:* Nick Allen [mailto:[email protected]]
> *Sent:* Thursday, May 16, 2019 15:08
> *To:* [email protected]
> *Subject:* Re: Very low throuput on topologies
>
>
>
> Ok, now I understand a little better.  You are sending low volumes of
> telemetry just for testing.
>
>
>
> I think Simon is on the right track.  There is also a batchSize and
> batchTimeout setting for Parsers that you can find at the link below.
>
>
>
>
> https://metron.apache.org/current-book/metron-platform/metron-parsers/index.html
>
>
> On Thu, May 16, 2019 at 8:12 AM <[email protected]> wrote:
>
> Hello Simon,
>
>
>
> If you talk about this:
> https://github.com/apache/metron/tree/master/metron-platform/metron-indexing#sensor-indexing-configuration
>
>
>
> My settings are (I’ve tried many changes here):
>
> {
>   "hdfs": {
>     "batchSize": 10,
>     "batchTimeout": 1,
>     "enabled": true,
>     "index": "ansi"
>   },
>   "elasticsearch": {
>     "batchSize": 10,
>     "batchTimeout": 1,
>     "enabled": true,
>     "index": "ansi"
>   },
>   "solr": {
>     "batchSize": 10,
>     "enabled": false,
>     "index": "ansi"
>   }
> }
>
>
>
> What is not clear to me is how these settings can influence the
> random_indexing and batch_indexing topologies. Once again, the first delay
> I see is between the source topic and the “enrichments” topic. Unless I’m
> wrong, data is moved between these 2 topics by the parsing topology,
> without any indexing action.
>
>
>
> Stéphane
>
>
>
>
>
> *From:* Simon Elliston Ball [mailto:[email protected]]
> *Sent:* Thursday, May 16, 2019 11:40
> *To:* [email protected]
> *Subject:* Re: Very low throuput on topologies
>
>
>
> Can you share your Metron batch size config and timeouts?
>
>
>
> On Thu, 16 May 2019 at 11:31, <[email protected]> wrote:
>
> Hello Simon,
>
>
>
> That is what it looks like, yes, but I’ve set
> topology.flush.tuple.freq.millis to 10, with no change. Moreover, if I
> send, say, 20000 lines, it still takes a long time for them to be fully
> processed.
>
>
>
>
>
>
>
>
>
> *From:* Simon Elliston Ball [mailto:[email protected]]
> *Sent:* Thursday, May 16, 2019 11:20
> *To:* [email protected]
> *Subject:* Re: Very low throuput on topologies
>
>
>
> If you’re sending low volumes to test, you may be waiting on the batch
> timeout, i.e. not triggering a batch flush by volume but waiting for the
> timeout, which may explain your latency.
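
A minimal sketch of the behaviour described in the paragraph above, not Metron's actual writer code: a batch is flushed when it reaches `batch_size` messages or when `batch_timeout` seconds have passed since its first message, whichever comes first. The `Batcher` class and its method names are hypothetical and exist only for illustration.

```python
from typing import List, Optional


class Batcher:
    """Hypothetical size-or-timeout batcher (illustration only)."""

    def __init__(self, batch_size: int, batch_timeout: float) -> None:
        self.batch_size = batch_size
        self.batch_timeout = batch_timeout
        self._pending: List[str] = []
        self._first_arrival: Optional[float] = None

    def add(self, message: str, now: float) -> Optional[List[str]]:
        """Queue a message; return the batch if the size limit is reached."""
        if not self._pending:
            # Remember when the oldest pending message arrived.
            self._first_arrival = now
        self._pending.append(message)
        if len(self._pending) >= self.batch_size:
            return self._flush()
        return None

    def tick(self, now: float) -> Optional[List[str]]:
        """Called periodically; return the batch if it has timed out."""
        if self._pending and now - self._first_arrival >= self.batch_timeout:
            return self._flush()
        return None

    def _flush(self) -> List[str]:
        batch, self._pending = self._pending, []
        self._first_arrival = None
        return batch


if __name__ == "__main__":
    batcher = Batcher(batch_size=10, batch_timeout=15.0)
    # Two messages at t=0 do not fill the batch, so nothing is written yet.
    print(batcher.add("line 1", now=0.0))
    print(batcher.add("line 2", now=0.0))
    print(batcher.tick(now=5.0))
    # Only once the timeout expires are the two messages flushed.
    print(batcher.tick(now=15.0))
```

With 2 test messages and a batch size of 10, the size limit is never reached, so the messages only appear downstream once the timeout fires. That matches the low-volume symptom reported in this thread.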
>
>
>
> Simon
>
>
>
> On Thu, 16 May 2019 at 10:04, <[email protected]> wrote:
>
> Hello Michael,
>
>
>
> So, using curl and the API, I’ve been able to collect some statistics.
> Currently, it is a test platform with nearly no activity. I’ve set up a
> basic parser, with the following topology:
>
> - 6 ackers (I have 6 Kafka partitions per topic)
> - Spout parallelism = 6
> - Spout # of tasks = 6
> - Parser parallelism = 24
> - Parser # of tasks = 24
>
> I inject one line of logs with NiFi on my sensor topic. As a reminder, it
> needs roughly 10 s to be visible on the enrichments topic. Here are some
> statistics:
>
>
>
>   "spouts": [
>     {
>       "emitted": 1160,
>       "spoutId": "kafkaSpout",
>       "requestedMemOnHeap": 128,
>       "errorTime": null,
>       "tasks": 6,
>       "errorHost": "",
>       "failed": 0,
>       "completeLatency": "3963.078",
>       "executors": 6,
>       "encodedSpoutId": "kafkaSpout",
>       "transferred": 1160,
>       "errorPort": "",
>       "requestedMemOffHeap": 0,
>       "errorLapsedSecs": null,
>       "acked": 1020,
>       "requestedCpu": 10,
>       "lastError": "",
>       "errorWorkerLogLink": ""
>     }
>
>
>
> This completeLatency looks very high, doesn’t it?
>
>
>
> And for the bolt:
>
>     {
>       "emitted": 0,
>       "requestedMemOnHeap": 128,
>       "errorTime": null,
>       "tasks": 12,
>       "errorHost": "",
>       "failed": 0,
>       "boltId": "parserBolt",
>       "executors": 12,
>       "processLatency": "832.962",
>       "executeLatency": "1.391",
>       "transferred": 0,
>       "errorPort": "",
>       "requestedMemOffHeap": 0,
>       "errorLapsedSecs": null,
>       "acked": 3680,
>       "requestedCpu": 10,
>       "encodedBoltId": "parserBolt",
>       "lastError": "",
>       "executed": 3680,
>       "capacity": "0.003",
>       "errorWorkerLogLink": ""
>     }
>
>
>
> So, my understanding is that it takes a long time to ack tuples, but I
> don’t know where to go from here. As said below, I’ve tried the tweaks
> mentioned here: https://github.com/apache/storm/blob/master/docs/Performance.md
> but saw no change. It looks as if we are trying to fill a bucket and only
> send the data after a given timeout if the bucket is not full. But I don’t
> see any timeout of around 10 or 20 seconds in the Storm configuration.
>
> As a reminder, I have Kerberos enabled on my platform, but everything
> seems to work fine except Metron ingestion.
>
>
>
> Thanks for your help,
>
>
>
> Stéphane
>
>
>
> *From:* Michael Miklavcic [mailto:[email protected]]
> *Sent:* Wednesday, May 15, 2019 16:03
> *To:* [email protected]
> *Subject:* Re: Very low throuput on topologies
>
>
>
> You could use curl from the cli. But if this is something you're testing
> out on your local machine, I'd probably start without Kerberos enabled and
> work the perf knobs there first. You should be able to see the "complete
> latency" from the Storm UI on each running topology.
>
>
>
> On Wed, May 15, 2019 at 1:33 AM <[email protected]> wrote:
>
> Hello Nick,
>
>
>
> Thanks for your answer. By the way, the problem already happens before
> indexing, at the parser level. It takes a long time to go from the sensor
> topic to the “enrichments” topic, and again many seconds to go from the
> “enrichments” topic to the “indexing” topic.
>
> I’ve tried the recommendations described here:
> https://github.com/apache/storm/blob/master/docs/Performance.md but saw no
> change. The problem with Kerberos is that it is no longer possible to
> access the Storm UI without some tweaks that are blocked by the
> administrator on my computer.
>
>
>
>
>
> *From:* Nick Allen [mailto:[email protected]]
> *Sent:* Tuesday, May 14, 2019 23:39
> *To:* [email protected]
> *Subject:* Re: Very low throuput on topologies
>
>
>
> Have you increased the indexing "batch_size"?  That is the first knob to
> start tuning.
>
>
>
>
> https://github.com/apache/metron/tree/master/metron-platform/metron-indexing#sensor-indexing-configuration
>
>
>
>
>
>
>
> On Tue, May 14, 2019 at 10:26 AM <[email protected]> wrote:
>
> Hello happy metron users,
>
>
>
> I have a Metron cluster based on Hortonworks CP, and I’ve set up Kerberos
> on top of it, as you all probably have done since we deal with security :)
>
> Everything seems to be working fine (Kerberos, Ranger, ...), but I’m
> facing an issue with the overall throughput.
>
>
>
> I feed my cluster with NiFi. Here is what I do:
>
> Test 1:
> - Send 2 lines of logs to the Kafka sensor topic with NiFi
> - Use the Kafka CLI consumer to read messages from the sensor topic:
>   the response is immediate
> - Use the Kafka CLI consumer to read messages from the enrichment topic:
>   messages arrive after nearly 20 s
>
> Test 2:
> - Send 200 lines of logs to the Kafka sensor topic with NiFi
> - Use the Kafka CLI consumer to read messages from the sensor topic:
>   the response is immediate
> - Use the Kafka CLI consumer to read messages from the enrichment topic:
>   some messages arrive immediately, but they seem to come roughly 10 at a
>   time, with many seconds between each group
>
>
>
>
>
> It’s probably related to the Storm configuration, but I don’t know where
> to go from here. I’ve tried changing various parameters such as
> topology.max.spout.pending (currently set to 500), with no improvement.
>
>
>
> Thanks for your help
>
>
>
> Stéphane
>
>
>
> _________________________________________________________________________________________________________________________
>
>
>
> Ce message et ses pieces jointes peuvent contenir des informations 
> confidentielles ou privilegiees et ne doivent donc
>
> pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu 
> ce message par erreur, veuillez le signaler
>
> a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
> electroniques etant susceptibles d'alteration,
>
> Orange decline toute responsabilite si ce message a ete altere, deforme ou 
> falsifie. Merci.
>
>
>
> This message and its attachments may contain confidential or privileged 
> information that may be protected by law;
>
> they should not be distributed, used or copied without authorisation.
>
> If you have received this email in error, please notify the sender and delete 
> this message and its attachments.
>
> As emails may be altered, Orange is not liable for messages that have been 
> modified, changed or falsified.
>
> Thank you.
>
> --
>
> --
>
> simon elliston ball
>
> @sireb
