Hello Simon,

Yes, that is what it looks like, but I’ve set topology.flush.tuple.freq.millis to 10 and saw no change. Moreover, if I send, say, 20000 lines, they still take a long time to be fully processed.
From: Simon Elliston Ball [mailto:[email protected]]
Sent: Thursday, May 16, 2019 11:20
To: [email protected]
Subject: Re: Very low throuput on topologies

If you’re sending low volumes to test, you may be waiting on the batch timeout, i.e. the batch is not being flushed by volume but only when the timeout expires, which may explain your latency.

Simon

On Thu, 16 May 2019 at 10:04, <[email protected]> wrote:

Hello Michael,

So, using curl and the API, I’ve been able to collect some statistics. Currently, it is a test platform with nearly no activity. I’ve set up a basic parser with the following topology:
- 6 ackers (I have 6 Kafka partitions per topic)
- Spout parallelism = 6
- Spout # of tasks = 6
- Parser parallelism = 24
- Parser # of tasks = 24

I inject one line of logs with Nifi into my sensor topic. As a reminder, it takes roughly 10 s to become visible on the enrichments topic. Here are some statistics:

    "spouts": [
      {
        "emitted": 1160,
        "spoutId": "kafkaSpout",
        "requestedMemOnHeap": 128,
        "errorTime": null,
        "tasks": 6,
        "errorHost": "",
        "failed": 0,
        "completeLatency": "3963.078",
        "executors": 6,
        "encodedSpoutId": "kafkaSpout",
        "transferred": 1160,
        "errorPort": "",
        "requestedMemOffHeap": 0,
        "errorLapsedSecs": null,
        "acked": 1020,
        "requestedCpu": 10,
        "lastError": "",
        "errorWorkerLogLink": ""
      }

This completeLatency looks very high, doesn’t it? And for the bolt:

      {
        "emitted": 0,
        "requestedMemOnHeap": 128,
        "errorTime": null,
        "tasks": 12,
        "errorHost": "",
        "failed": 0,
        "boltId": "parserBolt",
        "executors": 12,
        "processLatency": "832.962",
        "executeLatency": "1.391",
        "transferred": 0,
        "errorPort": "",
        "requestedMemOffHeap": 0,
        "errorLapsedSecs": null,
        "acked": 3680,
        "requestedCpu": 10,
        "encodedBoltId": "parserBolt",
        "lastError": "",
        "executed": 3680,
        "capacity": "0.003",
        "errorWorkerLogLink": ""
      }

So, my understanding is that it takes a lot of time to ack tuples, but I don’t know where to go now.
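As a rough aid for reading the statistics quoted above, here is a minimal Python sketch that derives a few indicators from them. The numbers are copied from the message; the function name `summarize` and the derived field names are hypothetical. It assumes the latencies are in milliseconds, as the Storm UI reports them, and it treats the gap between processLatency and executeLatency as time a tuple spends waiting to be acked, which is an interpretation rather than something the thread confirms.

```python
# Values copied from the statistics quoted above; only the fields used here are kept.
spout = {"emitted": 1160, "acked": 1020, "failed": 0, "completeLatency": "3963.078"}
bolt = {"executed": 3680, "acked": 3680, "executeLatency": "1.391",
        "processLatency": "832.962", "capacity": "0.003"}


def summarize(spout: dict, bolt: dict) -> dict:
    """Derive a few rough indicators from Storm UI component statistics."""
    return {
        # Tuples emitted by the spout that have not been acked yet.
        "spout_unacked": spout["emitted"] - spout["acked"],
        # Complete latency converted from milliseconds to seconds.
        "complete_latency_s": float(spout["completeLatency"]) / 1000.0,
        # Assumption: time between execute() returning and the ack, in ms.
        "bolt_wait_ms": float(bolt["processLatency"]) - float(bolt["executeLatency"]),
        # Capacity close to 1.0 would suggest a saturated bolt.
        "bolt_capacity": float(bolt["capacity"]),
    }


if __name__ == "__main__":
    for name, value in summarize(spout, bolt).items():
        print(f"{name}: {value:.3f}")
```

With these figures the bolt capacity is far below 1.0 and the execute latency is small, so the parser does not look CPU-bound; most of the per-tuple time appears to be spent waiting rather than executing.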
As mentioned below, I’ve tried the tweaks described here: https://github.com/apache/storm/blob/master/docs/Performance.md but saw no change. It looks as if we are filling a bucket and sending the data after a given timeout if the bucket is not full. But I don’t see any timeout of around 10 or 20 seconds in the Storm configuration. As a reminder, I have Kerberos enabled on my platform, but everything seems to work fine except Metron ingestion.

Thanks for your help,
Stéphane

From: Michael Miklavcic [mailto:[email protected]]
Sent: Wednesday, May 15, 2019 16:03
To: [email protected]
Subject: Re: Very low throuput on topologies

You could use curl from the cli. But if this is something you're testing out on your local machine, I'd probably start without Kerberos enabled and work the perf knobs there first. You should be able to see the "complete latency" from the Storm UI on each running topology.

On Wed, May 15, 2019 at 1:33 AM <[email protected]> wrote:

Hello Nick,

Thanks for your answer. By the way, the problem already appears before indexing, at the parser level. It takes a long time to go from the sensor topic to the “enrichments” topic, and again many seconds to go from the “enrichments” topic to the “indexing” topic. I’ve tried the recommendations described here: https://github.com/apache/storm/blob/master/docs/Performance.md but saw no change. The problem with Kerberos is that the Storm UI can no longer be accessed without some tweaks that are blocked by the administrator on my computer.

From: Nick Allen [mailto:[email protected]]
Sent: Tuesday, May 14, 2019 23:39
To: [email protected]
Subject: Re: Very low throuput on topologies

Have you increased the indexing "batch_size"? That is the first knob to start tuning.
https://github.com/apache/metron/tree/master/metron-platform/metron-indexing#sensor-indexing-configuration

On Tue, May 14, 2019 at 10:26 AM <[email protected]> wrote:

Hello happy Metron users,

I have a Metron cluster based on Hortonworks CP, and I’ve set up Kerberos on top of it, as you all probably have done since we deal with security ☺ Everything seems to be working fine (Kerberos, Ranger, …), but I’m facing an issue with the overall throughput. I feed my cluster with Nifi; here is what I do:

Test 1:
- Send 2 lines of logs to the Kafka sensor topic with Nifi
- Use the Kafka CLI consumer to read messages from the sensor topic: the response is immediate
- Use the Kafka CLI consumer to read messages from the enrichment topic: messages arrive after nearly 20 s

Test 2:
- Send 200 lines of logs to the Kafka sensor topic with Nifi
- Use the Kafka CLI consumer to read messages from the sensor topic: the response is immediate
- Use the Kafka CLI consumer to read messages from the enrichment topic: some messages arrive immediately, but they seem to come roughly 10 at a time, with many seconds between each group

It’s probably related to the Storm configuration, but I don’t know where to go now. I’ve tried changing various parameters such as topology.max.spout.pending (currently set to 500), but with no improvement.

Thanks for your help,
Stéphane

_________________________________________________________________________________________________________________________
This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation. If you have received this email in error, please notify the sender and delete this message and its attachments. As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified. Thank you.
--
simon elliston ball
@sireb
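The batch-timeout behaviour discussed in this thread (a batch is sent either when it is full or when a timeout expires) can be illustrated with a small Python simulation. This is only a sketch under stated assumptions: the class `BatchWriter`, the function `delivery_delays`, and the parameters `batch_size` and `batch_timeout_s` are hypothetical names, the values 15 and 10 s are arbitrary, and it does not reproduce Metron's or Storm's actual writer code.

```python
class BatchWriter:
    """Toy writer that flushes on batch size or on timeout (hypothetical model)."""

    def __init__(self, batch_size: int, batch_timeout_s: float):
        self.batch_size = batch_size
        self.batch_timeout_s = batch_timeout_s
        self.pending = []   # (arrival_time, message) waiting to be flushed
        self.flushed = []   # (flush_time, [messages]) already sent

    def write(self, now: float, message: str) -> None:
        self.pending.append((now, message))
        # Flush by volume as soon as the batch is full.
        if len(self.pending) >= self.batch_size:
            self._flush(now)

    def tick(self, now: float) -> None:
        # Flush by timeout when the oldest pending message has waited long enough.
        if self.pending and now - self.pending[0][0] >= self.batch_timeout_s:
            self._flush(now)

    def _flush(self, now: float) -> None:
        self.flushed.append((now, [m for _, m in self.pending]))
        self.pending = []


def delivery_delays(n_messages, batch_size, batch_timeout_s,
                    tick_every_s=1.0, horizon_s=60.0):
    """Send n_messages at t=0 and return the delay (in seconds) of each one."""
    writer = BatchWriter(batch_size, batch_timeout_s)
    for i in range(n_messages):
        writer.write(0.0, f"line-{i}")
    t = 0.0
    while t <= horizon_s:
        writer.tick(t)
        t += tick_every_s
    delays = []
    for flush_time, batch in writer.flushed:
        delays.extend([flush_time] * len(batch))
    return delays


if __name__ == "__main__":
    print(delivery_delays(2, batch_size=15, batch_timeout_s=10.0))
    big = delivery_delays(200, batch_size=15, batch_timeout_s=10.0)
    print(big.count(0.0), "sent immediately,", big.count(10.0), "sent after the timeout")
```

In this model, two test lines never fill a batch and wait for the full timeout, while a larger burst is mostly flushed by volume and only the remainder waits. Whether this matches the latency seen on a real cluster depends on the actual batch size and timeout settings in use.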
