candlerb commented on issue #5684: Documentation: units for storageSize
URL: https://github.com/apache/pulsar/issues/5684#issuecomment-555921411

I've made a simple test for this.

Before run:

```
ubuntu@ldex-pulsar:~$ apache-pulsar-2.4.1/bin/pulsar-admin topics stats temp
{
  "msgRateIn" : 0.0,
  "msgThroughputIn" : 0.0,
  "msgRateOut" : 0.0,
  "msgThroughputOut" : 0.0,
  "averageMsgSize" : 0.0,
  "storageSize" : 0,
  "publishers" : [ ],
  "subscriptions" : { },
  "replication" : { },
  "deduplicationStatus" : "Disabled"
}
ubuntu@ldex-pulsar:~$ du -sck apache-pulsar-2.4.1/data/
2371392	apache-pulsar-2.4.1/data/
2371392	total
```

Run program which publishes 500 x 1MB messages to topic "temp" (see below)

After run:

```
ubuntu@ldex-pulsar:~$ apache-pulsar-2.4.1/bin/pulsar-admin topics stats temp
{
  "msgRateIn" : 0.0,
  "msgThroughputIn" : 0.0,
  "msgRateOut" : 0.0,
  "msgThroughputOut" : 0.0,
  "averageMsgSize" : 0.0,
  "storageSize" : 0,
  "publishers" : [ ],
  "subscriptions" : { },
  "replication" : { },
  "deduplicationStatus" : "Disabled"
}
ubuntu@ldex-pulsar:~$ du -sck apache-pulsar-2.4.1/data/
3348208	apache-pulsar-2.4.1/data/
3348208	total
```

Weird.
Stats are supposed to update every minute, but:

```
ubuntu@ldex-pulsar:~$ sleep 60; apache-pulsar-2.4.1/bin/pulsar-admin topics stats temp
{
  "msgRateIn" : 0.0,
  "msgThroughputIn" : 0.0,
  "msgRateOut" : 0.0,
  "msgThroughputOut" : 0.0,
  "averageMsgSize" : 0.0,
  "storageSize" : 0,
  "publishers" : [ ],
  "subscriptions" : { },
  "replication" : { },
  "deduplicationStatus" : "Disabled"
}
ubuntu@ldex-pulsar:~$ apache-pulsar-2.4.1/bin/pulsar-admin topics stats persistent://public/default/temp
{
  "msgRateIn" : 0.0,
  "msgThroughputIn" : 0.0,
  "msgRateOut" : 0.0,
  "msgThroughputOut" : 0.0,
  "averageMsgSize" : 0.0,
  "storageSize" : 0,
  "publishers" : [ ],
  "subscriptions" : { },
  "replication" : { },
  "deduplicationStatus" : "Disabled"
}
ubuntu@ldex-pulsar:~$ apache-pulsar-2.4.1/bin/pulsar-admin topics stats-internal persistent://public/default/temp
{
  "entriesAddedCounter" : 500,
  "numberOfEntries" : 2632,
  "totalSize" : 2134036790,
  "currentLedgerEntries" : 500,
  "currentLedgerSize" : 500012872,
  "lastLedgerCreatedTimestamp" : "2019-11-20T09:24:19.756Z",
  "waitingCursorsCount" : 0,
  "pendingAddEntriesCount" : 0,
  "lastConfirmedEntry" : "122673:499",
  "state" : "LedgerOpened",
  "ledgers" : [ {
    "ledgerId" : 108697,
    "entries" : 1633,
    "size" : 1632046009,
    "offloaded" : false
  }, {
    "ledgerId" : 109540,
    "entries" : 499,
    "size" : 1977909,
    "offloaded" : false
  }, {
    "ledgerId" : 122673,
    "entries" : 0,
    "size" : 0,
    "offloaded" : false
  } ],
  "cursors" : { }
}
```

If I give an invalid topic name to `stats` (e.g. `tempz`) it tells me the topic does not exist, so I must be looking at the right topic.

The retention period on this namespace is set to 1440 minutes, so the lack of consumers shouldn't be an issue; and `stats-internal` shows storage in the ledgers.

So the problem is more likely that I don't understand what the "storageSize" parameter of `stats` actually represents.
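As a cross-check on the numbers above, here is a minimal sketch that re-adds the figures from the `stats-internal` output and the two `du` readings. It assumes (inferred from these numbers only, not from the Pulsar docs) that the open ledger is listed with zero entries and size in `ledgers`, and that its real counters are in `currentLedgerEntries` and `currentLedgerSize`. The helper name `closed_plus_current` is made up for illustration.

```python
import json

# Trimmed copy of the stats-internal output above (only the fields used here).
STATS_INTERNAL = json.loads("""
{
  "numberOfEntries": 2632,
  "totalSize": 2134036790,
  "currentLedgerEntries": 500,
  "currentLedgerSize": 500012872,
  "ledgers": [
    {"ledgerId": 108697, "entries": 1633, "size": 1632046009},
    {"ledgerId": 109540, "entries": 499, "size": 1977909},
    {"ledgerId": 122673, "entries": 0, "size": 0}
  ]
}
""")


def closed_plus_current(stats):
    """Sum the per-ledger figures and add the open ledger's counters.

    Assumption: the open ledger appears in "ledgers" with zeros, so adding
    the currentLedger* fields does not double count.
    """
    entries = sum(ledger["entries"] for ledger in stats["ledgers"])
    size = sum(ledger["size"] for ledger in stats["ledgers"])
    return (entries + stats["currentLedgerEntries"],
            size + stats["currentLedgerSize"])


entries, size = closed_plus_current(STATS_INTERNAL)
print("entries match numberOfEntries:", entries == STATS_INTERNAL["numberOfEntries"])
print("size matches totalSize:", size == STATS_INTERNAL["totalSize"])

# du -sck reports KiB; growth of the data directory across the run, in bytes.
du_delta_bytes = (3348208 - 2371392) * 1024
print("data dir grew by %d bytes" % du_delta_bytes)
print("ratio to currentLedgerSize: %.2f"
      % (du_delta_bytes / STATS_INTERNAL["currentLedgerSize"]))
```

Both totals agree with what `stats-internal` reports, so the 500 published messages are accounted for there, in bytes, even though `storageSize` in `stats` stays at 0. The data directory grew by about twice the published payload; I haven't checked what accounts for the extra.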
--------

```
from collections import defaultdict
import pulsar
import time

NUM_MESSAGES = 500
MESSAGE_SIZE = 1_000_000

client = pulsar.Client('pulsar://localhost:6650')
producer = client.create_producer('temp', producer_name='fred',
                                  compression_type=pulsar.CompressionType.NONE)

sent = 0
bytes = 0
results = defaultdict(lambda: 0)

def ack(res, msg):
    global sent, bytes
    sent += 1
    bytes += len(msg.data())
    results[str(res)] += 1

for i in range(NUM_MESSAGES):
    data = b"x" * MESSAGE_SIZE
    producer.send_async(data, callback=ack)

print("Flushing...")
producer.flush()
print("Flush complete")

for i in range(600):
    print("Sent: %d messages %d bytes" % (sent, bytes))
    if sent == NUM_MESSAGES:
        break
    time.sleep(0.1)
else:
    print("Never got all the messages!")

print("Results: %r" % dict(results))
producer.close()
client.close()
```
