2020-07-29 11:09:33 UTC - Ebere Abanonu: After setting the Retention Policy to
-1, -1 then I started getting the error below (When I query for the tables it
returns them):
```
"failureInfo": {
"type": "com.facebook.presto.spi.TableNotFoundException",
"message": "Table \u0027public/default.students\u0027 not found",
"cause": null,
"errorLocation": null,
"suppressed": [],
"stack": [
"org.apache.pulsar.sql.presto.PulsarMetadata.getTableMetadata(PulsarMetadata.java:308)",
"org.apache.pulsar.sql.presto.PulsarMetadata.getTableMetadata(PulsarMetadata.java:161)",
"com.facebook.presto.metadata.MetadataManager.getTableMetadata(MetadataManager.java:377)",
"com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitTable(StatementAnalyzer.java:841)",
"com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitTable(StatementAnalyzer.java:257)",
"com.facebook.presto.sql.tree.Table.accept(Table.java:53)",
"com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)",
"com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:269)",
"com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.analyzeFrom(StatementAnalyzer.java:1787)",
"com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:944)",
"com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:257)",
"com.facebook.presto.sql.tree.QuerySpecification.accept(QuerySpecification.java:127)",
"com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)",
"com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:269)",
"com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:279)",
"com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:675)",
"com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:257)",
"com.facebook.presto.sql.tree.Query.accept(Query.java:94)",
"com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27)",
"com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:269)",
"com.facebook.presto.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:243)",
"com.facebook.presto.sql.analyzer.Analyzer.analyze(Analyzer.java:72)",
"com.facebook.presto.sql.analyzer.Analyzer.analyze(Analyzer.java:64)",
"com.facebook.presto.execution.SqlQueryExecution.\u003Cinit\u003E(SqlQueryExecution.java:169)",
"com.facebook.presto.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:677)",
"com.facebook.presto.execution.SqlQueryManager.createQueryInternal(SqlQueryManager.java:444)",
"com.facebook.presto.execution.SqlQueryManager.lambda$createQuery$3(SqlQueryManager.java:377)",
"java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)",
"java.util.concurrent.FutureTask.run(FutureTask.java:266)",
"java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)",
"java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)",
"java.lang.Thread.run(Thread.java:748)"```
----
2020-07-29 13:43:08 UTC - Joshua Decosta: @Addison Higham I think it might have
to do with the way authorization service’ canLookupAsync is calling
canCosumeAsync.
I’m always expecting there to be authenticationData. Which I’m the
canConsumeAsync call isn’t being passed in.
Do you know why that is? It’s passed in to the canProduceAsync block.
----
2020-07-29 13:59:00 UTC - Joshua Decosta: It’s definitely that. Because I
expected authenticationData to always be passed if it’s not I get a null
pointer and the return never happens which causes the timeouts ive been seeing.
Seems like a mistake to not pass the authenticationData along to the
authService’s canConsume async
----
2020-07-29 17:34:06 UTC - Addison Higham: hrm... compaction should be well
behaved the only thing that real grows unbounded is a map of the latest message
id for every key. If you have lots and lots of keys, that map can get fairly
big... so that might be a contributing factor, @Sijie Guo do you have any ideas?
----
2020-07-29 17:36:13 UTC - Addison Higham: @Walter were you able to figure it
all out? there isn't (AFAIK) a way to find ledgers that only have a single
bookie easily, you can look at the metadata in zookeeper directly or probably
easier would be to write some bash scripts to list the ledgermetadata for each
ledger and see the ensemble size
----
2020-07-29 17:42:37 UTC - Addison Higham: broker communicating to bookkeeper is
the important one, and to be clear, these are OS level keepalives, here is a
great article <https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/>
that explains the underlying problem. How that effects pulsar is that because
pulsar communicates to bookeeper using a connection pool, it just tries to
reuse connections as much as possible. the connection pool is also based on
hostname, so when a new bookkeeper gets a different UP but is used under the
same hostname, the connection pool doesn't get notified and still has
connections to the old IP address, which it keeps trying to use. Since
keepalive is active, the connection may not die for a long time due to OS level
defaults
----
2020-07-29 17:43:18 UTC - Addison Higham: I have previously used those same
values defined in that blog post to really help. This is something we want to
get fixed and should make it into the next release
----
2020-07-29 18:36:43 UTC - Addison Higham: @Joshua Decosta that seems like a
bug, see
<https://github.com/apache/pulsar/blob/77dccd2824699c9be96f518a8a1079df51612c8e/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/authorization/AuthorizationService.java#L307>
for where I think that is happening. Note that `canProduceAsync` is passed the
authData
----
2020-07-29 20:19:09 UTC - Alex Pedersen: @Alex Pedersen has joined the channel
----
2020-07-29 21:13:29 UTC - Adriaan de Haan: I think the question is specifically
whether topics created after the subscription is started will be subscribed to
automatically ("future topics").
----
2020-07-29 22:12:07 UTC - Chris Hansen: ok, I’m sure I’m doing something wrong,
but I don’t know what it is. With standalone, I have a topic w/ two
subscriptions, one is a consumer (exclusive sub) and one is a function. When I
produce to the topic, neither gets any messages and when I try `peek-messages`
it says `Exception 'Message not found' occured while trying to peek Messages` .
However, if I create a new (third) subscription using `pulsar-admin`, then I
can see messages using `peek-messages` . `topics stats` shows 3 messages in and
0 messages out, with 0 messages on the subscriptions. Anyone know where I
should look next?
----
2020-07-29 22:12:43 UTC - Sijie Guo: Yes. The future topics will be
automatically subscribed.
----
2020-07-29 22:14:18 UTC - Sijie Guo: Can you show the topic stats?
----
2020-07-29 22:14:37 UTC - Chris Hansen: ```{
"msgRateIn" : 0.0,
"msgThroughputIn" : 0.0,
"msgRateOut" : 0.0,
"msgThroughputOut" : 0.0,
"bytesInCounter" : 200,
"msgInCounter" : 3,
"bytesOutCounter" : 0,
"msgOutCounter" : 0,
"averageMsgSize" : 0.0,
"storageSize" : 200,
"backlogSize" : 200,
"publishers" : [ ],
"subscriptions" : {
"test-sub" : {
"msgRateOut" : 0.0,
"msgThroughputOut" : 0.0,
"bytesOutCounter" : 0,
"msgOutCounter" : 0,
"msgRateRedeliver" : 0.0,
"msgBacklog" : 0,
"msgBacklogNoDelayed" : 0,
"blockedSubscriptionOnUnackedMsgs" : false,
"msgDelayed" : 0,
"unackedMessages" : 0,
"type" : "Exclusive",
"activeConsumerName" : "test-consumer",
"msgRateExpired" : 0.0,
"lastExpireTimestamp" : 0,
"lastConsumedFlowTimestamp" : 1596059945436,
"lastConsumedTimestamp" : 0,
"lastAckedTimestamp" : 0,
"consumers" : [ {
"msgRateOut" : 0.0,
"msgThroughputOut" : 0.0,
"bytesOutCounter" : 0,
"msgOutCounter" : 0,
"msgRateRedeliver" : 0.0,
"consumerName" : "test-consumer",
"availablePermits" : 1000,
"unackedMessages" : 0,
"blockedConsumerOnUnackedMsgs" : false,
"lastAckedTimestamp" : 0,
"lastConsumedTimestamp" : 0,
"metadata" : { },
"connectedSince" : "2020-07-29T21:59:05.428Z",
"clientVersion" : "2.5.2",
"address" : "/172.17.0.1:56058"
} ],
"isReplicated" : false
},
"public/default/SecretPrintingFunction" : {
"msgRateOut" : 0.0,
"msgThroughputOut" : 0.0,
"bytesOutCounter" : 0,
"msgOutCounter" : 0,
"msgRateRedeliver" : 0.0,
"msgBacklog" : 0,
"msgBacklogNoDelayed" : 0,
"blockedSubscriptionOnUnackedMsgs" : false,
"msgDelayed" : 0,
"unackedMessages" : 0,
"type" : "Shared",
"msgRateExpired" : 0.0,
"lastExpireTimestamp" : 0,
"lastConsumedFlowTimestamp" : 1596059952000,
"lastConsumedTimestamp" : 0,
"lastAckedTimestamp" : 0,
"consumers" : [ {
"msgRateOut" : 0.0,
"msgThroughputOut" : 0.0,
"bytesOutCounter" : 0,
"msgOutCounter" : 0,
"msgRateRedeliver" : 0.0,
"consumerName" : "84508",
"availablePermits" : 1000,
"unackedMessages" : 0,
"blockedConsumerOnUnackedMsgs" : false,
"lastAckedTimestamp" : 0,
"lastConsumedTimestamp" : 0,
"metadata" : {
"instance_id" : "0",
"application" : "pulsar-function",
"id" : "public/default/SecretPrintingFunction"
},
"connectedSince" : "2020-07-29T21:59:11.97Z",
"clientVersion" : "2.5.2",
"address" : "/127.0.0.1:55426"
} ],
"isReplicated" : false
},
"debug-sub" : {
"msgRateOut" : 0.0,
"msgThroughputOut" : 0.0,
"bytesOutCounter" : 0,
"msgOutCounter" : 0,
"msgRateRedeliver" : 0.0,
"msgBacklog" : 3,
"msgBacklogNoDelayed" : 3,
"blockedSubscriptionOnUnackedMsgs" : false,
"msgDelayed" : 0,
"unackedMessages" : 0,
"msgRateExpired" : 0.0,
"lastExpireTimestamp" : 0,
"lastConsumedFlowTimestamp" : 0,
"lastConsumedTimestamp" : 0,
"lastAckedTimestamp" : 0,
"consumers" : [ ],
"isReplicated" : false
}
},
"replication" : { },
"deduplicationStatus" : "Disabled"
}```
----
2020-07-29 22:17:16 UTC - Chris Hansen: also I see the function instance
created but `process` is never called
----
2020-07-29 22:20:51 UTC - Sijie Guo: the backlog of t`est-sub` and
`public/default/SecretPrintingFunction` are all 0 and `debug-sub` is 3.
----
2020-07-29 22:21:16 UTC - Sijie Guo: It seems that `debug-sub` is the third
subscription you mentioned.
----
2020-07-29 22:21:20 UTC - Chris Hansen: yes
----
2020-07-29 22:21:38 UTC - Sijie Guo: If so, that means both subscriptions have
already read and acked the messages.
----
2020-07-29 22:21:56 UTC - Sijie Guo: Can you also get `topics stats-internal`?
----
2020-07-29 22:22:22 UTC - Addison Higham: Hrm... I am not super familiar with
pulsar SQL... if you can add some more context it might be best to create a user
----
2020-07-29 22:24:17 UTC - Chris Hansen: hmm, I’m seeing the consumer hang
(seemingly) on the first call to receive and no topic output or logging from
the function (after initial creation). Getting stats-internal now.
----
2020-07-29 22:26:27 UTC - Chris Hansen: ```{
"entriesAddedCounter" : 3,
"numberOfEntries" : 3,
"totalSize" : 200,
"currentLedgerEntries" : 3,
"currentLedgerSize" : 200,
"lastLedgerCreatedTimestamp" : "2020-07-29T22:23:23.762Z",
"waitingCursorsCount" : 2,
"pendingAddEntriesCount" : 0,
"lastConfirmedEntry" : "12:2",
"state" : "LedgerOpened",
"ledgers" : [ {
"ledgerId" : 12,
"entries" : 0,
"size" : 0,
"offloaded" : false
} ],
"cursors" : {
"debug-sub" : {
"markDeletePosition" : "12:-1",
"readPosition" : "12:-1",
"waitingReadOp" : false,
"pendingReadOps" : 0,
"messagesConsumedCounter" : 0,
"cursorLedger" : 18,
"cursorLedgerLastEntry" : 1,
"individuallyDeletedMessages" : "[]",
"lastLedgerSwitchTimestamp" : "2020-07-29T22:25:07.769Z",
"state" : "Open",
"numberOfEntriesSinceFirstNotAckedMessage" : 0,
"totalNonContiguousDeletedMessagesRange" : 0,
"properties" : { }
},
"public%2Fdefault%2FSecretPrintingFunction" : {
"markDeletePosition" : "12:2",
"readPosition" : "12:3",
"waitingReadOp" : true,
"pendingReadOps" : 0,
"messagesConsumedCounter" : 3,
"cursorLedger" : 17,
"cursorLedgerLastEntry" : 0,
"individuallyDeletedMessages" : "[]",
"lastLedgerSwitchTimestamp" : "2020-07-29T22:23:31.831Z",
"state" : "Open",
"numberOfEntriesSinceFirstNotAckedMessage" : 1,
"totalNonContiguousDeletedMessagesRange" : 0,
"properties" : { }
},
"test-sub" : {
"markDeletePosition" : "12:2",
"readPosition" : "12:3",
"waitingReadOp" : true,
"pendingReadOps" : 0,
"messagesConsumedCounter" : 3,
"cursorLedger" : 14,
"cursorLedgerLastEntry" : 0,
"individuallyDeletedMessages" : "[]",
"lastLedgerSwitchTimestamp" : "2020-07-29T22:23:24.316Z",
"state" : "Open",
"numberOfEntriesSinceFirstNotAckedMessage" : 1,
"totalNonContiguousDeletedMessagesRange" : 0,
"properties" : { }
}
}
}```
----
2020-07-29 22:31:45 UTC - Chris Hansen: so it looks like the messages are being
delivered to both the consumer and function but I don’t understand how my
breakpoint never advances past `consumer.receive()` and I have less visibility
into the function (not using LocalRunner at the moment) but it doesn’t seem
like the `process` method is being called (logs show up for function creation
but no log for `process`)
----
2020-07-29 22:36:27 UTC - Joshua Decosta: ok, I’ll make a PR to fix it.
----
2020-07-29 22:39:19 UTC - Sijie Guo: the internal stats show that the messages
have been delivered and acked.
For function, can you run `functions stats` to get the stats of a function?
----
2020-07-29 22:44:38 UTC - Chris Hansen: I did restart my test suite (which
starts up a container) so here are the new stats-internal and function stats.
They don’t seem to match up if I understand correctly (3 consumed according to
stats-internal, 0 according to function stats).
topic stats-internal:
```{
"entriesAddedCounter" : 3,
"numberOfEntries" : 3,
"totalSize" : 200,
"currentLedgerEntries" : 3,
"currentLedgerSize" : 200,
"lastLedgerCreatedTimestamp" : "2020-07-29T22:40:27.365Z",
"waitingCursorsCount" : 2,
"pendingAddEntriesCount" : 0,
"lastConfirmedEntry" : "10:2",
"state" : "LedgerOpened",
"ledgers" : [ {
"ledgerId" : 10,
"entries" : 0,
"size" : 0,
"offloaded" : false
} ],
"cursors" : {
"debug-sub" : {
"markDeletePosition" : "10:-1",
"readPosition" : "10:-1",
"waitingReadOp" : false,
"pendingReadOps" : 0,
"messagesConsumedCounter" : 0,
"cursorLedger" : 11,
"cursorLedgerLastEntry" : 1,
"individuallyDeletedMessages" : "[]",
"lastLedgerSwitchTimestamp" : "2020-07-29T22:40:27.375Z",
"state" : "Open",
"numberOfEntriesSinceFirstNotAckedMessage" : 0,
"totalNonContiguousDeletedMessagesRange" : 0,
"properties" : { }
},
"public%2Fdefault%2FSecretPrintingFunction" : {
"markDeletePosition" : "10:2",
"readPosition" : "10:3",
"waitingReadOp" : true,
"pendingReadOps" : 0,
"messagesConsumedCounter" : 3,
"cursorLedger" : 18,
"cursorLedgerLastEntry" : 0,
"individuallyDeletedMessages" : "[]",
"lastLedgerSwitchTimestamp" : "2020-07-29T22:40:48.511Z",
"state" : "Open",
"numberOfEntriesSinceFirstNotAckedMessage" : 1,
"totalNonContiguousDeletedMessagesRange" : 0,
"properties" : { }
},
"test-sub" : {
"markDeletePosition" : "10:2",
"readPosition" : "10:3",
"waitingReadOp" : true,
"pendingReadOps" : 0,
"messagesConsumedCounter" : 3,
"cursorLedger" : 15,
"cursorLedgerLastEntry" : 0,
"individuallyDeletedMessages" : "[]",
"lastLedgerSwitchTimestamp" : "2020-07-29T22:40:41.264Z",
"state" : "Open",
"numberOfEntriesSinceFirstNotAckedMessage" : 1,
"totalNonContiguousDeletedMessagesRange" : 0,
"properties" : { }
}
}
}```
function stats
```{
"receivedTotal" : 0,
"processedSuccessfullyTotal" : 0,
"systemExceptionsTotal" : 0,
"userExceptionsTotal" : 0,
"avgProcessLatency" : null,
"1min" : {
"receivedTotal" : 0,
"processedSuccessfullyTotal" : 0,
"systemExceptionsTotal" : 0,
"userExceptionsTotal" : 0,
"avgProcessLatency" : null
},
"lastInvocation" : null,
"instances" : [ {
"instanceId" : 0,
"metrics" : {
"receivedTotal" : 0,
"processedSuccessfullyTotal" : 0,
"systemExceptionsTotal" : 0,
"userExceptionsTotal" : 0,
"avgProcessLatency" : null,
"1min" : {
"receivedTotal" : 0,
"processedSuccessfullyTotal" : 0,
"systemExceptionsTotal" : 0,
"userExceptionsTotal" : 0,
"avgProcessLatency" : null
},
"lastInvocation" : null,
"userMetrics" : { }
}
} ]
}```
----
2020-07-29 23:16:02 UTC - Joshua Decosta: @Addison Higham Should we expect the
subscription value to also always be passed into canConsume?
----
2020-07-29 23:27:12 UTC - Addison Higham: @Joshua Decosta in this case I don't
think we have a subscription, which is valid
----
2020-07-29 23:27:42 UTC - Joshua Decosta: ok so there are times when not having
a subscription is ok? if so that seems like it should be noted in the
AuthorizationProvider’s canConsumeAsync
----
2020-07-30 08:05:08 UTC - Takahiro Hozumi: Message deduplication is disabled by
default.
```<https://pulsar.apache.org/docs/en/cookbooks-deduplication/>
> By default, it is disabled on all namespaces.```
Do you know why?
Is there any downside?
I think that nobody want to avoid deduplication.
----
2020-07-30 08:39:20 UTC - zilinLiu: @zilinLiu has joined the channel
----