2019-02-14 15:21:16 UTC - Siva Prasad Rao Janapati: @jia zhai I read the Pulsar documentation and set up asynchronous replication between two clusters.
----
2019-02-14 15:23:38 UTC - Matteo Merli: Have you set the replication clusters in the namespace policies? `pulsar-admin namespaces policies $NAMESPACE`
Can you check the topic stats? That will tell the state of the replicators as well. `pulsar-admin topics stats $TOPIC`
----
2019-02-14 15:40:27 UTC - Karthik Palanivelu: Yes @Ali Ahmed, I am using hostIP, but I can run only one broker per node, which does not feel like an ideal design if I consider resource utilization.
----
2019-02-14 15:45:31 UTC - Karthik Palanivelu: @Siva Prasad Rao Janapati I tested this earlier on Kubernetes and it was working. Can you please check the logs on the broker? They should let you know the issue. You can check whether your configurations are similar to the ones I had from here - <https://medium.com/@pckeyan/apache-pulsar-geo-replication-ad4f0ca3224b>
----
2019-02-14 15:47:41 UTC - Siva Prasad Rao Janapati: @Karthik Palanivelu I went through your article as well and cross-verified the settings. I did all of them on my clusters. I checked the broker logs but I am not seeing anything related to replication.
----
2019-02-14 15:48:54 UTC - Matteo Merli: @Siva Prasad Rao Janapati can you paste here the namespace policies and the topic stats?
----
2019-02-14 15:49:05 UTC - Matteo Merli: That should help in understanding the issue
----
2019-02-14 15:54:39 UTC - Siva Prasad Rao Janapati: I gave the same cluster name in both data centers. Does this cause any problem? Anyhow, I am running the namespace policies command. I will post the output here.
----
2019-02-14 16:03:43 UTC - Siva Prasad Rao Janapati: Here is the output of topic stats and namespace policies.
----
2019-02-14 16:03:52 UTC - Siva Prasad Rao Janapati:
==========DC2 name space policies==========
{ "auth_policies" : { "namespace_auth" : { }, "destination_auth" : { } }, "replication_clusters" : [ "pulsar-cluster-1" ], "bundles" : { "boundaries" : [ "0x00000000", "0x40000000", "0x80000000", "0xc0000000", "0xffffffff" ], "numBundles" : 4 }, "backlog_quota_map" : { }, "clusterDispatchRate" : { }, "subscriptionDispatchRate" : { }, "latency_stats_sample_rate" : { }, "message_ttl_in_seconds" : 0, "deleted" : false, "encryption_required" : false, "subscription_auth_mode" : "None", "max_producers_per_topic" : 0, "max_consumers_per_topic" : 0, "max_consumers_per_subscription" : 0, "compaction_threshold" : 0, "offload_threshold" : -1 }
========DC2 topic stats==================
./bin/pulsar-admin topics stats persistent://my-tenant/global/my-namespace/my-topic
{ "msgRateIn" : 0.0, "msgThroughputIn" : 0.0, "msgRateOut" : 0.0, "msgThroughputOut" : 0.0, "averageMsgSize" : 0.0, "storageSize" : 0, "publishers" : [ ], "subscriptions" : { }, "replication" : { }, "deduplicationStatus" : "Disabled" }
============DC1 name space policies =============
{ "auth_policies" : { "namespace_auth" : { }, "destination_auth" : { } }, "replication_clusters" : [ "pulsar-cluster-1" ], "bundles" : { "boundaries" : [ "0x00000000", "0x40000000", "0x80000000", "0xc0000000", "0xffffffff" ], "numBundles" : 4 }, "backlog_quota_map" : { }, "clusterDispatchRate" : { }, "subscriptionDispatchRate" : { }, "latency_stats_sample_rate" : { }, "message_ttl_in_seconds" : 0, "deleted" : false, "encryption_required" : false, "subscription_auth_mode" : "None", "max_producers_per_topic" : 0, "max_consumers_per_topic" : 0, "max_consumers_per_subscription" : 0, "compaction_threshold" : 0, "offload_threshold" : -1 }
========DC1 topic stats==================
./bin/pulsar-admin topics stats persistent://my-tenant/global/my-namespace/my-topic
{ "msgRateIn" : 0.0, "msgThroughputIn" : 0.0, "msgRateOut" : 0.0, "msgThroughputOut" : 0.0, "averageMsgSize" : 0.0, "storageSize" : 575720, "publishers" : [ ], "subscriptions" : { "test" : { "msgRateOut" : 0.0, "msgThroughputOut" : 0.0, "msgRateRedeliver" : 0.0, "msgBacklog" : 10000, "blockedSubscriptionOnUnackedMsgs" : false, "unackedMessages" : 0, "msgRateExpired" : 0.0, "consumers" : [ ] } }, "replication" : { }, "deduplicationStatus" : "Disabled" }
=====Command used to produce messages ====================
./bin/pulsar-client produce persistent://my-tenant/global/my-namespace/my-topic -n 1000 -m "hello"
----
2019-02-14 16:10:01 UTC - Matteo Merli: Yes, the cluster names need to be different, since each cluster needs to be aware of the others
----
2019-02-14 16:10:21 UTC - Matteo Merli: And keep track of where it’s reading from
----
2019-02-14 16:10:51 UTC - Siva Prasad Rao Janapati: @Matteo Merli Thanks, let me try changing the cluster name on DC2
----
2019-02-14 16:11:01 UTC - Matteo Merli: In the “replication_clusters” section in the policies above, you’d need to have both cluster names
----
2019-02-14 16:11:43 UTC - Siva Prasad Rao Janapati: I got your point
----
2019-02-14 17:10:43 UTC - Siva Prasad Rao Janapati: When I run the command below on DC1, where I have cluster 1, I get an error:
./bin/pulsar-admin namespaces set-clusters my-tenant-test/global/my-namespace --clusters pulsar-cluster-1,pulsar-cluster-2
----
2019-02-14 17:10:45 UTC - Siva Prasad Rao Janapati: Invalid cluster id: pulsar-cluster-2 Reason: Invalid cluster id: pulsar-cluster-2
----
2019-02-14 17:15:04 UTC - Siva Prasad Rao Janapati: Guys, my bad. I fixed the above issue
----
2019-02-14 17:15:09 UTC - Siva Prasad Rao Janapati: Now I am able to replicate
----
2019-02-14 17:15:29 UTC - Siva Prasad Rao Janapati: Able to see the messages produced on cluster 1 on cluster 2
----
2019-02-14 17:15:34 UTC - Siva Prasad Rao Janapati: Thanks for your suport
----
2019-02-14 17:15:36 UTC - Siva Prasad Rao Janapati: support
----
2019-02-14 17:25:13 UTC - Matteo Merli: :+1:
----
2019-02-14 18:10:34 UTC - Sanjeev Kulkarni: @dba windowing is supported only for java atm
----
2019-02-14 21:01:08 UTC - Ali Ahmed: depends on your production; it’s definitely not a hardened system. You can also look at node port or cluster ip configurations
----
2019-02-14 21:52:50 UTC - Karthik Palanivelu: I missed a point in my question, my apologies. If I am running synchronous replication I should publish the hostIP to the brokers in other clusters. Let me try node port on brokers as a service and get back to you if I can add more brokers.
----
2019-02-15 04:53:48 UTC - bossbaby: @Matteo Merli when will you fix reset-cursor in global-topic? I'm looking forward to it
----
2019-02-15 05:58:20 UTC - bossbaby: In Pulsar, everything worked normally until the VM's CPU alerted at 90%. When I went to check, data could not be written to the bookie. After running the `bin/bookkeeper shell listbookies -rw` command, the result was "No bookie exists!" and all my topics have been lost. Can someone explain and help me?
----
2019-02-15 06:07:47 UTC - Ali Ahmed: @bossbaby what was your cluster setup?
----
2019-02-15 06:09:38 UTC - bossbaby: I use geo-replication, and the cluster setup is 1zk-1bk-1br
----
2019-02-15 06:10:18 UTC - Ali Ahmed: is the other cluster fine?
----
2019-02-15 06:14:36 UTC - bossbaby: other cluster is fine
----
2019-02-15 06:15:27 UTC - Ali Ahmed: try this ```bookkeeper autorecovery```
----
2019-02-15 06:21:07 UTC - bossbaby: this result:
```
06:17:59.922 [main-EventThread] INFO org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase - ZooKeeper client is connected now.
06:18:00.196 [main] INFO org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to initialize DNS Resolver org.apache.bookkeeper.net.ScriptBasedMapping, used default subnet resolver : java.lang.RuntimeException: No network topology script is found when using script based DNS resolver.
06:18:00.251 [main] INFO org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Initialize rackaware ensemble placement policy @ <Bookie:127.0.1.1:0> @ /default-rack : org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl$DefaultResolver.
06:18:00.252 [main] INFO org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Not weighted
06:18:00.295 [main] INFO org.apache.bookkeeper.client.BookKeeper - Weighted ledger placement is not enabled
06:18:00.683 [main] INFO org.apache.bookkeeper.replication.AutoRecoveryMain - Register shutdown hook successfully
```
----
2019-02-15 06:21:59 UTC - Ali Ahmed: can you get log files and link them here
----
2019-02-15 06:25:55 UTC - bossbaby: this log: <https://gist.github.com/tuan6956/1a746f1bb0a15aae98288282618e562e>
----
2019-02-15 06:27:51 UTC - Ali Ahmed: what about zk and bookie logs
----
2019-02-15 06:38:19 UTC - bossbaby: i have a zk log: <https://gist.github.com/tuan6956/865a81d04793de6ad77840763973d65e>
----
2019-02-15 06:39:38 UTC - Ali Ahmed: looks like pulsar is expecting it to be a fresh instance. @Sijie Guo can probably understand this better
----
2019-02-15 06:42:21 UTC - bossbaby: thank you @Ali Ahmed, I hope someone can help me
----
2019-02-15 06:57:21 UTC - Sijie Guo: can you check if the bookie process is still running? and can you share the bookie log with me?
----
2019-02-15 07:29:49 UTC - bossbaby: I started the broker in the foreground so I can't get the bookie log
----
2019-02-15 07:49:10 UTC - Sijie Guo: bookie has its own log
----
2019-02-15 07:49:19 UTC - Sijie Guo: can you first check if the process is still running?
----
2019-02-15 07:55:35 UTC - bossbaby: yes, the process is running; I stopped it, upgraded it, and everything works again
----
2019-02-15 07:56:04 UTC - bossbaby: and I can't find a log file for the bookie
----
2019-02-15 07:56:14 UTC - Sijie Guo: ok - it would be good if you can pipe the output to a file
----
2019-02-15 07:56:24 UTC - Sijie Guo: if you start in foreground, it is in the console
----
2019-02-15 07:59:33 UTC - bossbaby: Thanks @Sijie Guo, but sadly it was lost
----
2019-02-15 08:00:21 UTC - Sijie Guo: yeah I would suggest you either start in background or pipe the output to a file.
----
2019-02-15 08:00:35 UTC - Sijie Guo: otherwise it is very hard to troubleshoot such problems
+1 : bossbaby
----
2019-02-15 08:01:00 UTC - bossbaby: to delete a global-topic I did:
1. Stop all producers and consumers
2. Delete all subscriptions
3. Set retention on the namespace so that pulsar deletes it
is it right?
+1 : Matteo Merli
----
2019-02-15 08:09:03 UTC - bossbaby: so I have a problem, all global-topics are different from retention: -1 when adjusting 1M or 1H will lose old data
----
2019-02-15 08:53:56 UTC - bossbaby: If there is another way to delete, please show me
----
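For the geo-replication thread earlier in this log, the two checks Matteo asked for (the namespace's `replication_clusters` and the topic's `replication` stats) can be sketched as a small script. This is only a minimal sketch, assuming the JSON printed by `pulsar-admin namespaces policies` and `pulsar-admin topics stats` has been pasted or loaded by hand. The function name `replication_problems` is hypothetical and not part of Pulsar, and the "fixed" sample values are placeholders rather than real command output.

```python
import json
from typing import List


def replication_problems(policies: dict, topic_stats: dict) -> List[str]:
    """Return reasons why geo-replication looks misconfigured (hypothetical helper)."""
    problems = []

    # Each cluster needs its own name, and the namespace must list all of them.
    clusters = sorted(set(policies.get("replication_clusters", [])))
    if len(clusters) < 2:
        problems.append(
            "replication_clusters lists %s; expected at least two distinct cluster names"
            % clusters
        )

    # An empty "replication" map in the topic stats means no replicator is attached.
    if not topic_stats.get("replication"):
        problems.append("topic stats show no replicators")

    return problems


# Trimmed copies of the DC1 output pasted in the thread.
dc1_policies = json.loads('{"replication_clusters": ["pulsar-cluster-1"]}')
dc1_stats = json.loads('{"storageSize": 575720, "replication": {}}')

for problem in replication_problems(dc1_policies, dc1_stats):
    print(problem)

# Assumed state after the fix; the stats entry is a placeholder, not real output.
fixed_policies = {"replication_clusters": ["pulsar-cluster-1", "pulsar-cluster-2"]}
fixed_stats = {"replication": {"pulsar-cluster-2": {}}}
print(replication_problems(fixed_policies, fixed_stats))
```

With the DC1 values from the thread, both checks fail, which matches what was diagnosed: the same cluster name was used in both data centers, so the namespace listed only one cluster and no replicator was created.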
