2018-03-12 13:22:47 UTC - Daniel Ferreira Jorge: Hi! I'm getting a `Provider org.apache.pulsar.shade.org.glassfish.jersey.internal.RuntimeDelegateImpl could not be instantiated: java.lang.IllegalStateException: No generator was provided and there is no default generator registered` after instantiating `org.apache.pulsar.client.admin.PulsarAdmin`. In my pom I have `pulsar-client` and `pulsar-client-admin`, 1.22 ----
2018-03-12 13:24:05 UTC - Daniel Ferreira Jorge: Do I need anything else as a dependency? ----
2018-03-12 13:28:52 UTC - Daniel Ferreira Jorge: @Daniel Ferreira Jorge uploaded a file: <https://apache-pulsar.slack.com/files/U8E1J0DHS/F9P6DN362/stack.txt|Stack> ----
2018-03-12 15:34:06 UTC - Daniel Ferreira Jorge: To make things work I had to use `pulsar-client-original` and `pulsar-client-admin-original` (what is the difference between original and "non-original"?). Also, I had to add `jackson-annotations` and `jackson-jaxrs-base` to my pom... is this correct? ----
2018-03-12 15:38:02 UTC - Daniel Ferreira Jorge: I think I found a small bug (unrelated to the above): when I set `retentionTimeInMinutes=-1` and `retentionSizeInMB=-1` I cannot set any value for `BacklogQuota`, because any value will give `Backlog Quota exceeds configured retention quota for namespace` ----
2018-03-12 16:20:37 UTC - Sijie Guo: @Daniel Ferreira Jorge: It seems that a shaded class is not found: “java.lang.ClassNotFoundException: Provider org.apache.pulsar.shade.org.glassfish.jersey.internal.RuntimeDelegateImpl could not be instantiated:” ----
2018-03-12 16:20:39 UTC - Sijie Guo: checking now ----
2018-03-12 16:21:27 UTC - Sijie Guo: > what is the difference between original and “non-original”?
the “original” doesn’t shade any dependencies, while the “non-original” shades the dependencies that the Pulsar client/admin client uses. ----
2018-03-12 16:24:31 UTC - Daniel Ferreira Jorge: ok ----
2018-03-12 16:24:37 UTC - Daniel Ferreira Jorge: thanks ----
2018-03-12 16:25:15 UTC - Daniel Ferreira Jorge: I created a PR for the other bug I mentioned: <https://github.com/apache/incubator-pulsar/pull/1368> ----
2018-03-12 16:26:42 UTC - Ivan Kelly: do we actually document anywhere that you should use -1 to disable? ----
2018-03-12 16:27:04 UTC - Daniel Ferreira Jorge: yes ----
2018-03-12 16:27:12 UTC - Daniel Ferreira Jorge: 1 sec ----
2018-03-12 16:27:53 UTC - Daniel Ferreira Jorge: <https://pulsar.apache.org/docs/latest/advanced/RetentionExpiry/#Retentionpolicies-lieva> ----
2018-03-12 16:28:09 UTC - Daniel Ferreira Jorge: "It is also possible to set infinite retention time or size, by setting -1 for either time or size retention." ----
2018-03-12 16:28:38 UTC - Daniel Ferreira Jorge: also <https://github.com/apache/incubator-pulsar/pull/1135> ----
2018-03-12 16:31:52 UTC - Ivan Kelly: ah ----
2018-03-12 17:19:53 UTC - Sijie Guo: @Daniel Ferreira Jorge I think the problem comes from a bug in the shading plugin - <https://issues.apache.org/jira/browse/MSHADE-182>. Class org.glassfish.hk2.extension.ServiceLocatorGenerator gets relocated, but the META-INF/services/org.glassfish.hk2.extension.ServiceLocatorGenerator file doesn’t, so the implementation can’t be looked up. /cc @Matteo Merli ----
2018-03-12 17:23:51 UTC - Sijie Guo: I think pulsar-client-admin only got shaded in 1.22.0-incubating, so in 1.21.0-incubating you were using the “non-shaded” dependency.
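The backlog-quota bug Daniel reported above (fixed in <https://github.com/apache/incubator-pulsar/pull/1368>) can be sketched as follows. This is a hypothetical simplification for illustration, not the actual Pulsar broker source; the class and method names are invented:

```java
// Hypothetical simplification of the broker-side validation fixed in PR #1368.
// Retention of -1 means "infinite", so it should never constrain a backlog quota,
// but a plain numeric comparison treats -1 as smaller than any positive quota.
public class RetentionQuotaCheck {

    // Buggy check: with retention = -1, every positive quota is rejected,
    // producing "Backlog Quota exceeds configured retention quota for namespace".
    static boolean quotaAllowedBuggy(long retentionSizeInMB, long backlogQuotaMB) {
        return backlogQuotaMB <= retentionSizeInMB;
    }

    // Fixed check: treat -1 (infinite retention) as always large enough.
    static boolean quotaAllowedFixed(long retentionSizeInMB, long backlogQuotaMB) {
        return retentionSizeInMB == -1 || backlogQuotaMB <= retentionSizeInMB;
    }

    public static void main(String[] args) {
        System.out.println(quotaAllowedBuggy(-1, 100)); // false - the reported bug
        System.out.println(quotaAllowedFixed(-1, 100)); // true - infinite retention
        System.out.println(quotaAllowedFixed(50, 100)); // false - quota really exceeds retention
    }
}
```

With the fix, a finite quota is still rejected when it genuinely exceeds a finite retention limit; only the infinite (-1) case is exempted.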
---- 2018-03-12 17:29:14 UTC - Daniel Ferreira Jorge: I'm using 1.22 ----
2018-03-12 17:55:09 UTC - Stephen Shepherd: @Stephen Shepherd has joined the channel ----
2018-03-12 18:13:07 UTC - Sijie Guo: @Matteo Merli @Daniel Ferreira Jorge here is the fix: <https://github.com/apache/incubator-pulsar/pull/1370> ----
2018-03-12 18:15:06 UTC - Daniel Ferreira Jorge: @Sijie Guo Thank you! ----
2018-03-12 18:45:03 UTC - Stephen Shepherd: Newbie question - what keeps messages in a standalone cluster from being deleted? I have a standalone cluster running in Docker, 1.21.0-incubating. I am producing and consuming messages successfully, but no messages appear to be deleted. Retention-related configs are all defaults (I have zero retention period and size, zero TTL). I am assuming messages should be deleted as soon as they are acked. The only config change was setting brokerDeleteInactiveTopicsEnabled=false. I believe messages are not being deleted because I can reposition my consumer and read previously acknowledged messages. Internal stats for the topic also show "numberOfEntries" remains unchanged, even after "markDeletePosition" and "readPosition" have moved. Am I missing something else that indicates or controls when messages are removed? ----
2018-03-12 18:47:20 UTC - Matteo Merli: @Stephen Shepherd By being deleted, do you mean physically from disk or “logically”? ----
2018-03-12 18:47:44 UTC - Matteo Merli: you can check `bin/pulsar-admin persistent stats $MY_TOPIC` to get the stats ----
2018-03-12 18:48:35 UTC - Matteo Merli: in general, data is not immediately physically deleted from disk; rather, deletions are done in bulk once the data is marked to be deleted ----
2018-03-12 18:52:11 UTC - Stephen Shepherd: @Matteo Merli Thank you for the quick reply. I was expecting physical deletion. What controls the bulk delete timing? ----
2018-03-12 19:48:45 UTC - Matteo Merli: @Stephen Shepherd There are a few layers here:
* First, messages are stored in BookKeeper “ledgers”. Each ledger is an append-only replicated log and can only be deleted entirely. So even if you consume a few entries, the ledger won’t be deleted until all messages stored in that ledger are consumed and acknowledged for all subscriptions (plus, eventually, the retention time). Ledgers are rolled over on a size and time basis, and there are a few tunables to set in `broker.conf`:
  * `managedLedgerMaxEntriesPerLedger=50000`
  * `managedLedgerMinLedgerRolloverTimeMinutes=10`
  * `managedLedgerMaxLedgerRolloverTimeMinutes=240`
* When a ledger is deleted, the bookies (storage nodes) won’t delete the data immediately. Rather, they rely on a garbage collection process. This GC runs periodically, checks for deleted ledgers, and sees whether data on disk can be removed. Since there is no single file per ledger, the bookie will compact the entry log files based on thresholds:
  * Garbage collection time: `gcWaitTime=900000` (default is 15 min) - all empty files are removed
  * Minor compaction - runs every 1h and compacts all the files with < 20% “valid” data - `minorCompactionThreshold=0.2` - `minorCompactionInterval=3600`
  * Major compaction - runs every 24h and compacts all the files with < 50% “valid” data - `majorCompactionThreshold=0.5` - `majorCompactionInterval=86400` ----
2018-03-12 19:58:33 UTC - Stephen Shepherd: @Matteo Merli Thank you for the detailed response. Very helpful! ----
2018-03-12 21:40:42 UTC - Daniel Ferreira Jorge: Hello again! Are there any docs regarding PIP-7? ----
2018-03-12 22:38:10 UTC - Joe Francis: @Daniel Ferreira Jorge Not at this point - what kind of information are you looking for? ----
2018-03-12 22:48:48 UTC - Daniel Ferreira Jorge: Hi @Joe Francis I'm looking for how to make sure that messages and their replicas are not in the same availability zone on GKE ----
2018-03-12 22:50:56 UTC - Daniel Ferreira Jorge: suppose I have 3 bookies, 1 in each availability zone on Google Cloud, and my BookKeeper ensemble is set to 3. I want each replica in a different availability zone... ----
2018-03-12 22:51:54 UTC - Daniel Ferreira Jorge: is the affinity group a setting in each bookie's conf? ----
2018-03-12 22:52:02 UTC - Joe Francis: Ah, ok, but PIP-7 works on brokers. You might want to look into the rack placement policy in BookKeeper ----
2018-03-12 22:52:48 UTC - Daniel Ferreira Jorge: ah... ok... I thought it was for everything ----
2018-03-12 22:53:25 UTC - Daniel Ferreira Jorge: hahaha now I have to set 2 things ----
2018-03-12 22:54:22 UTC - Joe Francis: PIP-7 is a tweak to the load balancer. The load balancer would attempt to place the namespaces in a given AA (anti-affinity) group on different brokers. ----
2018-03-12 22:56:23 UTC - Joe Francis: If you want 3 different copies in different AZs, you should do something like define each AZ as a rack, and then enable rack-aware placement in BK. @Sijie Guo or @Matteo Merli? ----
2018-03-12 22:57:02 UTC - Daniel Ferreira Jorge: I'm looking for a way to guarantee that Pulsar will keep working perfectly (brokers and data) if a whole zone is down ----
2018-03-12 22:59:22 UTC - Matteo Merli: @Daniel Ferreira Jorge For brokers there should be nothing additional to do. PIP-7 was done to minimize the impact of a malfunctioning broker: if you spread the topics over more brokers (mixed with other topics from other tenants), one bad broker might impact 20% of traffic instead of 100% ----
2018-03-12 23:00:00 UTC - Matteo Merli: when running in multiple AZs, you just need to provision broker VMs in the different AZs ----
2018-03-12 23:01:14 UTC - Matteo Merli: for bookies, there’s the BookKeeper rack-aware policy to configure. Admittedly we have not yet documented how to use it. :confused: That has been pending for some time already; hopefully we can get to it very soon.
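The rack-aware setup discussed here can be sketched like this, treating each GCP availability zone as a "rack". This is only a sketch: the `set-bookie-rack` admin command and the `bookkeeperClientRackawarePolicyEnabled` setting are from later Pulsar releases and may not match 1.22 exactly, and the bookie hostnames and zone names are hypothetical:

```shell
# broker.conf: have the broker's BookKeeper client use rack-aware placement
#   bookkeeperClientRackawarePolicyEnabled=true

# Tag each bookie with a "rack" equal to its availability zone
# (hypothetical hosts and zones; default bookie port 3181):
bin/pulsar-admin bookies set-bookie-rack --bookie bookie-1:3181 --rack us-central1-a
bin/pulsar-admin bookies set-bookie-rack --bookie bookie-2:3181 --rack us-central1-b
bin/pulsar-admin bookies set-bookie-rack --bookie bookie-3:3181 --rack us-central1-c
```

With an ensemble and write quorum of 3, the rack-aware policy then tries to place each of the 3 copies on a bookie in a different rack, i.e. one replica per zone.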
---- 2018-03-12 23:02:52 UTC - Joe Francis: @Daniel Ferreira Jorge We use rack-aware BK policies in production in our DC (not GKE) ----
2018-03-12 23:04:07 UTC - Daniel Ferreira Jorge: ah, ok then @Matteo Merli! I totally misunderstood PIP-7, thanks! ----
2018-03-12 23:06:17 UTC - Daniel Ferreira Jorge: @Joe Francis do you have any rough guidelines on how to implement this? I just need an idea... it does not have to be specific to GKE ----
2018-03-12 23:44:24 UTC - Joe Francis: @Daniel Ferreira Jorge see this <https://github.com/apache/incubator-pulsar/issues/151> from @Matteo Merli ----
2018-03-13 04:05:23 UTC - david-jin: @david-jin uploaded a file: <https://apache-pulsar.slack.com/files/U9GGA4QE7/F9N5HKVA4/image.png|image.png> ----
2018-03-13 04:05:32 UTC - david-jin: I always get this error, but I tested the port and it is OK ----
2018-03-13 04:06:16 UTC - david-jin: @david-jin uploaded a file: <https://apache-pulsar.slack.com/files/U9GGA4QE7/F9PSAJGJ3/image.png|Untitled> ----
2018-03-13 04:06:45 UTC - david-jin: I run in standalone mode. ----