Slack digest for #general - 2018-03-13

Apache Pulsar Slack Tue, 13 Mar 2018 02:11:55 -0700

2018-03-12 13:22:47 UTC - Daniel Ferreira Jorge: Hi! I'm getting a `Provider 
org.apache.pulsar.shade.org.glassfish.jersey.internal.RuntimeDelegateImpl could 
not be instantiated: java.lang.IllegalStateException: No generator was provided 
and there is no default generator registered` after instantiating an 
`org.apache.pulsar.client.admin.PulsarAdmin`. In my pom I have `pulsar-client` 
and `pulsar-client-admin`, version 1.22
----
2018-03-12 13:24:05 UTC - Daniel Ferreira Jorge: Do I need anything else as a 
dependency?
----
2018-03-12 13:28:52 UTC - Daniel Ferreira Jorge: @Daniel Ferreira Jorge 
uploaded a file: 
<https://apache-pulsar.slack.com/files/U8E1J0DHS/F9P6DN362/stack.txt|Stack>
----
2018-03-12 15:34:06 UTC - Daniel Ferreira Jorge: To make things work I had to 
use `pulsar-client-original` and `pulsar-client-admin-original` (what is the 
difference between original and "non-original"?). Also, I had to add 
`jackson-annotations` and `jackson-jaxrs-base` to my pom. Is this correct?
----
2018-03-12 15:38:02 UTC - Daniel Ferreira Jorge: I think I found a small bug 
(unrelated to the above): when I set `retentionTimeInMinutes=-1` and 
`retentionSizeInMB=-1`, I cannot set any value for `BacklogQuota` because any 
value will give `Backlog Quota exceeds configured retention quota for namespace`
----
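A minimal sketch of the kind of check that could produce the behaviour reported above, written in Python for illustration. The function names and the rule that a backlog quota must fit inside the retention size are assumptions inferred from the error message; this is not Pulsar's actual validation code. It shows how a plain comparison rejects every quota once retention is -1, and how treating -1 as unlimited avoids that.

```python
MB = 1024 * 1024
INFINITE = -1  # per the docs quoted in this thread, -1 means "no limit"


def naive_check(backlog_quota_bytes: int, retention_size_mb: int) -> bool:
    """Assumed shape of the buggy check: -1 MB compares as smaller than any quota."""
    return backlog_quota_bytes <= retention_size_mb * MB


def infinity_aware_check(backlog_quota_bytes: int, retention_size_mb: int) -> bool:
    """Same comparison, but -1 is treated as unlimited retention first."""
    if retention_size_mb == INFINITE:
        return True
    return backlog_quota_bytes <= retention_size_mb * MB


print(naive_check(10 * MB, INFINITE))           # False: every quota is rejected
print(infinity_aware_check(10 * MB, INFINITE))  # True
```

Only the size dimension is modelled here; a time-based retention of -1 would presumably need the same special case.
----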
2018-03-12 16:20:37 UTC - Sijie Guo: @Daniel Ferreira Jorge: It seems that a 
shaded class is not found : “java.lang.ClassNotFoundException: Provider 
org.apache.pulsar.shade.org.glassfish.jersey.internal.RuntimeDelegateImpl could 
not be instantiated:”
----
2018-03-12 16:20:39 UTC - Sijie Guo: checking now
----
2018-03-12 16:21:27 UTC - Sijie Guo: > what is the difference between 
original and “non-original”?

the “original” doesn’t shade any dependencies, while “non-original” shades the 
dependencies that pulsar client/admin-client is using.
----
2018-03-12 16:24:31 UTC - Daniel Ferreira Jorge: ok
----
2018-03-12 16:24:37 UTC - Daniel Ferreira Jorge: thanks
----
2018-03-12 16:25:15 UTC - Daniel Ferreira Jorge: I created a PR for the other 
bug I mentioned <https://github.com/apache/incubator-pulsar/pull/1368>
----
2018-03-12 16:26:42 UTC - Ivan Kelly: do we actually document anywhere that you 
should use -1 to disable it?
----
2018-03-12 16:27:04 UTC - Daniel Ferreira Jorge: yes
----
2018-03-12 16:27:12 UTC - Daniel Ferreira Jorge: 1 sec
----
2018-03-12 16:27:53 UTC - Daniel Ferreira Jorge: 
<https://pulsar.apache.org/docs/latest/advanced/RetentionExpiry/#Retentionpolicies-lieva>
----
2018-03-12 16:28:09 UTC - Daniel Ferreira Jorge: "It is also possible to set 
infinite retention time or size, by setting -1 for either time or size 
retention."
----
2018-03-12 16:28:38 UTC - Daniel Ferreira Jorge: also 
<https://github.com/apache/incubator-pulsar/pull/1135>
----
2018-03-12 16:31:52 UTC - Ivan Kelly: ah
----
2018-03-12 17:19:53 UTC - Sijie Guo: @Daniel Ferreira Jorge I think the problem 
comes from a bug in the shading plugin - 
<https://issues.apache.org/jira/browse/MSHADE-182>. The class 
org.glassfish.hk2.extension.ServiceLocatorGenerator gets relocated but the 
META-INF/services/org.glassfish.hk2.extension.ServiceLocatorGenerator file 
doesn’t, so the implementation can’t be looked up. /cc @Matteo Merli
----
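The lookup failure described above can be modelled with a toy example. This is only a sketch in Python: the jar is represented as a dictionary, `find_providers` imitates a ServiceLoader-style lookup that reads the services file named after the interface, and `example.GeneratorImpl` is a hypothetical provider name. The shade prefix is the one visible in the error message.

```python
SERVICES_DIR = "META-INF/services/"
SHADE_PREFIX = "org.apache.pulsar.shade."  # prefix seen in the error message


def relocate(name: str) -> str:
    """Apply the shade relocation to a fully qualified class name."""
    return SHADE_PREFIX + name


def find_providers(jar: dict, interface: str) -> list:
    """Read the services file named after the interface, if the jar has one."""
    return jar.get(SERVICES_DIR + interface, [])


interface = "org.glassfish.hk2.extension.ServiceLocatorGenerator"
provider = "example.GeneratorImpl"  # hypothetical implementation name

# Classes were relocated, but the services file kept its original name.
broken_jar = {SERVICES_DIR + interface: [relocate(provider)]}

# The services file name was relocated together with the classes.
fixed_jar = {SERVICES_DIR + relocate(interface): [relocate(provider)]}

# Code inside the shaded jar asks for the relocated interface name.
print(find_providers(broken_jar, relocate(interface)))  # []
print(find_providers(fixed_jar, relocate(interface)))
```

With the unrelocated file name the lookup comes back empty, which matches the "no default generator registered" symptom.
----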
2018-03-12 17:23:51 UTC - Sijie Guo: I think pulsar-client-admin only started 
being shaded in 1.22.0-incubating, so with 1.21.0-incubating you would be using 
the “non-shaded” dependency.
----
2018-03-12 17:29:14 UTC - Daniel Ferreira Jorge: I'm using the 1.22
----
2018-03-12 17:55:09 UTC - Stephen Shepherd: @Stephen Shepherd has joined the 
channel
----
2018-03-12 18:13:07 UTC - Sijie Guo: @Matteo Merli @Daniel Ferreira Jorge here 
is the fix : <https://github.com/apache/incubator-pulsar/pull/1370>
----
2018-03-12 18:15:06 UTC - Daniel Ferreira Jorge: @Sijie Guo Thank you!
----
2018-03-12 18:45:03 UTC - Stephen Shepherd: Newbie question - What keeps 
messages in a standalone cluster from being deleted?

I have a standalone cluster running in Docker, 1.21.0-incubating. I am 
producing and consuming messages successfully, but no messages appear to be 
deleted.
Retention related configs are all defaults (I have zero retention period and 
size, zero TTL).  I am assuming messages should be deleted as soon as acked.
The only config change was setting brokerDeleteInactiveTopicsEnabled=false.

I believe messages are not being deleted because I can reposition my consumer 
and read previously acknowledged messages.
Internal stats for the topic also show "numberOfEntries" remains unchanged, 
even after "markDeletePosition" and "readPosition" have moved.

Am I missing something else that indicates or controls when messages are 
removed?
----
2018-03-12 18:47:20 UTC - Matteo Merli: @Stephen Shepherd By being deleted, do 
you mean physically from disk or “logically” ?
----
2018-03-12 18:47:44 UTC - Matteo Merli: you can check `bin/pulsar-admin 
persistent stats $MY_TOPIC` to get the stats
----
2018-03-12 18:48:35 UTC - Matteo Merli: in general, data is not immediately 
physically deleted from disk but rather deletions are done in bulk once the 
data is marked to be deleted
----
2018-03-12 18:52:11 UTC - Stephen Shepherd: @Matteo Merli  Thank you for the 
quick reply.  I was expecting physical deletion.  What controls the bulk delete 
timing?
----
2018-03-12 19:48:45 UTC - Matteo Merli: @Stephen Shepherd There are a few 
layers here:

 * First, messages are stored in BookKeeper “ledgers”. Each ledger is an 
   append-only replicated log and can only be deleted in its entirety.
   So even if you consume a few entries, the ledger won’t be deleted until all 
   messages stored in that ledger are consumed and acknowledged for all 
   subscriptions (plus the retention time, if any).
   Ledgers are rolled over on a size and time basis, and there are a few 
   tunables to set in `broker.conf`:
    * `managedLedgerMaxEntriesPerLedger=50000`
    * `managedLedgerMinLedgerRolloverTimeMinutes=10`
    * `managedLedgerMaxLedgerRolloverTimeMinutes=240`

 * When a ledger is deleted, the bookies (storage nodes) won’t delete the data 
   immediately. Rather, they rely on a garbage collection process. This GC 
   runs periodically, checks for deleted ledgers, and sees whether data on 
   disk can be removed.
   Since there is no single file per ledger, the bookie will compact the entry 
   log files based on thresholds:
    * Garbage collection time: `gcWaitTime=900000` (default is 15 min)
      - All empty files are removed

    * Minor compaction -- runs every 1h and compacts all the files with < 20% 
      “valid” data
       - `minorCompactionThreshold=0.2`
       - `minorCompactionInterval=3600`

    * Major compaction -- runs every 24h and compacts all the files with < 50% 
      “valid” data
       - `majorCompactionThreshold=0.5`
       - `majorCompactionInterval=86400`
----
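The two layers described above can be sketched as follows. This is a simplified Python illustration, not broker or bookie code: positions are plain integers rather than ledger/entry pairs, retention is ignored, and `EntryLog`, `ledger_deletable`, and `files_to_compact` are hypothetical names. The thresholds are the ones quoted in the message.

```python
from dataclasses import dataclass


@dataclass
class EntryLog:
    name: str
    valid_ratio: float  # fraction of the file still belonging to live ledgers


def ledger_deletable(last_entry: int, mark_delete_positions: list) -> bool:
    """A ledger can go only once every subscription has acknowledged past its last entry."""
    return all(pos >= last_entry for pos in mark_delete_positions)


def files_to_compact(logs: list, threshold: float) -> list:
    """Pick entry log files whose share of valid data is below the threshold."""
    return [log.name for log in logs if log.valid_ratio < threshold]


logs = [EntryLog("0.log", 0.10), EntryLog("1.log", 0.35), EntryLog("2.log", 0.90)]

print(ledger_deletable(49_999, [49_999, 20_000]))  # False: one subscription lags
print(ledger_deletable(49_999, [49_999, 60_000]))  # True
print(files_to_compact(logs, 0.2))  # minor compaction: ['0.log']
print(files_to_compact(logs, 0.5))  # major compaction: ['0.log', '1.log']
```

This also suggests why `numberOfEntries` can stay unchanged after `markDeletePosition` moves: entries only disappear when a whole ledger becomes deletable, and disk space is reclaimed later still.
----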
2018-03-12 19:58:33 UTC - Stephen Shepherd: @Matteo Merli Thank you for the 
detailed response.  Very helpful!
----
2018-03-12 21:40:42 UTC - Daniel Ferreira Jorge: Hello again! Are there any 
docs regarding PIP-7?
----
2018-03-12 22:38:10 UTC - Joe Francis: @Daniel Ferreira Jorge Not at this point 
- what kind of information are you looking for?
----
2018-03-12 22:48:48 UTC - Daniel Ferreira Jorge: Hi @Joe Francis I'm looking 
for a way to make sure that messages and their replicas are not in the same 
availability zone on GKE
----
2018-03-12 22:50:56 UTC - Daniel Ferreira Jorge: suppose I have 3 bookies, 1 in 
each availability zone on Google Cloud, and my BookKeeper ensemble is set to 3. I 
want each replica in a different availability zone...
----
2018-03-12 22:51:54 UTC - Daniel Ferreira Jorge: is the affinity group a 
setting on each bookie conf?
----
2018-03-12 22:52:02 UTC - Joe Francis: Ah, ok. But PIP-7 works on brokers. You 
might want to look into the rack placement policy in BookKeeper
----
2018-03-12 22:52:48 UTC - Daniel Ferreira Jorge: ah... ok... I thought it was 
for everything
----
2018-03-12 22:53:25 UTC - Daniel Ferreira Jorge: hahaha now I have to set 2 
things
----
2018-03-12 22:54:22 UTC - Joe Francis: PIP-7 is a tweak to the load balancer. 
The load balancer would attempt to place the namespaces in a given AA 
(anti-affinity) group on different brokers.
----
2018-03-12 22:56:23 UTC - Joe Francis: If you want 3 copies in different AZs, 
you should do something like define each AZ as a rack, and then 
enable rack-aware placement in BK. @Sijie Guo or @Matteo Merli ?
----
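The idea of treating each availability zone as a rack can be illustrated with a small Python sketch. This is a strict toy version that refuses to reuse a rack; it is not BookKeeper's rack-aware placement algorithm, and `pick_ensemble`, the bookie names, and the zone labels are all hypothetical.

```python
def pick_ensemble(bookie_racks: dict, ensemble_size: int) -> list:
    """Choose bookies so that no two share a rack; fail if there are too few racks."""
    chosen, used_racks = [], set()
    for bookie in sorted(bookie_racks):
        if len(chosen) == ensemble_size:
            break
        rack = bookie_racks[bookie]
        if rack not in used_racks:
            chosen.append(bookie)
            used_racks.add(rack)
    if len(chosen) < ensemble_size:
        raise ValueError("not enough distinct racks for the requested ensemble size")
    return chosen


# Hypothetical bookie names, each mapped to the zone acting as its rack.
bookies = {
    "bookie-1": "zone-a",
    "bookie-2": "zone-b",
    "bookie-3": "zone-c",
    "bookie-4": "zone-a",
}
print(pick_ensemble(bookies, 3))  # ['bookie-1', 'bookie-2', 'bookie-3']
```

With one copy per zone, losing a whole zone still leaves two of the three copies reachable.
----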
2018-03-12 22:57:02 UTC - Daniel Ferreira Jorge: I'm looking for a way to 
guarantee that pulsar will be working perfectly (brokers and data) if a whole 
zone is down
----
2018-03-12 22:59:22 UTC - Matteo Merli: @Daniel Ferreira Jorge For brokers there 
should be nothing additional to do. PIP-7 was done to minimize the impact of a 
malfunctioning broker: if you spread the topics over more brokers (mixed with 
other topics from other tenants), one bad broker might impact 20% of traffic 
instead of 100%
----
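A rough Python sketch of the anti-affinity idea described above, assuming five brokers so that the numbers line up with the 20% figure. The placement rule (send each namespace of the group to the broker currently holding the fewest of them) and all names are illustrative assumptions, not the actual load balancer logic.

```python
def place_namespaces(namespaces: list, brokers: list) -> dict:
    """Spread the namespaces of one anti-affinity group across brokers."""
    load = {broker: 0 for broker in brokers}
    placement = {}
    for ns in namespaces:
        # Pick the broker with the fewest namespaces of this group so far.
        target = min(brokers, key=lambda b: load[b])
        placement[ns] = target
        load[target] += 1
    return placement


brokers = [f"broker-{i}" for i in range(1, 6)]        # hypothetical names
namespaces = [f"tenant/ns-{i}" for i in range(1, 6)]  # one anti-affinity group

placement = place_namespaces(namespaces, brokers)
failed = "broker-1"
affected = [ns for ns, broker in placement.items() if broker == failed]
print(len(affected) / len(namespaces))  # 0.2
```

If all five namespaces had landed on `broker-1` instead, the same failure would have affected all of them.
----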
2018-03-12 23:00:00 UTC - Matteo Merli: when running in multiple AZs, you just 
need to provision broker VMs in the different AZs
----
2018-03-12 23:01:14 UTC - Matteo Merli: for bookies, there’s the BookKeeper 
rack-aware policy to configure. Admittedly we have not yet documented how to 
use it. :confused: That has been pending for some time already; hopefully we 
can get to it very soon.
----
2018-03-12 23:02:52 UTC - Joe Francis: @Daniel Ferreira Jorge We use rack aware 
BK policies in production in  our DC (not GKE)
----
2018-03-12 23:04:07 UTC - Daniel Ferreira Jorge: ah, ok then @Matteo Merli! I 
totally misunderstood PIP-7, thanks!
----
2018-03-12 23:06:17 UTC - Daniel Ferreira Jorge: @Joe Francis do you have any 
rough guidelines on how to implement this? I just need an idea... it does not 
have to be specific to GKE
----
2018-03-12 23:44:24 UTC - Joe Francis: @Daniel Ferreira Jorge see this 
<https://github.com/apache/incubator-pulsar/issues/151> from @Matteo Merli
----
2018-03-13 04:05:23 UTC - david-jin: @david-jin uploaded a file: 
<https://apache-pulsar.slack.com/files/U9GGA4QE7/F9N5HKVA4/image.png|image.png>
----
2018-03-13 04:05:32 UTC - david-jin: I always get this error, but I tested the 
port and it is OK
----
2018-03-13 04:06:16 UTC - david-jin: @david-jin uploaded a file: 
<https://apache-pulsar.slack.com/files/U9GGA4QE7/F9PSAJGJ3/image.png|Untitled>
----
2018-03-13 04:06:45 UTC - david-jin: I am running in standalone mode.
----
