Slack digest for #general - 2020-01-24

Apache Pulsar Slack Fri, 24 Jan 2020 01:11:25 -0800

2020-01-23 10:12:45 UTC - Alexandre DUVAL: Hi, updating cluster configuration 
files across versions (differences, new possibilities) can be long and not fun. 
Is there any special tooling for that?
----
2020-01-23 10:20:22 UTC - Alexandre DUVAL: I mean something like a script which 
formats the files and adds new configurations or reworks older ones 
interactively, like `New feature XXX, activate it with default value yyy? y/n`, 
`Previous configuration xxx is now yyy with value zzz, confirm y/n?`  WDYT?
----
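The interactive upgrade helper described above could start from a plain diff 
between the old configuration and the new version's defaults. The sketch below 
is only an illustration under assumptions: the function names and the sample 
keys are hypothetical, and no such tool is known to ship with Pulsar.

```python
def plan_config_upgrade(old_conf, new_defaults):
    """Compare an existing config with the defaults of a newer version.

    Returns (added, removed):
      added   - settings that only exist in the new defaults, with their values
      removed - sorted names of settings the new version no longer lists
    """
    added = {k: v for k, v in new_defaults.items() if k not in old_conf}
    removed = sorted(k for k in old_conf if k not in new_defaults)
    return added, removed


def build_prompts(added, removed):
    """Turn the diff into the y/n questions an interactive tool could ask."""
    questions = [
        f"New setting {name}, activate it with default value {value}? y/n"
        for name, value in sorted(added.items())
    ]
    questions += [
        f"Setting {name} is no longer listed, drop it? y/n" for name in removed
    ]
    return questions
```

For example, `plan_config_upgrade({"a": "1", "old": "x"}, {"a": "2", "b": "3"})` 
reports `b` as added and `old` as removed; existing values such as `a` are left 
for the operator to review.
----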
2020-01-23 10:24:55 UTC - Bo Han: Help needed! How can I choose java8 and rerun 
the test on GitHub CI?
----
2020-01-23 10:29:57 UTC - Yong Zhang: Only committers can do this for now.
----
2020-01-23 10:37:40 UTC - darshan: @darshan has joined the channel
----
2020-01-23 12:53:05 UTC - Bo Han: How can I choose the java version to rerun 
the test in this case? I was asked to rerun test in java8:joy:
----
2020-01-23 13:41:37 UTC - Naby: I just checked the python client code in branch 
2.5 and noticed that `pattern_auto_discovery_period` has not been fixed in 
`pulsar.Client.subscribe`. The value of `pattern_auto_discovery_period` has no 
effect on consumer configuration.
----
2020-01-23 14:32:53 UTC - Guilherme Perinazzo: I'm getting the following 
exception trying to start a 2.5 bookie: 
`io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 
byte(s) of direct memory (used: 2147483648, max: 2147483648)`
----
2020-01-23 14:40:29 UTC - Guilherme Perinazzo: Do I need to increase the max 
direct memory setting? It worked fine at 4gb for 2.4.1
----
2020-01-23 14:42:10 UTC - Roman Popenov: I think memory usage is JVM memory + 
direct memory
----
2020-01-23 14:44:06 UTC - Roman Popenov: so if your Xms is set equal to Xmx, 
with the heap at 1g plus 1g of direct memory, you will hit your memory cap if 
it is set to something close to 2g
----
2020-01-23 14:45:14 UTC - Roman Popenov: I normally set Xms to 3/4 of Xmx, plus 
whatever direct memory, and make the total memory slightly higher than Xmx + 
direct
----
2020-01-23 14:45:15 UTC - Guilherme Perinazzo: The server has about 8GB 
available, I don't know why it's dying at 2
----
2020-01-23 14:45:27 UTC - Roman Popenov: :thinking_face:
----
2020-01-23 14:45:50 UTC - Roman Popenov: Well, it failed to allocate 16777216 
bytes
----
2020-01-23 14:46:05 UTC - Guilherme Perinazzo: I'm using the default setting on 
the kubernetes manifest available at the pulsar repo, `-Xms4g -Xmx4g 
-XX:MaxDirectMemorySize=4g`
----
2020-01-23 14:47:29 UTC - Guilherme Perinazzo: Let me try to tweak these
----
2020-01-23 14:49:57 UTC - Roman Popenov: `-Xms2g -Xmx4g 
-XX:MaxDirectMemorySize=3g` with 8 GB for a broker and bookie pod for a rate of 
about ~400k messages a second
----
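The sizing rule discussed above (heap plus direct memory must fit under the pod 
memory limit, with some room to spare) can be checked with a few lines of 
arithmetic. This is a rough sketch: `parse_size` and `jvm_headroom` are 
hypothetical helper names, and real JVMs also need memory for metaspace, thread 
stacks and other native allocations that are not modelled here.

```python
GIB = 1024 ** 3


def parse_size(text):
    """Convert a JVM-style size such as '4g' or '512m' to bytes."""
    units = {"k": 1024, "m": 1024 ** 2, "g": GIB}
    suffix = text[-1].lower()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)  # plain byte count, no suffix


def jvm_headroom(xmx, max_direct, container_limit):
    """Bytes left in the container after max heap and max direct memory."""
    return parse_size(container_limit) - (parse_size(xmx) + parse_size(max_direct))
```

With `-Xmx4g -XX:MaxDirectMemorySize=3g` in an 8 GB pod, 
`jvm_headroom("4g", "3g", "8g")` leaves 1 GiB of headroom, while `4g` + `4g` in 
the same pod leaves none.
----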
2020-01-23 14:51:34 UTC - Roman Popenov: I would experiment with each to see 
the results, but for simplicity of config, this is what I had tried
----
2020-01-23 14:52:06 UTC - Roman Popenov: Setting Xms = Xmx often led to pod 
evictions
----
2020-01-23 14:52:38 UTC - Roman Popenov: You might be able to play with 
resource quotas, but I didn’t look into that part yet and don’t understand it 
fully
----
2020-01-23 14:53:29 UTC - Guilherme Perinazzo: Changed it to that, still 
crashes with the same error during initialization
----
2020-01-23 14:53:54 UTC - Guilherme Perinazzo: Is there anything else I need to 
change to upgrade a bookie to 2.5.0?
----
2020-01-23 14:54:17 UTC - Roman Popenov: Same exact numbers?
----
2020-01-23 14:54:47 UTC - Guilherme Perinazzo: yeah
----
2020-01-23 14:55:47 UTC - Guilherme Perinazzo: It fails with the same 
allocation size, and shows the same values for used/max every time
----
2020-01-23 14:56:00 UTC - Guilherme Perinazzo: I have no idea where the JVM is 
getting this 2GB limit from
----
2020-01-23 14:57:22 UTC - Guilherme Perinazzo: If I roll it back to 2.4.1, it 
starts normally
----
2020-01-23 14:57:43 UTC - Guilherme Perinazzo: Do I need to upgrade to 2.4.2 
first before 2.5.0?
----
2020-01-23 14:57:48 UTC - Roman Popenov: 2.4.2 was working as well, that’s the 
version I used for testing
----
2020-01-23 14:58:21 UTC - Guilherme Perinazzo: let me try 2.4.2, but I wanted 
to upgrade because there's some things from 2.5 I wanted to use
----
2020-01-23 14:59:10 UTC - Roman Popenov: There are a few issues in 2.5 
unfortunately, mainly some little things with helm charts and a few other minor 
bugs, so I am waiting for kinks to be ironed out before upgrading.
----
2020-01-23 14:59:38 UTC - Guilherme Perinazzo: I'm not using the helm chart, 
I'm upgrading an existing cluster
----
2020-01-23 15:00:11 UTC - Guilherme Perinazzo: 2.4.2 seems to be working
+1 : Roman Popenov
----
2020-01-23 15:00:19 UTC - Roman Popenov: I was just pointing out that despite 
this being a point release and not a major release, some things have changed 
significantly
----
2020-01-23 15:00:34 UTC - Roman Popenov: For the memory issue, I think you 
might need to log a bug
----
2020-01-23 15:00:48 UTC - Guilherme Perinazzo: Yeah, I'll post it on github, 
thanks!
----
2020-01-23 16:17:44 UTC - Miroslav Prymek: @Miroslav Prymek has joined the 
channel
----
2020-01-23 16:24:38 UTC - Miroslav Prymek: Hi! I’m trying to run the Pulsar 
cluster components separately to learn about the architecture and I’ve 
discovered that when running standalone, there’s a bookkeeper table service on 
port 4181, but when I run `bin/pulsar bookie`, it’s not started and I can’t 
find any info about how to start it. (I’m experimenting inside a standard 
docker container `apachepulsar/pulsar:2.5.0`) Could you please point me in the 
right direction? Thanks
----
2020-01-23 16:32:40 UTC - Miroslav Prymek: …also, it seems very unfortunate to 
me that `put_state`  and `incr_counter`  do not raise any error when there’s no 
state storage. `NullStateContext`  just does nothing or returns None 
(<https://github.com/apache/pulsar/blob/master/pulsar-functions/instance/src/main/python/state_context.py#L91>)
I think a function should fail somehow if it tries to do something that can’t 
be done…
----
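The fail-fast behaviour asked for above could look roughly like the sketch 
below. It is not Pulsar's `NullStateContext`: the class, the exception and the 
method signatures are hypothetical and only borrow the `put_state` and 
`incr_counter` names from the message to show the idea of raising instead of 
silently doing nothing.

```python
class StateUnavailableError(RuntimeError):
    """Raised when a function touches state but no state storage exists."""


class FailFastStateContext:
    """Hypothetical stand-in for a no-op state context: every call raises."""

    def put_state(self, key, value):
        # Refuse the write instead of dropping it silently.
        raise StateUnavailableError(
            f"cannot put {key!r}: no state storage configured")

    def incr_counter(self, key, amount):
        # Refuse the increment instead of pretending it succeeded.
        raise StateUnavailableError(
            f"cannot increment {key!r}: no state storage configured")
```

A function calling `FailFastStateContext().put_state("k", "v")` would then fail 
with a `StateUnavailableError` rather than continue with state that was never 
stored.
----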
2020-01-23 17:09:06 UTC - Roman Popenov: Anyone having proxy issues with 2.5.0?
I am seeing:
----
2020-01-23 17:09:06 UTC - Roman Popenov: ```17:03:22.064 
[pulsar-external-web-4-5] INFO  org.eclipse.jetty.server.RequestLog - 10.0.1.96 
- - [23/Jan/2020:17:03:22 +0000] "GET /admin/v2/clusters HTTP/1.1" 403 334 "-" 
"Apache-HttpClient/4.5.5 (Java/1.8.0_212)" 1
17:03:52.087 [pulsar-external-web-4-5] WARN  
org.apache.pulsar.proxy.server.AdminProxyHandler - [10.0.1.96:52792] Failed to 
get next active broker No active broker is available
org.apache.pulsar.broker.PulsarServerException: No active broker is available
        at 
org.apache.pulsar.proxy.server.BrokerDiscoveryProvider.nextBroker(BrokerDiscoveryProvider.java:94)
 ~[org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
        at 
org.apache.pulsar.proxy.server.AdminProxyHandler.rewriteTarget(AdminProxyHandler.java:272)
 [org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
        at org.eclipse.jetty.proxy.ProxyServlet.service(ProxyServlet.java:62) 
[org.eclipse.jetty-jetty-proxy-9.4.20.v20190813.jar:9.4.20.v20190813]
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) 
[javax.servlet-javax.servlet-api-3.1.0.jar:3.1.0]
        at 
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:852) 
[org.eclipse.jetty-jetty-servlet-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:544) 
[org.eclipse.jetty-jetty-servlet-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
 [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1581)
 [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
 [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1307)
 [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
 [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:482) 
[org.eclipse.jetty-jetty-servlet-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1549)
 [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
 [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1204)
 [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) 
[org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)
 [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
 [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:173)
 [org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) 
[org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at org.eclipse.jetty.server.Server.handle(Server.java:494) 
[org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:374) 
[org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:268) 
[org.eclipse.jetty-jetty-server-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
 [org.eclipse.jetty-jetty-io-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
 [org.eclipse.jetty-jetty-io-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
 [org.eclipse.jetty-jetty-io-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
 [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
 [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
 [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
 [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:367)
 [org.eclipse.jetty-jetty-util-9.4.20.v20190813.jar:9.4.20.v20190813]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_232]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_232]
        at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 [io.netty-netty-common-4.1.43.Final.jar:4.1.43.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
17:03:52.088 [pulsar-external-web-4-5] INFO  
org.eclipse.jetty.server.RequestLog - 10.0.1.96 - - [23/Jan/2020:17:03:52 
+0000] "GET /admin/v2/clusters HTTP/1.1" 403 334 "-" "Apache-HttpClient/4.5.5 
(Java/1.8.0_212)" 1```
----
2020-01-23 17:17:27 UTC - Lukas Chripko: @Roman Popenov hi Roman, does the 
error persist after a proxy restart? I had a similar issue; it turned out to be 
this one: <https://github.com/apache/pulsar/issues/5994>
----
2020-01-23 17:20:49 UTC - Roman Popenov: Let me try kicking the proxy pod and 
see if that resolves the issue
----
2020-01-23 17:47:22 UTC - Guilherme Perinazzo: @Roman Popenov Setting 
dbStorage_readAheadCacheMaxSizeMb and dbStorage_writeCacheMaxSizeMb to 512 each 
allowed it to initialize, and setting them to 2048 (like in the default helm 
chart) makes it crash saying that the cache size is bigger than the max direct 
memory (even after setting `-XX:MaxDirectMemorySize` to 14g).
Maybe something is wrong with how netty is evaluating the max direct memory 
size? This is the error I'm hitting: 
<https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/DbLedgerStorage.java#L114>
----
2020-01-23 17:51:33 UTC - Roman Popenov: You can potentially see the 
PlatformDependent.maxDirectMemory() number
----
2020-01-23 17:51:48 UTC - Roman Popenov: ```log.info("Started Db Ledger Storage");
        log.info(" - Number of directories: {}", numberOfDirs);
        log.info(" - Write cache size: {} MB", writeCacheMaxSize / MB);
        log.info(" - Read Cache: {} MB", readCacheMaxSize / MB);```
----
2020-01-23 17:51:52 UTC - Roman Popenov: Did you look for those?
----
2020-01-23 17:54:41 UTC - Guilherme Perinazzo: ```17:42:51.895 [main] INFO  
org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage - Started Db Ledger 
Storage
17:42:51.895 [main] INFO  
org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage -  - Number of 
directories: 1
17:42:51.895 [main] INFO  
org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage -  - Write cache size: 
512 MB
17:42:51.896 [main] INFO  
org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage -  - Read Cache: 512 
MB```
----
2020-01-23 17:55:07 UTC - Guilherme Perinazzo: But that only gives me the size 
of the cache that I changed, it doesn't tell me what it's evaluating as 
maxDirectMemory
----
2020-01-23 17:55:54 UTC - Guilherme Perinazzo: looks like it may be picking up 
2gb as maxDirectMemory (from the OutOfDirectMemory exception that claims max to 
be 2gb)
----
2020-01-23 17:55:56 UTC - Roman Popenov: Can we guesstimate from:
``` private static final long DEFAULT_WRITE_CACHE_MAX_SIZE_MB = (long) (0.25 * PlatformDependent.maxDirectMemory())
            / MB;
    private static final long DEFAULT_READ_CACHE_MAX_SIZE_MB = (long) (0.25 * PlatformDependent.maxDirectMemory())
            / MB;```
?
----
2020-01-23 17:56:34 UTC - Guilherme Perinazzo: Let me remove the config and 
check
----
2020-01-23 17:56:36 UTC - Roman Popenov: Then those numbers are used to 
calculate the read and write cache sizes
----
2020-01-23 17:57:13 UTC - Guilherme Perinazzo: I love the fact that I'm just 
playing around with this bookie and the rest of the cluster is still going 
strong
muscle : Roman Popenov, Karthik Ramasamy
stuck_out_tongue_closed_eyes : Roman Popenov, Karthik Ramasamy
----
2020-01-23 17:58:59 UTC - Guilherme Perinazzo: Yep, it defaults to 512mb, so 
it's seeing max direct memory as 2gb
----
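The inference above follows from inverting the default formula quoted earlier 
(cache size = 0.25 * max direct memory). A small worked check, assuming the 0.25 
fraction from that snippet; the helper name is hypothetical:

```python
MB = 1024 * 1024


def implied_max_direct_memory(default_cache_mb, fraction=0.25):
    """Invert default_cache_mb = fraction * max_direct / MB; result in bytes."""
    return int(default_cache_mb * MB / fraction)
```

`implied_max_direct_memory(512)` gives 2147483648 bytes, the same `max` value 
reported in the `OutOfDirectMemoryError`, which is 2 GiB. The failed allocation 
of 16777216 bytes is 16 MiB.
----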
2020-01-23 17:59:33 UTC - Guilherme Perinazzo: I have it set to 4gb right now, 
but my previous test was 14gb and it still failed
----
2020-01-23 17:59:55 UTC - Guilherme Perinazzo: Any idea what could make it 
ignore the setting?
----
2020-01-23 20:19:16 UTC - Guilherme Perinazzo: @Matteo Merli thanks for the 
quick reply on the issue, that PR was indeed the problem. I don't know if it's 
necessary to revert it; I just need to set an extra environment variable for it 
to work. A note in the upgrade guide and an update to the default YAMLs for 
kubernetes to include it should be enough.
----
2020-01-24 00:28:12 UTC - Siva: @Siva has joined the channel
----
2020-01-24 06:11:15 UTC - Atri Sharma: @Atri Sharma has joined the channel
----
