2019-06-05 16:06:20 UTC - Joe: @Joe has joined the channel
----
2019-06-05 16:11:35 UTC - Joe: Hello
I just browsed through the code base to understand how Pulsar handles evolving
JsonSchema. This is the function (in class `JsonSchemaCompatibilityCheck`):
```
@Override
public boolean isCompatible(SchemaData from, SchemaData to,
                            SchemaCompatibilityStrategy strategy) {
    if (isAvroSchema(from)) {
        if (isAvroSchema(to)) {
            // if both producer and broker have the schema in avro format
            return super.isCompatible(from, to, strategy);
        } else if (isJsonSchema(to)) {
            // if broker have the schema in avro format but producer sent a schema in the old json format
            // allow old schema format for backwards compatibility
            return true;
        } else {
            // unknown schema format
            return false;
        }
    } else if (isJsonSchema(from)) {
        if (isAvroSchema(to)) {
            // if broker have the schema in old json format but producer sent a schema in the avro format
            // return true and overwrite the old format
            return true;
        } else if (isJsonSchema(to)) {
            // if both producer and broker have the schema in old json format
            return isCompatibleJsonSchema(from, to);
        } else {
            // unknown schema format
            return false;
        }
    } else {
        // broker has schema format with unknown format
        // maybe corrupted?
        // return true to overwrite
        return true;
    }
}
```
The function `isCompatibleJsonSchema` does not seem to take the strategy into
account. Is it correct to say that the Pulsar schema registry supports no
compatibility check other than equality for JsonSchema?
----
2019-06-05 16:35:35 UTC - Sijie Guo: @Joe this piece of code is a bit hard to
understand because of legacy issues. `isAvroSchema` and `isJsonSchema` are all
about how Pulsar stores the schema definition. Originally Pulsar used
`JsonSchema` to store the schema definition for JSON structs; it then moved to
`Avro` for storing the schema definition.
While it was using `JsonSchema` for storing the schema definition, there was no
strategy other than equality for JsonSchema. After Pulsar moved to Avro for
storing the schema definition, it started using the strategies that Avro
provides for compatibility checks.
Hope this clarifies the code there.
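A minimal sketch of the branch structure discussed above, to make it easier to see which pair of stored formats actually consults the strategy. The class, enum, and method names here are hypothetical and are not Pulsar's real types; `EQUALITY_ONLY` reflects what is said in this thread about the legacy JSON path, not an inspection of `isCompatibleJsonSchema` itself.

```java
/** Simplified, hypothetical model of the quoted branches; not Pulsar's real classes. */
final class CompatibilityBranches {
    /** How the schema definition is stored, not how message payloads are encoded. */
    enum StoredFormat { AVRO, LEGACY_JSON, UNKNOWN }

    /** What the check does for a given pair of stored formats. */
    enum Outcome { USES_STRATEGY, ALWAYS_ACCEPTED, EQUALITY_ONLY, REJECTED }

    /** 'from' is the definition held by the broker, 'to' is the one the producer sent. */
    static Outcome outcomeFor(StoredFormat from, StoredFormat to) {
        switch (from) {
            case AVRO:
                if (to == StoredFormat.AVRO) return Outcome.USES_STRATEGY;
                if (to == StoredFormat.LEGACY_JSON) return Outcome.ALWAYS_ACCEPTED;
                return Outcome.REJECTED;
            case LEGACY_JSON:
                if (to == StoredFormat.AVRO) return Outcome.ALWAYS_ACCEPTED;
                // Assumption taken from the thread: the legacy path only checks equality.
                if (to == StoredFormat.LEGACY_JSON) return Outcome.EQUALITY_ONLY;
                return Outcome.REJECTED;
            default:
                // Stored definition is unreadable: the quoted code lets the new one overwrite it.
                return Outcome.ALWAYS_ACCEPTED;
        }
    }

    public static void main(String[] args) {
        for (StoredFormat from : StoredFormat.values()) {
            for (StoredFormat to : StoredFormat.values()) {
                System.out.println(from + " -> " + to + ": " + outcomeFor(from, to));
            }
        }
    }
}
```

Only the AVRO to AVRO pair reaches the strategy-aware check; every other pair is decided without looking at the configured strategy.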
----
2019-06-05 16:49:51 UTC - Kendall Magesh-Davis: On Kubernetes, we’ve had a
pulsar cluster running for some time and it has been going well. Just today, we
ran into an issue where one of the ZK nodes is stuck in a CrashLoopBackOff.
Details from that pod (0):
```
[conf/pulsar_env.sh] Applying config PULSAR_GC = "-XX:+UseG1GC -XX:MaxGCPauseMillis=10"
[conf/pulsar_env.sh] Applying config PULSAR_MEM = "-Xms3g -Xmx3g -Dcom.sun.management.jmxremote -Djute.maxbuffer=10485760 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+AggressiveOpts -XX:+DoEscapeAnalysis -XX:+DisableExplicitGC -XX:+PerfDisableSharedMem -Dzookeeper.forceSync=no"
Current server id 1
Creating data/zookeeper/myid with id = 1
bin/generate-zookeeper-config.sh: line 54: data/zookeeper/myid: Input/output error
```
From the other ZK pods:
(1)
```
16:45:16.880 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO org.apache.zookeeper.server.NIOServerCnxnFactory - Accepted socket connection from /127.0.0.1:55192
16:45:16.880 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO org.apache.zookeeper.server.NIOServerCnxn - Processing ruok command from /127.0.0.1:55192
16:45:16.880 [Thread-1013369] INFO org.apache.zookeeper.server.NIOServerCnxn - Closed socket connection for client /127.0.0.1:55192 (no session established for client)
16:45:20.562 [ProcessThread(sid:2 cport:-1):] INFO org.apache.zookeeper.server.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x30091960bf10005 type:delete cxid:0x2c01e1 zxid:0x10009d9ea txntype:-1 reqpath:n/a Error Path:/ledgers/00/0001 Error:KeeperErrorCode = Directory not empty for /ledgers/00/0001
```
(2)
```
16:45:39.899 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO org.apache.zookeeper.server.NIOServerCnxnFactory - Accepted socket connection from /127.0.0.1:54932
16:45:39.899 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO org.apache.zookeeper.server.NIOServerCnxn - Processing ruok command from /127.0.0.1:54932
16:45:39.899 [Thread-121786] INFO org.apache.zookeeper.server.NIOServerCnxn - Closed socket connection for client /127.0.0.1:54932 (no session established for client)
```
Any suggestions?
----
2019-06-05 16:50:37 UTC - Matteo Merli: Seems like a problem with the ZK disk
----
2019-06-05 16:51:28 UTC - Matteo Merli: I’d suggest just wiping this ZK
instance and restarting it empty; it will sync up with the rest of the ensemble
----
2019-06-05 16:51:29 UTC - Kendall Magesh-Davis: Worth noting:
```
repository: apachepulsar/pulsar-all
tag: 2.2.1
```
----
2019-06-05 16:52:14 UTC - Kendall Magesh-Davis: It restarted itself like 65
times, then I did a force delete of that specific pod (StatefulSet). It
generated a new one with the same issue.
----
2019-06-05 16:52:39 UTC - Matteo Merli: Yes, but the data disk is mounted from
an EBS volume, right?
----
2019-06-05 16:52:43 UTC - Kendall Magesh-Davis: correct
----
2019-06-05 16:52:55 UTC - Kendall Magesh-Davis: gp2 provisioner
----
2019-06-05 16:53:19 UTC - Matteo Merli: try either cleaning that volume or just
dropping it (so that it will be re-created)
eyes : Kendall Magesh-Davis
----
2019-06-05 16:57:04 UTC - Kendall Magesh-Davis: I will look into that. Thanks
for the tip
----
2019-06-05 17:00:21 UTC - Joe: thanks a lot for your feedback, very much
appreciated
----
2019-06-05 17:02:29 UTC - Sijie Guo: You are welcome
----
2019-06-05 20:52:57 UTC - Sam Leung: Just want to confirm some things:
1. Partitioned topics can add more partitions seamlessly (but no removing
partitions).
2. Non-partitioned topics cannot be turned into partitioned topics.
----
2019-06-05 20:54:30 UTC - Matteo Merli: > 1. Partitioned topics can add more
partitions seamlessly (but no removing partitions).
Correct. It’s seamless unless you need to preserve ordering
> 2. Non-partitioned topics cannot be turned into partitioned topics.
Correct. It’s not possible at this point
----
2019-06-05 20:55:57 UTC - Sam Leung: Right, the partition distribution would
change. Thanks!
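A small sketch of why adding partitions affects per-key ordering, assuming keyed messages are routed by hashing the key modulo the partition count. `String.hashCode()` is used here only as a stand-in (Pulsar's router uses its own hash function), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: stand-in hash routing, not Pulsar's actual message router. */
final class PartitionShiftSketch {
    /** Stand-in routing rule: hash the key, then take it modulo the partition count. */
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Keys whose partition differs between the old and the new partition count. */
    static List<String> movedKeys(List<String> keys, int oldCount, int newCount) {
        List<String> moved = new ArrayList<>();
        for (String key : keys) {
            if (partitionFor(key, oldCount) != partitionFor(key, newCount)) {
                moved.add(key);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            keys.add("key-" + i);
        }
        List<String> moved = movedKeys(keys, 4, 5);
        System.out.println(moved.size() + " of " + keys.size()
                + " keys change partition when going from 4 to 5 partitions: " + moved);
    }
}
```

A key that moves has its newer messages on a different partition from its older ones, so a consumer can no longer rely on seeing that key's messages in publish order across the change.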
----
2019-06-05 21:11:51 UTC - Addison Higham: hrm, so I wanted to understand
something better. With s3-offload, it becomes tempting to keep many topics
around for a long time (like just turn off trim completely). I understand how
that works with the broker and bookies making that happen, but those ledgers
still have metadata in ZK. Are there any ZK scaling issues if you keep around a
ton of ledgers? I know it uses the hierarchical ledger storage to perhaps make
that more scalable, but I can imagine that if you have really short-lived
segments and have 0 trim, that must still get difficult at some point...
----
2019-06-05 21:18:54 UTC - Devin G. Bost: @Jerry Peng I figured out that I
hadn't set up my Spring context correctly, so it was running the wrong method.
I got a little further. It gets all the way down to `pulsar.start()`, which
blows up with this:
```
java.lang.NoSuchMethodError: org.apache.pulsar.common.util.ObjectMapperFactory.create()Lcom/fasterxml/jackson/databind/ObjectMapper;
    at org.apache.pulsar.broker.cache.ResourceQuotaCache.<init>(ResourceQuotaCache.java:47)
    at org.apache.pulsar.broker.cache.LocalZooKeeperCacheService.<init>(LocalZooKeeperCacheService.java:122)
    at org.apache.pulsar.broker.PulsarService.startZkCacheService(PulsarService.java:554)
    at org.apache.pulsar.broker.PulsarService.start(PulsarService.java:348)
    at com.overstock.dataeng.pulsar.deployment.test.TopologyTests.test_zookeeper_locally(TopologyTests.java:443)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    . . .
```
----
2019-06-05 21:24:12 UTC - Devin G. Bost: Could it be an issue with my
dependencies?
----
2019-06-05 21:24:37 UTC - Devin G. Bost: I'm using fasterxml elsewhere, and I
think it's a different version than what Pulsar is using.
----
2019-06-05 21:50:59 UTC - Jerry Peng: the issue is this method:
org.apache.pulsar.common.util.ObjectMapperFactory.create()Lcom/fasterxml/jackson/databind/ObjectMapper
----
2019-06-05 21:51:19 UTC - Jerry Peng: are you using a shaded version of any
pulsar modules?
----
2019-06-05 21:51:27 UTC - Jerry Peng: like pulsar-client or pulsar-client-admin?
----
2019-06-05 21:52:03 UTC - Jerry Peng: if you are, use pulsar-client-original or
pulsar-client-admin-original, which are not shaded
----
2019-06-05 22:41:37 UTC - Matteo Merli: Correct, of course there are limits,
but in practice they would be very high. Also, you can control the ledger
rollover periods and max sizes, in order to have fewer (and bigger) ledgers
2019-06-05 23:03:38 UTC - Devin G. Bost: Here are the dependencies in my POM:
----
2019-06-05 23:04:19 UTC - Devin G. Bost:
----
2019-06-05 23:05:47 UTC - Devin G. Bost: `pulsar-client` and
`pulsar-client-admin` are in there. Are you saying that I should use:
`pulsar-client-original` and `pulsar-client-admin-original` instead?
----
2019-06-05 23:08:30 UTC - Devin G. Bost: @Jerry Peng It sounds like you're
saying that I should be depending on modules that are not shaded. Is that right?
----
2019-06-05 23:10:09 UTC - Jerry Peng: yup your pulsar-client and
pulsar-client-admin should be changed to pulsar-client-original and
pulsar-client-admin-original
----
2019-06-05 23:10:17 UTC - Jerry Peng: respectively
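A minimal diagnostic sketch for this kind of `NoSuchMethodError`, using only JDK reflection: it prints which jar or directory a class was loaded from and the signatures of its public methods with a given name. With a shaded artifact on the classpath, the return type printed for `create` would presumably sit in a relocated package rather than `com.fasterxml.jackson.databind`, but that is an assumption to verify against your own classpath. The helper class and its method names are hypothetical.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: shows where a class comes from and what a method really returns. */
final class ClassOriginCheck {
    /** Location (jar or directory URL) the class was loaded from, if the JVM exposes it. */
    static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** Full signatures (including return type) of all public methods with the given name. */
    static List<String> signaturesOf(Class<?> type, String methodName) {
        List<String> signatures = new ArrayList<>();
        for (Method method : type.getMethods()) {
            if (method.getName().equals(methodName)) {
                signatures.add(method.toString());
            }
        }
        return signatures;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults are JDK names so the sketch runs anywhere; pass your own class and method.
        String className = args.length > 0 ? args[0] : "java.lang.String";
        String methodName = args.length > 1 ? args[1] : "valueOf";
        Class<?> type = Class.forName(className);
        System.out.println(className + " loaded from " + originOf(type));
        for (String signature : signaturesOf(type, methodName)) {
            System.out.println("  " + signature);
        }
    }
}
```

Run with the same classpath as the failing test, it could be invoked with `org.apache.pulsar.common.util.ObjectMapperFactory create` as arguments to see which artifact supplies that class and what `create()` returns there.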
----