2020-01-14 09:31:24 UTC - Brian Doran: Hi.. just wanted to get some opinion on our recent findings and whether they are something that would have been expected.
We have been doing some performance testing over the last while and we were creating our Producers as Avro Producers and sending org.apache.pulsar.client.api.schema.GenericRecord to Pulsar. We were not getting anything near the throughput numbers we were expecting/hoping for, peaking at 266K records/s across 3 brokers. We switched out Avro for JSON byte arrays and the throughput shot up to ~955K records/s across 3 brokers. The same data is used for both tests. Is this difference in throughput something that you would expect between the two types? Any comments appreciated. Regards, Brian ---- 2020-01-14 11:02:23 UTC - Wayne Robinson: @Wayne Robinson has joined the channel ---- 2020-01-14 13:50:46 UTC - Roman Popenov: Quick question: can Pulsar Functions read messages in bulk and merge them? ---- 2020-01-14 14:10:36 UTC - Thomas: @Thomas has joined the channel ---- 2020-01-14 14:32:43 UTC - Roman Popenov: I see that there is a WindowFunction API ---- 2020-01-14 14:42:30 UTC - Roman Popenov: <https://github.com/streamnative/pulsar/issues/520> ---- 2020-01-14 15:50:02 UTC - Sijie Guo: On the wire, it reads messages in bulk, but the messages are passed one by one to the function. A window function is used for processing a batch of messages. ---- 2020-01-14 15:50:46 UTC - Sijie Guo: Interesting. Have you observed any difference in CPU usage? ---- 2020-01-14 15:51:03 UTC - Roman Popenov: Perfect, that's exactly what I need :smile: ---- 2020-01-14 16:11:37 UTC - Joe Francis: What are the broker-side stats vs. the client-side stats? Did you test with schemaValidation disabled? ---- 2020-01-14 16:47:06 UTC - Roman Popenov: How detailed is the data provenance in Pulsar? ---- 2020-01-14 16:53:46 UTC - Sergii Zhevzhyk: Is it possible to define compression and/or batching in source connectors? ---- 2020-01-14 17:09:04 UTC - Jordan Widdison: @Addison Higham thanks for getting back to me! Yeah, we had started to brainstorm a few different strategies like the ones you identified here, but just wanted to make sure I wasn't crazy and that `nack` really wasn't a thing. ---- 2020-01-14 17:10:24 UTC - Jordan Widdison: It's really interesting to me that it's spelled out in the documentation in exactly the way I'd expect it to exist, but then is not actually implemented. ---- 2020-01-14 17:27:15 UTC - Brian Doran: Schema validation is off by default and we haven't turned it on. ---- 2020-01-14 17:29:13 UTC - Brian Doran: we create newProducer(Schema)... ---- 2020-01-14 17:47:59 UTC - Naby: :+1: ---- 2020-01-14 17:56:35 UTC - Naby: I created an InfluxDB sink connector using pulsar-admin, but its status shows that it's not running. I checked the log: it had failed to load a new schema. InfluxDB is a schemaless database. How come its connector needs a schema? ---- 2020-01-14 18:41:27 UTC - Naby: Also, is it enough to define a schema and pass it when a Pulsar producer is created, or should it also be uploaded to the topic using pulsar-admin before creating a built-in connector? ---- 2020-01-14 19:15:49 UTC - Naby: I've done both and I am still getting an error (“error” : “UNAVAILABLE: io exception”). Any suggestions? ---- 2020-01-14 19:26:50 UTC - Naby: How about BytesSchema? How can I specify it for the InfluxDB connector? ---- 2020-01-14 20:08:39 UTC - Joshua Dunham: Hey everyone, can someone point me to a best-practices guide on data formats for Pulsar ingestion?
I'm thinking about Snappy-compressed Avro but want to make sure that I balance compression at the source (helping transport) against decompression at bookie storage (hurting CPU). ---- 2020-01-14 20:22:29 UTC - David Kjerrumgaard: @Naby What is the command you used to create the InfluxDB sink connector? ---- 2020-01-14 20:48:46 UTC - Ryan Samo: Hey guys, for a single topic, is there a limit to the number of subscriptions that can attach and consume messages? ---- 2020-01-14 20:52:38 UTC - Ryan Samo: A limit on the number of subscription names, I mean ---- 2020-01-14 20:56:01 UTC - Naby: bin/pulsar-admin sinks create --tenant public --namespace default --name influxdb-test-sink --sink-type influxdb --sink-config-file ./connectors/io_influxdb_sink.yaml --inputs test-topics ---- 2020-01-14 21:20:22 UTC - Patrick Chase: @Patrick Chase has joined the channel ---- 2020-01-14 21:53:27 UTC - David Kjerrumgaard: @Ryan Samo Not to my knowledge ---- 2020-01-14 21:54:13 UTC - David Kjerrumgaard: Can you share the error from the log that mentions the schema? ---- 2020-01-14 21:54:14 UTC - Ryan Samo: @David Kjerrumgaard OK thanks, I couldn't find one in the docs or codebase. I appreciate it ---- 2020-01-14 22:10:21 UTC - Wayne Robinson: Hi everyone, I'm pretty new to Pulsar so I apologize in advance for my potentially newbie questions. ---- 2020-01-14 22:13:48 UTC - Wayne Robinson: I'm trying to spin up a config for a redundant but relatively small (sub 2,000 messages per second) cluster. Is it OK to co-locate the brokers with the bookies and ZooKeeper? E.g. 3 nodes, each running ZooKeeper+BookKeeper+broker? Is there much downside to this if, say, we want to do a zero-downtime migration to a larger, separated cluster later? ---- 2020-01-14 22:21:51 UTC - Chris Bartholomew: @Wayne Robinson you can easily move the brokers later, because they are stateless. ZooKeeper and BookKeeper need to store state, so they are harder to move around. If you are using Kubernetes with persistent volume claims (PVC) for storage, that is straightforward. If you are on bare metal, then you will have to migrate some data around. The easiest would be the ZooKeeper data. ---- 2020-01-14 22:21:52 UTC - David Kjerrumgaard: @Wayne Robinson Do you intend for this to be a production cluster? ---- 2020-01-14 22:23:24 UTC - Wayne Robinson: It will be on EC2. Was planning either EBS-backed or NVMe. For moving bookies later (on NVMe), would it be a matter of just spinning up the new ones and draining the old? ---- 2020-01-14 22:23:45 UTC - Wayne Robinson: Yes. Would be for production. ---- 2020-01-14 22:32:54 UTC - Chris Bartholomew: If you store the ZooKeeper data on separate EBS volumes, then it will be easier to migrate them to other nodes later. I would keep BookKeeper on NVMe-backed EC2. If you need more storage, you can add BookKeeper nodes. If you want to get rid of the original BookKeeper nodes, yes, you should be able to add new ones and drain the old ones off. It might take a while, though. ---- 2020-01-14 22:36:06 UTC - Wayne Robinson: Thanks. :-) Regarding BookKeeper: are there any docs on how to back up each node? Can I just copy from its storage locations when backing up, without doing anything to freeze its state? For all its guarantees about redundancy, I can't see anything talking about the worst happening and needing to restore from backup. ---- 2020-01-14 22:36:49 UTC - David Kjerrumgaard: The one issue you run into when co-locating all the services is the cluster's ability to sustain a single EC2 node outage.
You would lose your ZK quorum if that were to happen in this unlikely scenario ---- 2020-01-14 22:37:56 UTC - Wayne Robinson: But wouldn't that be the case with just 3 ZK nodes anyway if they were running separately? ---- 2020-01-14 22:41:22 UTC - Wayne Robinson: And doesn't ZK keep its quorum when 2 out of 3 nodes are still up? ---- 2020-01-14 22:41:29 UTC - Chris Bartholomew: As for backing up BookKeeper: assuming you have configured message replication > 1, the cluster has multiple copies of each message. In the event of failure, BookKeeper will automatically recover the lost data from the other nodes in the cluster. ---- 2020-01-14 22:41:53 UTC - David Kjerrumgaard: Yes, it is just more likely to happen with other services running on the same nodes. It is best to isolate the processes if possible, e.g. with K8s. ---- 2020-01-14 22:42:18 UTC - David Kjerrumgaard: Maybe use AWS EKS and the open-source Helm chart to mitigate some of these issues. ---- 2020-01-14 22:43:08 UTC - Wayne Robinson: But what if all the BK nodes disappear? Or all the data on them? Whilst I'm pretty good these days, I can point to a lot of human-error situations (both manual mistakes and software bugs) that effectively wipe out an entire cluster. ---- 2020-01-14 22:44:02 UTC - Chris Bartholomew: Right, you can't survive all BK nodes failing ---- 2020-01-14 22:44:21 UTC - Chris Bartholomew: I would do periodic snapshots of the EBS volumes to cover that case ---- 2020-01-14 22:45:24 UTC - Wayne Robinson: But if we're using NVMe storage we can't snapshot. Happy to schedule BK backups, but I'm not really sure of the process and the manual seems silent on it. ---- 2020-01-14 22:46:38 UTC - Chris Bartholomew: I agree with David, using Kubernetes (EKS) would make things easier. ---- 2020-01-14 22:47:25 UTC - Wayne Robinson: One of the reasons we want to control the number of instances is the HA requirement to support multiple regions. So ultimately we'd end up with 6 copies of everything anyway. We could probably tighten up the admin portion of changing each region separately, but we still may have software bugs that delete everything. :-) ---- 2020-01-14 22:49:41 UTC - Wayne Robinson: We'd probably use ECS (as it's the thing we have the most familiarity with). And I guess at less than 2,000 4KB messages a second we wouldn't hit bandwidth limits using EBS as the backend store. And then we'd migrate to a more stand-alone solution as we approach the IO limits of the cluster. ---- 2020-01-14 22:50:28 UTC - David Kjerrumgaard: @Wayne Robinson BookKeeper is a distributed storage layer, so it is assumed that the replication within BK itself eliminates the need for backups. Similar to HDFS, which doesn't have this either. The best you can do is back up the underlying data volumes, but it doesn't really help much. ---- 2020-01-14 22:51:12 UTC - Wayne Robinson: Backups are rarely about HA; they're about disaster recovery. Just like RAID is not a backup, so too is an HA storage cluster not a backup. ---- 2020-01-14 22:54:03 UTC - David Kjerrumgaard: Right, so a D/R scenario where you lost all of your BK nodes at the same time is highly unlikely. And Pulsar itself will stop working if it cannot persist the required number of copies of the data. This would cause producers to stop sending messages, so message loss is effectively minimized in this scenario. ---- 2020-01-14 22:55:44 UTC - David Kjerrumgaard: Now if someone is able to bring down all of your BK nodes in a single command, then that data is lost.
But so would any data that hadn't been backed up yet. So even if you snapshot every hour, you could still lose 59 minutes' worth of data anyway ---- 2020-01-14 22:57:35 UTC - Wayne Robinson: That's all well and good until someone or something fat-fingers something like `pulsar-admin persistent delete --force persistent://tenant/ns1/tp1` 😅 ---- 2020-01-14 22:58:08 UTC - Wayne Robinson: Losing 59 minutes is better than 12 months. :-) ---- 2020-01-14 22:59:07 UTC - Wayne Robinson: We might look at a solution that subscribes separately to all topics and maintains a continuous write-only stream or similar to handle our DR case. ---- 2020-01-14 22:59:37 UTC - Chris Bartholomew: For DR, you can geo-replicate to a remote cluster ---- 2020-01-14 22:59:55 UTC - Wayne Robinson: No wait… spending time on that is much more expensive and complicated than just setting up EBS to snapshot every minute and taking the performance penalty of using EBS over NVMe. ---- 2020-01-14 23:01:01 UTC - Chris Bartholomew: EBS volumes have 99.999% availability, so it's pretty hard to lose one unless you delete it. And if you delete it, that's what snapshots are for. ---- 2020-01-14 23:01:21 UTC - Wayne Robinson: @Chris Bartholomew but that still doesn't stop software from doing a delete similar to the above. I'm not concerned about maliciousness, but I've written enough code and managed enough systems to see things like that legitimately happen to production systems by accident. ---- 2020-01-14 23:02:23 UTC - Wayne Robinson: I'm not really concerned about the reliability of the hardware and the AWS platform as a whole. Just my own human frailties. :-) ---- 2020-01-14 23:03:38 UTC - Chris Bartholomew: Well, snapshots should help :slightly_smiling_face: ---- 2020-01-14 23:04:12 UTC - Chris Bartholomew: Unless you delete the snapshots too :slightly_smiling_face: ---- 2020-01-14 23:06:16 UTC - David Kjerrumgaard: @Wayne Robinson One thing to bear in mind is that Pulsar isn't intended to be used as a long-term storage platform. Messages that are 12 months old will have been deleted due to message retention policies, etc. ---- 2020-01-14 23:06:41 UTC - Wayne Robinson: I thought Pulsar absolutely was designed as a permanent storage platform. ---- 2020-01-14 23:06:49 UTC - David Kjerrumgaard: It is intended to only keep messages around long enough for all subscribers to consume them :smiley: ---- 2020-01-14 23:07:47 UTC - David Kjerrumgaard: It can be, but you have to utilize tiered storage for that, which is NOT the default configuration. Pulsar is first and foremost a messaging platform. ---- 2020-01-14 23:07:51 UTC - Wayne Robinson: That's actually the primary reason why we've been looking at it as a solution. ---- 2020-01-14 23:08:04 UTC - Wayne Robinson: Including tiered storage. ---- 2020-01-14 23:08:05 UTC - Wayne Robinson: Yes. ---- 2020-01-14 23:08:12 UTC - David Kjerrumgaard: Is your use case event-storage based? ---- 2020-01-14 23:08:48 UTC - Wayne Robinson: It's primarily queue based, with the requirement that a user can potentially rewind that queue and reprocess for an arbitrary length of time. ---- 2020-01-14 23:09:27 UTC - Wayne Robinson: And the ability to look up old messages. ---- 2020-01-14 23:09:44 UTC - Wayne Robinson: Both historical access patterns are very low usage. ---- 2020-01-14 23:09:52 UTC - David Kjerrumgaard: Gotcha, then you will need to configure tiered storage and retention policies on the topics to support this behavior.
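(For reference, a minimal sketch of that tiered-storage and retention setup. The bucket, region, namespace, and topic names below are placeholders, and the broker.conf property names are the S3 offload settings as documented for 2.x-era releases; double-check them against your version.)
```
# broker.conf: enable an offload driver, e.g. S3 (placeholder bucket/region)
managedLedgerOffloadDriver=aws-s3
s3ManagedLedgerOffloadBucket=my-offload-bucket
s3ManagedLedgerOffloadRegion=us-west-2

# Retain acknowledged messages on the namespace (-1/-1 = unlimited size/time)
bin/pulsar-admin namespaces set-retention public/default --size -1 --time -1

# Offload to tiered storage automatically once a topic's BookKeeper data exceeds 10G
bin/pulsar-admin namespaces set-offload-threshold --size 10G public/default

# Or trigger an offload manually for a single topic
bin/pulsar-admin topics offload --size-threshold 10M persistent://public/default/my-topic
```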
---- 2020-01-14 23:09:57 UTC - Wayne Robinson: But it would be the primary source of truth for these messages. ---- 2020-01-14 23:10:22 UTC - Wayne Robinson: No problem. :-) ---- 2020-01-14 23:11:51 UTC - David Kjerrumgaard: Ok, that is not an uncommon use case, just not the one supported "out-of-the-box". By default, Pulsar keeps 10GB of data per topic until it starts deleting older messages. :smiley: ---- 2020-01-14 23:12:11 UTC - Wayne Robinson: It would be cool if tiered storage worked by just constantly streaming the logs up each time they get closed off, and then deleting them after some timeout, rather than offloading manually. But it's still a cool feature that seems like it will save us some effort. ---- 2020-01-14 23:13:08 UTC - David Kjerrumgaard: That might be a good feature request to add: more configurable policies around triggering tiered-storage offloads ---- 2020-01-14 23:13:19 UTC - Wayne Robinson: Or just literally use the multipart upload feature of S3 to stream data up in batches of 5MB over time. ---- 2020-01-14 23:13:45 UTC - Wayne Robinson: Closing off each upload when the journal file gets closed off. thinking_face : David Kjerrumgaard ---- 2020-01-14 23:21:16 UTC - Wayne Robinson: 2GB per journal as a default size seems a little high. I'm not sure if the performance of 100 2GB files would be much different from 2,000 100MB files. But in either case, streaming up 5MB at a time as the file grows would even out the bandwidth requirements of tiering the storage, and then you just close off the S3 multipart upload when you close off the journal in BK. ---- 2020-01-14 23:22:07 UTC - Wayne Robinson: The next stage is working out how to store only a single copy of the data with tiered storage, rather than one per BK node… but that feels like a much, much harder problem to solve with how BK works. ---- 2020-01-14 23:23:16 UTC - David Kjerrumgaard: A journal would have the contents of several topics interleaved together, so you could lose track of which topic the messages belonged to, etc. ---- 2020-01-14 23:23:36 UTC - Wayne Robinson: Yup. Much harder. :-) ---- 2020-01-14 23:24:49 UTC - David Kjerrumgaard: That's why the tiered storage works at the ledger level. It retains that info and makes the storage interface the same regardless of the underlying storage mechanism; a ledger is a ledger no matter where it lives. ---- 2020-01-15 00:53:59 UTC - George Hafiz: @George Hafiz has joined the channel ---- 2020-01-15 07:42:41 UTC - Fernando: Is anyone familiar with connecting an external Presto cluster to Pulsar? The instructions are not very clear: ```Query Pulsar from Existing Presto Cluster If you already have an existing Presto cluster, you can copy Presto Pulsar connector plugin to your existing cluster. You can download the archived plugin package via: $ wget <https://archive.apache.org/dist/pulsar/pulsar-2.4.2/apache-pulsar-2.4.2-bin.tar.gz>``` Do I copy this file to the Presto cluster? If so, where? ----
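(For anyone hitting the same question, a rough sketch of the usual approach, assuming the standard Presto layout where connector plugins live under `plugin/` on every node and catalogs under `etc/catalog/`. The path of the bundled Pulsar plugin inside the tarball and the catalog property names below are assumptions; verify them against the docs for your Pulsar and Presto versions.)
```
# Unpack the Pulsar distribution; it bundles the Presto Pulsar connector plugin
tar xzf apache-pulsar-2.4.2-bin.tar.gz

# Copy the plugin directory onto every node of the existing Presto cluster
# (the lib/presto/plugin/pulsar-presto-connector path is an assumption -- check your tarball)
cp -r apache-pulsar-2.4.2/lib/presto/plugin/pulsar-presto-connector /path/to/presto/plugin/

# On each node, add a catalog file, e.g. /path/to/presto/etc/catalog/pulsar.properties,
# with at least (property names assumed from the bundled conf/presto/catalog/pulsar.properties):
#   connector.name=pulsar
#   pulsar.broker-service-url=http://<broker-host>:8080
#   pulsar.zookeeper-uri=<zookeeper-host>:2181

# Restart the Presto nodes, then verify:
#   presto> SHOW SCHEMAS FROM pulsar;
```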
