2018-07-03 14:07:23 UTC - Idan: hi guys
----
2018-07-03 14:07:26 UTC - Idan: I'm getting this:
----
2018-07-03 14:07:27 UTC - Idan: Error init producer for topic MyQueue error 
java.lang.RuntimeException: Error creating producer Namespace missing local 
cluster name in clusters list: local_cluster=pulasr-pulsar-cluster 
ns=public/default clusters=[pulsar-pulsar]
----
2018-07-03 14:07:44 UTC - Idan: I guess I'm using a different namespace. How can I modify the Java client to use a different namespace?
----
2018-07-03 14:21:44 UTC - Idan: Where do I set this in my Java client producer's code?
----
2018-07-03 14:21:44 UTC - Idan: this:
----
2018-07-03 14:21:46 UTC - Idan: 
<persistent://pulsar-pulsar-cluster/public/default>
----
2018-07-03 14:21:49 UTC - Idan: Not sure how I set this via the API
----
2018-07-03 14:53:27 UTC - Idan: OK, found out how. I had to set it in the topic name input
----
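For reference, a minimal sketch of the producer side in the Java client, assuming Pulsar's `persistent://<tenant>/<namespace>/<topic>` naming with placeholder segments (legacy deployments may also include the cluster name as an extra segment); the namespace is not configured on the client itself, it is part of the topic string passed to the builder:
```
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;

PulsarClient client = PulsarClient.builder()
        .serviceUrl("pulsar://localhost:6650")   // assumption: replace with your broker URL
        .build();

// The namespace is taken from the fully qualified topic name.
Producer<byte[]> producer = client.newProducer()
        .topic("persistent://public/default/MyQueue")   // placeholder tenant/namespace
        .create();
```
----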
2018-07-03 15:40:43 UTC - Idan: Hi guys, we created a very basic Pulsar cluster. Just as a sanity check we sent and consumed a few bets (around 10) and the latency statistics look pretty high.. any idea?
----
2018-07-03 15:40:49 UTC - Idan: 2018-07-03 15:37:52,162 INFO  
org.apache.pulsar.client.impl.ProducerStatsRecorderImpl - 
[<persistent://pulsar-pulsar-cluster/public/default/Queue>] 
[pulasr-pulsar-cluster-0-3] Pending messages: 0 --- Publish throughput: 0.33 
msg/s --- 0.00 Mbit/s --- Latency: med: 7.342 ms - 95pct: 31.354 ms - 99pct: 
31.354 ms - 99.9pct: 31.354 ms - max: 31.354 ms --- Ack received rate: 0.33 
ack/s --- Failed messages: 0
----
2018-07-03 15:40:55 UTC - Idan: 31 ms
----
2018-07-03 15:44:49 UTC - Rasty Turek: @Rasty Turek has joined the channel
----
2018-07-03 15:57:17 UTC - Sijie Guo: @Idan it depends on your disk 
characteristics. What type of disks do you have? 
----
2018-07-03 16:01:38 UTC - Idan: I'll get that from my infra guy
----
2018-07-03 16:01:48 UTC - Idan: What exactly do you need to know regarding disk characteristics?
----
2018-07-03 16:01:53 UTC - Idan: e.g. SSD?
----
2018-07-03 16:05:05 UTC - Sijie Guo: Pulsar does fsync by default, so your latency really depends on how fast your disk can do fsync. If you have those metrics, that would be good. If you don't, knowing what type of disks you have helps as well.
----
2018-07-03 16:05:58 UTC - Daniel Ferreira Jorge: Hi, doesn't 
`subscriptionInitialPosition` (pull #1397) work with `topicsPattern`? (java)
----
2018-07-03 16:09:11 UTC - Idan: @Sijie Guo I'll get this data and come back with results
----
2018-07-03 16:14:44 UTC - Idan: @Sijie Guo that's gp2
----
2018-07-03 16:16:21 UTC - Idan: we are using AWS SSD
----
2018-07-03 16:16:22 UTC - Idan: Volume Type: General Purpose SSD (gp2)*, Provisioned IOPS SSD (io1), Throughput Optimized HDD (st1), Cold HDD (sc1)
----
2018-07-03 16:33:27 UTC - Matteo Merli: @Idan is that an EBS?
----
2018-07-03 16:38:28 UTC - Matteo Merli: To have low latency (with fsync on) you should preferably be using a locally attached SSD (in AWS). That will give you an fsync() latency of 0.5 ms at 99pct. The only problem in AWS is that the SSDs are not very good at fsync workloads, so the latency will be a bit bumpy (when the SSD is performing its own GC cycle).

You have a couple of options to improve latency: 
 * Write more copies of data — eg: write 3 and wait for 3 acks — That will 
prune out the slowest storage node on each write request 
 * Disable fsync in bookies (`journalSyncData=false` in `bookkeeper.conf`)
----
2018-07-03 16:50:36 UTC - Idan: I can't write 3 and wait for 3 acks, as this will increase the overall latency per message
----
2018-07-03 16:51:06 UTC - Matteo Merli: sorry, that was a typo 
:slightly_smiling_face:
----
2018-07-03 16:51:06 UTC - Idan: How would you describe 'bumpy'? An avg of 10 ms is fair enough
----
2018-07-03 16:51:12 UTC - Matteo Merli: eg: write 3 and wait for *2* acks
----
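A hedged sketch of that first option (write 3 copies, wait for 2 acks), assuming the quorums are tuned per namespace through the admin API rather than in the broker configuration, and using the `public/default` namespace from the earlier error log:
```
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.common.policies.data.PersistencePolicies;

PulsarAdmin admin = PulsarAdmin.builder()
        .serviceHttpUrl("http://localhost:8080")   // assumption: your broker's admin URL
        .build();

// ensemble=3, writeQuorum=3, ackQuorum=2: each entry is written to 3 bookies and the
// write completes once 2 of them acknowledge, pruning the slowest node per request.
// The last argument is the mark-delete rate limit; 0.0 leaves it unthrottled.
admin.namespaces().setPersistence("public/default",
        new PersistencePolicies(3, 3, 2, 0.0));
```
The other option, `journalSyncData=false` in `bookkeeper.conf`, skips the per-write fsync, so an abrupt machine or power failure can lose the most recently written entries; it is a deployment-level trade-off rather than a client-side setting.
----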
2018-07-03 16:52:57 UTC - Matteo Merli: > How would you describe 'bumpy'? An avg of 10 ms is fair enough

Avg is typically fine, especially at a normal rate, though at a sustained rate of > 100 MB/s per node the SSD fsync 99pct latency will occasionally spike up to ~100ms every ~1min and then go back to ~2ms
----
2018-07-03 16:54:24 UTC - Idan: That's durable, I think. OK, I'll perform a serious load test and share statistics
----
2018-07-03 16:54:34 UTC - Idan: Perhaps you could recommend another AWS disk type to us?
----
2018-07-03 16:55:12 UTC - Matteo Merli: If you want to keep the fsync behavior, I recommend using VMs with locally attached SSDs
----
2018-07-03 16:56:16 UTC - Idan: All our systems are on AWS
----
2018-07-03 16:56:23 UTC - Idan: we won't be able to do that
----
2018-07-03 16:57:10 UTC - Rasty Turek: You can, however, use a local SSD as a cache in front of your HDD
----
2018-07-03 16:57:13 UTC - Matteo Merli: You have several VM types with local disks, e.g. `i3.*`
----
2018-07-03 16:59:43 UTC - Matteo Merli: <http://i3.xxx|i3.xxx> all have local 
SSDs. There are other options with local HDDs d2 / h1 / r3
----
2018-07-03 17:02:07 UTC - Idan: Available in AWS?
----
2018-07-03 17:08:20 UTC - Matteo Merli: yes, these are all EC2 VM types : 
<https://aws.amazon.com/ec2/instance-types/> check “Storage optimized”
----
2018-07-03 17:43:56 UTC - Daniel Ferreira Jorge: I'm having a pretty hard time with a specific use case here. We use Couchbase and we are tailing the Couchbase replication logs and sending them to Pulsar (this tailing gives me at-least-once delivery).

A Couchbase cluster is always divided into 1024 partitions (vBuckets). Inside one partition, we have guaranteed ordering and an incrementing transaction number.

To be able to put each of these transactions into Pulsar exactly once, I enabled Pulsar de-duplication, created one topic and one producer for each Couchbase partition (each producer publishes messages from one Couchbase partition), and used the Couchbase transaction number as the producer sequence id.

It is working perfectly: I have 1024 topics mirroring each of Couchbase's 1024 partitions, and the messages are being published exactly once.

Now I need to consume these 1024 topics. Obviously I went with a `topicsPattern` *but*, the problem is that I was not able to create a new subscription and start consuming from the beginning using `subscriptionInitialPosition`. I also tried a list of topics using `topics()`. The `subscriptionInitialPosition` only works if I create a consumer that subscribes to a single topic.

Is there a way to achieve that?
----
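A rough sketch of the producer side of that setup, assuming de-duplication is already enabled on the namespace; the topic and producer names below are hypothetical, and the important parts are the stable per-vBucket producer name and using the Couchbase transaction number as the sequence id:
```
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;

PulsarClient client = PulsarClient.builder()
        .serviceUrl("pulsar://localhost:6650")   // assumption
        .build();

int vbucket = 42;   // one topic + one producer per Couchbase vBucket
Producer<byte[]> producer = client.newProducer()
        .topic("persistent://public/default/vbucket-" + vbucket)   // hypothetical topic name
        .producerName("couchbase-vbucket-" + vbucket)               // stable name, needed for de-duplication
        .create();

// The Couchbase transaction number becomes the sequence id, so the broker can
// drop duplicates when the replication-log tailer redelivers a mutation.
long txnNumber = 1001L;            // taken from the replication log
byte[] payload = new byte[0];      // the serialized mutation
producer.newMessage()
        .sequenceId(txnNumber)
        .value(payload)
        .send();
```
----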
2018-07-03 17:44:11 UTC - Idan: @Matteo Merli thanks, I'll take a look and come back with responses
----
2018-07-03 18:01:32 UTC - Matteo Merli: @Daniel Ferreira Jorge I think `subscriptionInitialPosition` should ideally work even with `topicsPattern`, since it only allows specifying either Earliest or Latest and not a specific message id. If that's not the case, we should fix it
----
2018-07-03 18:03:32 UTC - Matteo Merli: As a workaround, you could subscribe to all topics individually, with a single message listener. That would give almost the same behavior as the regex/multi-topics subscribe
----
2018-07-03 18:05:02 UTC - Daniel Ferreira Jorge: @Matteo Merli I believe this is the case. I ran many tests here and `topicsPattern()` does not work, while `topic()` works as expected.
----
2018-07-03 18:05:25 UTC - Matteo Merli: Update: we just saw that the problem is that the `subscriptionInitialPosition` config is not propagated correctly to the multi-topics consumer. @Sijie Guo is opening an issue
----
2018-07-03 18:05:42 UTC - Matteo Merli: we should have a fix quickly
----
2018-07-03 18:06:18 UTC - Daniel Ferreira Jorge: ahhh that is great!
----
2018-07-03 18:07:17 UTC - Daniel Ferreira Jorge: for now, do you have a quick 
example on how to subscribe to all topics individually?
----
2018-07-03 18:08:09 UTC - Matteo Merli: as you mentioned, you get the list of topics, either statically or using the API, and create a new consumer for each of them
----
2018-07-03 18:08:54 UTC - Matteo Merli: on the `ConsumerBuilder`, specify the `messageListener()` to receive callbacks for messages from any topic
----
2018-07-03 18:09:49 UTC - Daniel Ferreira Jorge: My consumer initialization is this:
```
pulsarClient.newConsumer()
    .consumerName("vbucket-consumer")
    .topicsPattern(pattern)
    .subscriptionName("vbucket-to-objects5")
    .subscriptionType(SubscriptionType.Shared)
    .subscriptionInitialPosition(SubscriptionInitialPosition.Earliest)
    .messageListener(new MQListener())
    .receiverQueueSize(1)
    .subscribe();
```
----
2018-07-03 18:10:11 UTC - Sijie Guo: fyi - 
<https://github.com/apache/incubator-pulsar/issues/2077> this is the issue for 
tracking the problem. we will try to include this as part of 2.1 release.
----
2018-07-03 18:10:13 UTC - Daniel Ferreira Jorge: But if I create 1024 
consumers, it is too expensive
----
2018-07-03 18:10:58 UTC - Matteo Merli: You can share the same `new 
MQListener()` instance
----
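A sketch of that workaround, assuming a fixed naming scheme for the 1024 topics (hypothetical here) and reusing the `MQListener` from the snippet above as the single shared listener:
```
import java.util.ArrayList;
import java.util.List;
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.MessageListener;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.SubscriptionInitialPosition;
import org.apache.pulsar.client.api.SubscriptionType;

PulsarClient pulsarClient = PulsarClient.builder()
        .serviceUrl("pulsar://localhost:6650")   // assumption
        .build();

// One listener instance shared by every consumer.
MessageListener<byte[]> listener = new MQListener();

List<Consumer<byte[]>> consumers = new ArrayList<>();
for (int vbucket = 0; vbucket < 1024; vbucket++) {
    consumers.add(pulsarClient.newConsumer()
            .topic("persistent://public/default/vbucket-" + vbucket)   // hypothetical topic names
            .subscriptionName("vbucket-to-objects5")
            .subscriptionType(SubscriptionType.Shared)
            .subscriptionInitialPosition(SubscriptionInitialPosition.Earliest)
            .messageListener(listener)
            .receiverQueueSize(1)
            .subscribe());
}
```
----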
2018-07-03 18:11:24 UTC - Daniel Ferreira Jorge: @Sijie Guo ah... that is great!
----
2018-07-03 18:11:45 UTC - Daniel Ferreira Jorge: @Matteo Merli I will try that!
----
2018-07-03 18:11:52 UTC - Daniel Ferreira Jorge: Thanks guys
----
2018-07-03 18:14:55 UTC - Daniel Ferreira Jorge: @Matteo Merli The workaround 
worked great. Thanks!
----
2018-07-03 18:15:11 UTC - Matteo Merli: :+1:
----
2018-07-03 21:16:31 UTC - Grant Wu: @Grant Wu has joined the channel
----
2018-07-03 21:25:36 UTC - Grant Wu: I’m having trouble using the website
----
2018-07-03 21:26:10 UTC - Grant Wu: @Grant Wu uploaded a file: 
<https://apache-pulsar.slack.com/files/UBHR9CH5E/FBJLQS2N9/screen_shot_2018-07-03_at_17.24.26.png|Screen
 Shot 2018-07-03 at 17.24.26.png> and commented: The page doesn’t scroll down 
to allow me to view all the links in the accordion; and the footer blocks things
----
2018-07-03 21:35:16 UTC - Sijie Guo: @Grant Wu I think there are some problems with overlays in the sidebar. We are aware of the problem, and there is actually someone working on improving the website in general.

For this specific issue, a temporary workaround is to zoom out on the webpage (on Mac/Chrome, it is Command and '-'), so the sidebar can fit on your screen. I know it is a bit inconvenient :disappointed:
----
2018-07-03 21:35:52 UTC - Grant Wu: Thanks, just making sure you were aware 
:slightly_smiling_face:
ok_hand : Sijie Guo
----
2018-07-03 21:37:05 UTC - Grant Wu: I just started my first full-time job yesterday, and I'm going to be working with Pulsar for part of what I'm doing. I was wondering if anyone had any suggestions for something I've been asked to implement
----
2018-07-03 21:37:26 UTC - Grant Wu: We want to be able to get timestamp ranges 
of messages from Pulsar, i.e. all messages for a topic sent between two 
timestamps
----
2018-07-03 21:38:04 UTC - Grant Wu: Is there anything better than storing a subsampled, approximate timestamp-to-message-ID mapping for this?
----
2018-07-03 21:38:52 UTC - Grant Wu: Also, apologies in advance if I've misunderstood anything about Pulsar
----
2018-07-03 21:40:05 UTC - Sijie Guo: @Grant Wu is the timestamp your 
application’s timestamp? or publish timestamp of a message?
----
2018-07-03 21:41:46 UTC - Grant Wu: The latter
----
2018-07-03 21:42:59 UTC - Matteo Merli: You could use 
<http://pulsar.apache.org/api/admin/org/apache/pulsar/client/admin/Topics.html#resetCursor-java.lang.String-java.lang.String-long->
 to position a consumer on a particular timestamp of messages
----
2018-07-03 21:43:13 UTC - Grant Wu: Hrm, interesting
----
2018-07-03 21:43:20 UTC - Grant Wu: I don’t think we’re using a Java client
----
2018-07-03 21:43:24 UTC - Grant Wu: Let me look through the other libraries…
----
2018-07-03 21:43:30 UTC - Matteo Merli: and then scan until you get messages 
after your upper bound
----
2018-07-03 21:44:03 UTC - Matteo Merli: reset cursor is part of Admin API, you 
can access it through REST, Java and CLI
----
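A sketch of that flow in Java, with hypothetical topic/subscription names; the subscription is assumed to already exist, `resetCursor` moves it to the message published at or before the lower bound, and the consumer then scans until the publish time passes the upper bound:
```
import java.util.concurrent.TimeUnit;
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.Message;
import org.apache.pulsar.client.api.PulsarClient;

String topic = "persistent://public/default/my-topic";   // hypothetical
String subscription = "range-reader";                    // hypothetical, must already exist
long fromMillis = 1530576000000L;                         // lower bound (publish time)
long toMillis = 1530579600000L;                           // upper bound (publish time)

PulsarAdmin admin = PulsarAdmin.builder()
        .serviceHttpUrl("http://localhost:8080")          // assumption
        .build();
// Reposition the subscription's cursor to the lower bound.
admin.topics().resetCursor(topic, subscription, fromMillis);

PulsarClient client = PulsarClient.builder()
        .serviceUrl("pulsar://localhost:6650")            // assumption
        .build();
Consumer<byte[]> consumer = client.newConsumer()
        .topic(topic)
        .subscriptionName(subscription)
        .subscribe();

// Scan forward, stopping once publish timestamps pass the upper bound.
while (true) {
    Message<byte[]> msg = consumer.receive(5, TimeUnit.SECONDS);
    if (msg == null || msg.getPublishTime() > toMillis) {
        break;
    }
    // ... handle the message ...
    consumer.acknowledge(msg);
}
```
----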
2018-07-03 21:48:19 UTC - Grant Wu: Hrm… my supervisor said there were concerns 
about that affecting more clients than we want, but I’m not sure how this 
interacts with the at most once delivery of messages
----
2018-07-03 21:48:30 UTC - Grant Wu: Uh I’ll ask him…
----
2018-07-03 22:01:12 UTC - Grant Wu: Uh, just to clarify something -
----
2018-07-03 22:01:21 UTC - Grant Wu: Can there be more than one subscription to 
a particular topic?
----
2018-07-03 22:01:36 UTC - Matteo Merli: yes
----
2018-07-03 22:06:53 UTC - Grant Wu: Hrm and as I understand it this is on a per 
subscription basis?
----
2018-07-03 22:07:18 UTC - Grant Wu: are there any before/after guarantees for 
the timestamp?
----
2018-07-03 22:07:42 UTC - Grant Wu: or is it just the message with the lowest 
absolute difference timestamp
----
2018-07-03 22:08:16 UTC - Matteo Merli: the subscription will get positioned on the message with timestamp <= the specified parameter
----
2018-07-03 22:08:58 UTC - Grant Wu: ah okay. Might be useful to clarify that in the docs; they don't seem to explicitly state that
----
