2018-02-22 19:16:18 UTC - Sijie Guo: @Masakazu Kitajo: I think @Matteo Merli 
set up a jenkins job for the slack digest. so if you have jenkins access, 
you can access that
----
2018-02-22 19:17:09 UTC - Matteo Merli: Yes, I replied on the mailing list.. it 
was my mistake on the cron schedule…
----
2018-02-22 19:17:20 UTC - Matteo Merli: it went off every min for 1h
----
2018-02-22 19:18:27 UTC - Sijie Guo: oh i see
----
2018-02-22 19:25:14 UTC - Sijie Guo: @SansWord Huang sorry for late response. 
just saw your replies. 

- you don’t need large capacity for the journal disk. but it is critical to 
latency, so you probably want an hdd with a battery-backup-unit or an ssd, 
since it is doing fsyncs. ledger disks are basically the place where the data 
is eventually stored. so you need to calculate that based on how much data you 
are going to store.
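
As a rough illustration of that sizing calculation (not from the thread): the sketch below assumes the stored volume is roughly ingest rate × retention period × number of replicas, spread evenly across bookies. The function name and all of its parameters are hypothetical, and it ignores index overhead, compaction, and uneven placement.

```python
def ledger_disk_per_bookie_bytes(ingest_bytes_per_s, retention_s,
                                 replicas, num_bookies):
    """Estimate ledger disk needed on each bookie (hypothetical helper).

    Assumes every entry is stored `replicas` times and that the data is
    spread evenly over `num_bookies` bookies.
    """
    if num_bookies <= 0:
        raise ValueError("num_bookies must be positive")
    total_bytes = ingest_bytes_per_s * retention_s * replicas
    return total_bytes / num_bookies


if __name__ == "__main__":
    # Example inputs (made up): 10 MB/s ingest, 1 day retention,
    # 2 replicas, 3 bookies.
    need = ledger_disk_per_bookie_bytes(10 * 10**6, 24 * 3600, 2, 3)
    print(f"{need / 10**9:.0f} GB per bookie")
```

It is probably wise to leave headroom on top of such an estimate, since a full ledger disk can stop a bookie.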

> data first written into journal disc then “flush” into ledger storage?

yes, data is first written to the journal disk and the write is acknowledged 
once the data is fsynced to the journal disk. the data is then asynchronously 
indexed and flushed back to ledger storage.

> when will data be rebalanced?

if you are using bookkeeper in the log/messaging workload, you typically don’t 
need data rebalance, because as new ledgers are created, old ledgers are 
deleted once their data has expired due to retention. 

however if you are using bookkeeper for long term storage, you might need some 
sort of data rebalance. There was a BP (bookkeeper proposal) in bookkeeper for 
that purpose. but you can still do it using autorecovery to rebalance the data 
manually. 

so this topic depends on what you are using pulsar (bookkeeper) for.

> how do I know my data is already replicated?

if a ledger is underreplicated, it will be listed in zookeeper under an 
`underreplicated` znode. there is a bookkeeper CLI and also metrics for that as 
well.
----
2018-02-22 19:57:02 UTC - Karthik Palanivelu: @Sijie Guo I cannot use the 
deployment model directly from source code. ASG is an autoscaling group which 
will bring up a new instance in case of a node failure. In the ZK case, if a 
node fails, a new ZK node comes up with a different IP. This needs to be 
updated as a ZK node in pulsar by replacing the old IP. To avoid it, AWS allows 
us to generate static IPs. We can use these IPs for ZK so that we can hard code 
them in Pulsar. In this scenario, if a ZK node fails, the new ZK node comes up 
with an assigned IP. I am checking whether there is a better way to handle this 
scenario?
----
2018-02-22 20:00:55 UTC - Matteo Merli: @Karthikeyan Palanivelu If you’re using 
ASG with ZK nodes, you could also assign DNS names to each ZK server. That way 
there’s no need to change the configuration in other ZK ensemble members when a 
node is replaced with one with a different IP
----
2018-02-22 20:57:02 UTC - Karthik Palanivelu: @Matteo Merli we have a separate 
system to manage DNS which makes one more point of failure.
----
2018-02-22 22:54:41 UTC - Matteo Merli: Sure, I was thinking more of the AWS 
managed DNS
----
2018-02-23 06:14:58 UTC - SansWord Huang: @Sijie Guo Thanks for all the 
answers, these help me a lot in understanding how Pulsar works.
----
2018-02-23 06:20:43 UTC - SansWord Huang: @SansWord Huang uploaded a file: 
<https://apache-pulsar.slack.com/files/U9CDBEH1P/F9D3G828H/bookie_restart_error|bookie_restart_error>
 and commented: In an experiment today I put too many messages into Pulsar and 
the bookie nodes shut down.
After extending the storage they use, I tried to restart all bookies.
And here come two problems:
1. I've skipped all messages using pulsar-admin, when will disc space be 
released? 
2. one of my bookie nodes cannot restart, with the following error message. 
what can I do?
----
2018-02-23 07:00:43 UTC - SansWord Huang: @SansWord Huang commented on 
@SansWord Huang’s file 
<https://apache-pulsar.slack.com/files/U9CDBEH1P/F9D3G828H/bookie_restart_error|bookie_restart_error>:
 For the first question: once I produce more messages, old ledgers are 
deleted and disk space is released.
----
2018-02-23 07:01:28 UTC - SansWord Huang: @SansWord Huang commented on 
@SansWord Huang’s file 
<https://apache-pulsar.slack.com/files/U9CDBEH1P/F9D3G828H/bookie_restart_error|bookie_restart_error>:
 For the second, I still don’t know why, but I decided to delete this node’s 
journal and ledgers and start again.
----
2018-02-23 07:18:56 UTC - Sijie Guo: @SansWord Huang sorry for late response. 
just saw the message now.
----
2018-02-23 07:19:55 UTC - Sijie Guo: for the first question 1) the ledgers are 
deleted when new ledgers are rolled. new ledgers are rolled based on time or 
size. so if you produce new messages, it will trigger ledger rolling, which 
will then delete ledgers that are already skipped.
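
The behaviour described here, where skipped data is only freed at the next rollover, can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Pulsar's implementation: the class `ToyManagedLedger` and its methods are made up for illustration, rollover is triggered by entry count only (standing in for the time or size limits), and there is a single consumer position.

```python
class ToyManagedLedger:
    """Toy model: closed ledgers are trimmed only when a new ledger rolls."""

    def __init__(self, max_entries_per_ledger):
        self.max_entries_per_ledger = max_entries_per_ledger
        self.closed = []          # (ledger_id, last_entry_id) of closed ledgers
        self.current_id = 0       # id of the ledger currently being written
        self.current_count = 0    # entries in the current ledger
        self.next_entry = 0       # id the next published entry will get
        self.consumed_up_to = -1  # highest entry id already skipped/acked

    def skip_all(self):
        # Marks everything published so far as consumed; frees nothing yet.
        self.consumed_up_to = self.next_entry - 1

    def publish(self):
        self.next_entry += 1
        self.current_count += 1
        if self.current_count >= self.max_entries_per_ledger:
            self._roll()

    def _roll(self):
        # Close the current ledger and open a new one.
        self.closed.append((self.current_id, self.next_entry - 1))
        self.current_id += 1
        self.current_count = 0
        # Trimming happens here: drop closed ledgers that are fully consumed.
        self.closed = [(lid, last) for lid, last in self.closed
                       if last > self.consumed_up_to]


if __name__ == "__main__":
    ml = ToyManagedLedger(max_entries_per_ledger=5)
    for _ in range(12):
        ml.publish()
    ml.skip_all()
    print("after skip:", ml.closed)       # old ledgers still present
    for _ in range(3):
        ml.publish()                      # fills the current ledger, forcing a roll
    print("after rollover:", ml.closed)   # fully skipped ledgers are gone
```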
----
2018-02-23 07:21:18 UTC - Sijie Guo: for the second question 2) it seems that 
while replaying the journal, it tried to replay the entries and encountered 
issues inserting those entries. I am wondering if your disks were full at 
that time?
----
2018-02-23 08:11:50 UTC - SansWord Huang: yes, I noticed that even after I 
expanded the disc, it was not enough for the journal to replay.
so the quickest way was to delete the data and restart this bookkeeper node.

but the lessons I learned are:
1. I should really use separate discs for journal and ledgers.
2. if not doing so, I should save some space for ledgers to be able to play 
back while the journal is growing.
----
2018-02-23 08:18:47 UTC - Sijie Guo: yeah i see
----
2018-02-23 10:43:31 UTC - Till Rathschlag: @Till Rathschlag has joined the 
channel
----
2018-02-23 10:55:27 UTC - Till Rathschlag: Hello everybody, I'm currently 
evaluating pulsar and I am trying to understand if it fits the following use 
case: I would like to use pulsar (among others) as a task queue. I want my task 
producer to generate only as many jobs as the consumers can work on, so I need 
some kind of communication consumers -> producer. I tried to build this with 
acknowledging but noticed that this is only propagated to pulsar and not back 
to the producer. So my question is, how would I do this? I thought about the 
following:
- Provide some other topic for job acknowledging
- Monitor the ack-ratio from the producer service
Is pulsar the right tool for this? I would be glad if someone can share their 
experience with this. Thanks in advance!
----
2018-02-23 16:58:33 UTC - Matteo Merli: @Till Rathschlag The primary function 
of a messaging system is to decouple the producers and the consumers and that’s 
why we don’t have correlation of consumer acks to producers 
:slightly_smiling_face:

However, if you’re not requiring exact precision, you can try using backlog 
quota to stop the producer. 
You can configure a very low quota (eg: 10MB or 1MB…) and the default action is 
to block the producers when the consumers accumulate that amount of “backlog” 
in the queue. 

I’m saying it’s not precise because the check for quota is only done 
periodically in background (every 1min by default I think) for efficiency 
reasons, so a user can go a bit over quota before getting stopped. 
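
To get a feel for how far over quota a producer can go, here is a back-of-the-envelope sketch. It assumes the check runs at a fixed interval (60 seconds, taken from the unverified guess above), that the producer publishes at a constant rate, and that consumers drain nothing in the meantime. The function name is hypothetical.

```python
def backlog_when_blocked(publish_rate_bytes_per_s, quota_bytes,
                         check_interval_s=60):
    """Backlog size at the first periodic check that sees the quota exceeded.

    Simplified model: constant publish rate, no consumption, and the quota
    is only evaluated once every `check_interval_s` seconds.
    """
    if publish_rate_bytes_per_s <= 0 or check_interval_s <= 0:
        raise ValueError("rate and interval must be positive")
    backlog = 0
    while True:
        # Everything published since the previous check lands in the backlog.
        backlog += publish_rate_bytes_per_s * check_interval_s
        if backlog > quota_bytes:
            return backlog


if __name__ == "__main__":
    MB = 1024 * 1024
    # Made-up example: 1 MB/s producer against a 10 MB quota.
    blocked_at = backlog_when_blocked(1 * MB, 10 * MB)
    print(f"blocked at {blocked_at // MB} MB of backlog")
```

Under these assumptions a fast producer can overshoot a small quota by a wide margin, which is why this approach only suits cases that tolerate imprecision.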

If you need finer control, you could use a 2nd topic. For example: 
 * Consumer gets a message, processes it
 * Consumer sends confirmation on the 2nd topic (referring to a particular 
msgId for 1st topic)
 * Consumer acks the message

Producer can do a kind of “semaphore” limiting the number of “in-processing” 
messages, by waiting for confirmations on the 2nd topic. This could work even 
if there are multiple producers, because you can ignore msg Ids that were 
published by other producers
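
A minimal sketch of this semaphore idea, using in-memory queues as stand-ins for the two topics so that it runs without a broker. Everything here (`run_demo`, the message dictionaries, the producer names) is hypothetical and is not the Pulsar client API; with a real client the queue operations would become send, receive, and acknowledge calls.

```python
import queue
import threading


def run_demo(num_jobs=20, max_in_flight=3, num_workers=2):
    """Publish num_jobs jobs, keeping at most max_in_flight unconfirmed."""
    jobs = queue.Queue()           # stand-in for the 1st (job) topic
    confirmations = queue.Queue()  # stand-in for the 2nd (confirmation) topic
    permits = threading.Semaphore(max_in_flight)
    lock = threading.Lock()
    my_id = "producer-A"
    in_flight = 0
    peak = 0
    confirmed = 0

    def worker():
        while True:
            msg = jobs.get()
            if msg is None:  # shutdown marker used only by this demo
                return
            # ... process the job here ...
            # Send the confirmation on the 2nd topic, then ack the job
            # (acking is a no-op for an in-memory queue).
            confirmations.put({"producer": msg["producer"],
                               "msg_id": msg["msg_id"]})

    def confirmation_listener():
        nonlocal in_flight, confirmed
        while confirmed < num_jobs:
            conf = confirmations.get()
            if conf["producer"] != my_id:
                continue  # published by another producer: ignore it
            with lock:
                in_flight -= 1
            confirmed += 1
            permits.release()  # lets the producer publish one more job

    # A confirmation for another producer's message, to show it is skipped.
    confirmations.put({"producer": "producer-B", "msg_id": 999})

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    listener = threading.Thread(target=confirmation_listener)
    for t in workers + [listener]:
        t.start()

    for msg_id in range(num_jobs):
        permits.acquire()  # blocks while max_in_flight jobs are unconfirmed
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        jobs.put({"producer": my_id, "msg_id": msg_id})

    listener.join()
    for _ in workers:
        jobs.put(None)
    for t in workers:
        t.join()
    return peak, confirmed


if __name__ == "__main__":
    peak, confirmed = run_demo()
    print(f"confirmed={confirmed}, peak in-flight={peak}")
```

`run_demo()` returns the peak number of unconfirmed jobs and the number of confirmations received; the peak never exceeds `max_in_flight` because a permit is taken before each publish and returned only after the matching confirmation arrives.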
----
