2018-01-19 03:02:35 UTC - Jaebin Yoon: @Matteo Merli @Sijie Guo I'm setting up 
bookies on AWS d2.4xlarge instances (16 cores, 122G memory, 12x2TB raid-0 hd). 
Do you have any recommendations for memory configuration for this kind of setup, e.g. Java heap, direct memory, and dbStorage_writeCacheMaxSizeMb, dbStorage_readAheadCacheMaxSizeMb, dbStorage_rocksDB_blockCacheSize?
BTW, I'm going to use journalSyncData=false since we cannot recover machines 
when they shut down, so no fsync is required for every message.
----
2018-01-19 03:14:43 UTC - Matteo Merli: Since the VM has a lot of RAM you can 
increase a lot from the defaults and leave the rest to the page cache. For JVM 
heap I'd say ~24g. WriteCacheMaxSize and ReadAheadCacheMaxSize both come out of 
JVM direct memory. I'd say to start with 16g each. For the RocksDB block cache, 
which is allocated in JNI so it's completely outside the JVM configuration, 
ideally you want to cache most of the indexes. I'd say 4GB should be enough to 
index all the data in the 24TB storage space.
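Matteo's suggested sizing, written out as a sketch (the bookkeeper.conf keys match the ones posted later in the thread; the direct-memory limit is an assumption, sized to leave headroom for Netty buffers on top of the two caches):

```
# JVM sizing (24g heap; direct memory must cover both caches plus Netty buffers)
-Xms24g -Xmx24g
-XX:MaxDirectMemorySize=34g

# bookkeeper.conf
dbStorage_writeCacheMaxSizeMb=16384        # 16g write cache (direct memory)
dbStorage_readAheadCacheMaxSizeMb=16384    # 16g read-ahead cache (direct memory)
dbStorage_rocksDB_blockCacheSize=4294967296  # 4g, allocated via JNI outside the JVM
```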
----
2018-01-19 03:19:39 UTC - Jaebin Yoon: alright. Thanks @Matteo Merli for the 
quick response! Let me try that. And I'm going to use m4.2xlarge for brokers 
(8 cores, 32G).
----
2018-01-19 03:27:45 UTC - Matteo Merli: No prob. If you post the final settings 
I can take a look as well
----
2018-01-19 03:28:12 UTC - Jaebin Yoon: sounds good. thanks!
----
2018-01-19 03:38:42 UTC - YANGLiiN: @YANGLiiN has joined the channel
----
2018-01-19 04:43:26 UTC - Jaebin Yoon: @Matteo Merli here is the bookie 
configuration I'm going to use on d2.4xlarge :

```
== Bookie JVM options

-server
-Dsnappy.bufferSize=32768
-Dlog4j.configuration=file:///apps/pulsarbookie/conf/log4j.properties
-XX:+UseCompressedOops
-XX:+DisableExplicitGC
-Xms24g
-Xmx24g
-XX:MaxDirectMemorySize=16g
-verbose:gc
-Xloggc:$GCLOG
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=30
-XX:GCLogFileSize=10M
-XX:+PreserveFramePointer
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Djava.awt.headless=true
-XX:+UseG1GC
-XX:MaxGCPauseMillis=10
-XX:+ParallelRefProcEnabled
-XX:+UnlockExperimentalVMOptions
-XX:+AggressiveOpts
-XX:+DoEscapeAnalysis
-XX:ParallelGCThreads=32
-XX:ConcGCThreads=32
-XX:G1NewSizePercent=50
-XX:-ResizePLAB
-Djute.maxbuffer=10485760
-Djava.net.preferIPv4Stack=true
-Dio.netty.leakDetectionLevel=disabled
-Dio.netty.recycler.maxCapacity.default=1000
-Dio.netty.recycler.linkCapacity=1024


== Bookie Configuration

dbStorage_writeCacheMaxSizeMb=4096
dbStorage_readAheadCacheMaxSizeMb=4096
dbStorage_rocksDB_blockCacheSize=4294967296

readBufferSizeBytes=4096
writeBufferSizeBytes=65536

journalSyncData=false

# defaults
ledgerStorageClass=org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage
entryLogFilePreallocationEnabled=true
logSizeLimit=2147483648
minorCompactionThreshold=0.2
minorCompactionInterval=3600
majorCompactionThreshold=0.5
majorCompactionInterval=86400
compactionMaxOutstandingRequests=100000
compactionRate=1000
isThrottleByBytes=false
compactionRateByEntries=1000
compactionRateByBytes=1000000
journalMaxSizeMB=2048
journalMaxBackups=5
journalPreAllocSizeMB=16
journalWriteBufferSizeKB=64
journalRemoveFromPageCache=true
journalAdaptiveGroupWrites=true
journalMaxGroupWaitMSec=1
journalAlignmentSize=4096
journalBufferedWritesThreshold=524288
journalFlushWhenQueueEmpty=false
numJournalCallbackThreads=8
rereplicationEntryBatchSize=5000
gcWaitTime=900000
gcOverreplicatedLedgerWaitTime=86400000
flushInterval=60000
bookieDeathWatchInterval=1000
zkTimeout=30000
serverTcpNoDelay=true
openFileLimit=0
pageLimit=0
readOnlyModeEnabled=true
diskUsageThreshold=0.95
diskCheckInterval=10000
auditorPeriodicCheckInterval=604800
auditorPeriodicBookieCheckInterval=86400
numAddWorkerThreads=0
numReadWorkerThreads=8
maxPendingReadRequestsPerThread=2500
useHostNameAsBookieID=false

dbStorage_readAheadCacheBatchSize=1000
dbStorage_rocksDB_writeBufferSizeMB=64
dbStorage_rocksDB_sstSizeInMB=64
dbStorage_rocksDB_blockSize=65536
dbStorage_rocksDB_bloomFilterBitsPerKey=10
dbStorage_rocksDB_numLevels=-1
dbStorage_rocksDB_numFilesInLevel0=4
dbStorage_rocksDB_maxSizeInLevel1MB=256```
----
2018-01-19 07:21:31 UTC - DengJian: @DengJian has joined the channel
----
2018-01-19 17:56:08 UTC - Jaebin Yoon: When there are multiple consumers on a 
topic, does the broker read once from bookies and send to all consumers from 
some buffer, or does it go to the bookies every time for each consumer?
----
2018-01-19 17:56:47 UTC - Matteo Merli: In general, all dispatching is done 
directly from broker memory
----
2018-01-19 17:57:13 UTC - Matteo Merli: we only read from bookies when 
consumers are falling behind
----
2018-01-19 17:57:39 UTC - Jaebin Yoon: ah I see. That's great. I'm trying to 
understand the network implications of the high fan-out case.
----
2018-01-19 17:58:08 UTC - Matteo Merli: in that case it depends on the broker 
cache; if they’re reading close together they’ll probably be cached anyway. 
Otherwise it goes back to the bookies
----
2018-01-19 17:58:29 UTC - Jaebin Yoon: is the broker cache size configurable?
----
2018-01-19 17:58:41 UTC - Matteo Merli: yes
----
2018-01-19 17:59:17 UTC - Jaebin Yoon: Which configuration is it? This is 
definitely something I need to tweak for the high fan-out case.
----
2018-01-19 18:00:02 UTC - Matteo Merli: @Matteo Merli uploaded a file: 
<https://apache-pulsar.slack.com/files/U680ZCXA5/F8WGEBS30/-.sh|Untitled>
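For reference, the broker cache discussed here is sized in broker.conf; the option names below are assumed from the Pulsar broker configuration, and the values are only illustrative:

```
# broker.conf
managedLedgerCacheSizeMB=1024            # entry cache shared by all topics on the broker
managedLedgerCacheEvictionWatermark=0.9  # evict entries once the cache passes 90% full
```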
----
2018-01-19 18:04:46 UTC - Matteo Merli: Yes, the config looks good. One other 
“improvement”, since you’re disabling the fsync, could be to reduce 
`journalMaxGroupWaitMSec` to 0 or 0.1 to avoid the “minimal” group-commit 
latency.
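Applied to the config above, the two journal lines in question would read (a sketch of just this change):

```
journalSyncData=false
journalMaxGroupWaitMSec=0
```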
----
2018-01-19 19:08:18 UTC - Fred Monroe: hi everyone, I apologize if this is a 
very noob question: is there a simple example and client library somewhere for 
publishing to Apache Pulsar from Go (the programming language), similar to the 
examples for Python?
----
2018-01-19 19:09:14 UTC - Ali Ahmed: @Fred Monroe there is no official go 
client at this time
----
2018-01-19 19:09:41 UTC - Fred Monroe: ok thanks, i looked around a little, 
just wanted to make sure i wasn’t missing something
----
2018-01-19 19:21:17 UTC - Jaebin Yoon: Yeah I'm interested in go client too. 
Since c++ client is available, it would be relatively easy to have a wrapper 
for go.
----
2018-01-19 19:25:13 UTC - Matteo Merli: There was some effort of a pure go 
client some time back. I cannot vouch for completeness/stability though: 
<https://github.com/t2y/go-pulsar>
----
2018-01-19 19:26:02 UTC - Matteo Merli: Though yeah, my preference would be to 
have a C++ based wrapper. That would ensure to start from a mature library and 
have all features available.
----
2018-01-19 19:26:32 UTC - Matteo Merli: I don’t know how bad it is to 
distribute Go libraries with native components
----
2018-01-19 19:29:27 UTC - Jaebin Yoon: yeah that would be challenging. Users 
might have to compile in their environment.  Kafka go client takes that 
approach. <https://github.com/confluentinc/confluent-kafka-go>
----
2018-01-19 19:32:00 UTC - Matteo Merli: is that basically requiring the client 
library to be installed on your system? So there’s no “embedding” of sorts
----
2018-01-19 19:34:10 UTC - Matteo Merli: on that front, Python wheel files are 
nicer!
----
2018-01-19 22:58:53 UTC - Allen Wang: Hello: what configuration to use if we 
want the bookkeeper-ensemble for a namespace to be all bookies in the cluster? 
We want the ensemble to grow as the bookie cluster size grows.
----
2018-01-19 22:59:54 UTC - Matteo Merli: but you still want to write 2 (or 3) 
copies of the data, right?
----
2018-01-19 23:00:00 UTC - Allen Wang: Yes
----
2018-01-19 23:00:27 UTC - Matteo Merli: then the default ensemble size (2) is 
good
----
2018-01-19 23:00:44 UTC - Matteo Merli: that just refers to a particular 
“ledger”
----
2018-01-19 23:01:44 UTC - Allen Wang: I thought the number of copies is 
controlled by bookkeeper-write-quorum
----
2018-01-19 23:01:57 UTC - Matteo Merli: defaults are `ensemble=2`, 
`write-quorum=2`, `ack-quorum=2`. This means: 
for a new ledger, pick any 2 available bookies and write 2 copies and wait for 
2 acks
----
2018-01-19 23:02:44 UTC - Matteo Merli: in this same scenario, if you increase 
the ensemble size, you would be enabling “striping” when writing into a 
specific ledger
----
2018-01-19 23:03:24 UTC - Matteo Merli: eg: `e=5 w=2 a=2` -> Picks 5 bookies 
and writes 2 copies of the data, striping in round-robin across the 5 bookies
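The round-robin striping described here can be sketched in Python (an illustration of the placement rule, not BookKeeper's actual implementation; the starting index per entry is assumed to be `entry_id mod ensemble_size`):

```python
def write_set(entry_id, ensemble_size, write_quorum):
    """Bookie indices an entry is written to under round-robin striping."""
    return [(entry_id + j) % ensemble_size for j in range(write_quorum)]

# e=5, w=2: each entry is stored on 2 of the 5 bookies, rotating per entry
for e in range(6):
    print(e, write_set(e, ensemble_size=5, write_quorum=2))
# 0 [0, 1], 1 [1, 2], 2 [2, 3], 3 [3, 4], 4 [4, 0], 5 [0, 1]
```

With `e=2 w=2 a=2` (the defaults) every entry lands on both bookies, so there is no striping, which is why increasing only the ensemble size enables it.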
----
2018-01-19 23:12:39 UTC - Matteo Merli: Again, this is only for a particular 
ledger (a segment of a topic); over time, data for a topic will be assigned to 
multiple bookies. Every ledger's ensemble pick is unrelated to the previous 
ones
----