Oh whoops, I didn't scroll down and missed the table schema. Thanks! Mauricio's suggestion is a good one. To that I would add: consider increasing the number of hash buckets beyond the current 3.
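To make the hash bucket suggestion concrete, here is a rough sketch of how rows keyed by random UUIDs spread across 3 versus 12 buckets. It assumes one tablet per bucket (as in this table, which has a single range partition), and it uses a simple modulo as a stand-in hash: that is not Kudu's actual hash function, and the 30,000-row workload is made up for illustration.

```python
import random
import uuid
from collections import Counter


def bucket_for(key: uuid.UUID, num_buckets: int) -> int:
    # Stand-in hash for illustration only; this is NOT Kudu's hash function.
    return key.int % num_buckets


def rows_per_bucket(keys, num_buckets):
    """Count how many rows land in each hash bucket (one tablet per bucket)."""
    counts = Counter(bucket_for(k, num_buckets) for k in keys)
    return [counts[b] for b in range(num_buckets)]


# Hypothetical workload: 30,000 messages keyed by random UUIDs.
rng = random.Random(42)
keys = [uuid.UUID(int=rng.getrandbits(128)) for _ in range(30_000)]

for num_buckets in (3, 12):
    counts = rows_per_bucket(keys, num_buckets)
    print(f"{num_buckets:>2} buckets: {min(counts)}-{max(counts)} rows per tablet")
```

With more buckets, each tablet receives a smaller share of the same write stream, so more tablets (and, if their leaders are spread out, more tablet servers) can absorb writes in parallel. The right bucket count for a real cluster is something to find by experiment.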
Additionally, what does the rest of the primary key look like? _key and event_time are in there, but in what order? UUIDs in particular are usually a poor choice for primary keys because of their random distribution, which all but guarantees lots of compaction during ingest and slows down throughput considerably. How bad it is depends on the arrangement of columns in the primary key, and how that order reflects (or does not reflect) the key order of incoming data.

On Wed, Nov 13, 2019 at 4:00 PM Mauricio Aristizabal <mauri...@impact.com> wrote:

> You should start by making sure each of your 3 hash partition tablets'
> leaders is in each of your 3 nodes. Very well could be all 3 were in the
> same tablet server and you were ingesting into a single node. If needed
> use leader_step_down
> <https://kudu.apache.org/docs/command_line_tools_reference.html#tablet-leader_step_down>
> to move leaders around.
>
> FYI Adar, table schema was at bottom inside that iframe
>
> On Wed, Nov 13, 2019 at 3:24 PM Adar Lieber-Dembo <a...@cloudera.com>
> wrote:
>
>> Some thoughts on how you might increase your write speed:
>> - Don't use the same disk for both WAL and data directories. If you
>> have enough disks, dedicate one for the WAL and the rest for data
>> directories.
>> - Since each disk is an SSD, experiment with a higher ratio of MM
>> threads to data directories. We typically recommend 1:3, but that's
>> for spinning disks. I see you've configured 2 MM threads for the
>> masters but are still using just 1 for the tservers? Consider using
>> 2-4.
>> - How is your schema structured? Are you using hash partitioning?
>> Range partitioning? Both? What's your primary key look like and does
>> incoming data arrive in sorted order (or mostly sorted order) w.r.t.
>> that key? Random order?
>> https://kudu.apache.org/docs/schema_design.html is an excellent
>> resource for understanding how schema can impact writes and reads.
>>
>> On Wed, Nov 13, 2019 at 3:07 PM wei ximing <wxmimpe...@outlook.com>
>> wrote:
>> >
>> > Hi!
>> >
>> > I have some questions about kudu performance tuning.
>> >
>> > Kudu version: kudu 1.7.0-cdh5.16.2
>> >
>> > System memory per node: 256G
>> >
>> > 4 SSDs per machine: 512G
>> >
>> > Three Master nodes and three Tserver nodes.
>> >
>> > // Master config
>> > --fs_wal_dir=/mnt/disk1/kudu/var/wal
>> > --fs_data_dirs=/mnt/disk1/kudu/var/data,/mnt/disk2/kudu/var/data,/mnt/disk3/kudu/var/data,/mnt/disk4/kudu/var/data
>> > --fs_metadata_dir=/mnt/disk1/kudu/var/metadata
>> > --log_dir=/mnt/disk1/kudu/var/logs
>> > --master_addresses=xxxx
>> > --maintenance_manager_num_threads=2
>> > --block_cache_capacity_mb=6144
>> > --memory_limit_hard_bytes=34359738368
>> > --max_log_size=40
>> >
>> > // Tserver config
>> > --fs_wal_dir=/mnt/disk1/kudu/var/wal
>> > --fs_data_dirs=/mnt/disk1/kudu/var/data,/mnt/disk2/kudu/var/data,/mnt/disk3/kudu/var/data,/mnt/disk4/kudu/var/data
>> > --fs_metadata_dir=/mnt/disk1/kudu/var/metadata
>> > --log_dir=/mnt/disk1/kudu/var/logs
>> > --tserver_master_addrs=xxxx
>> > --block_cache_capacity_mb=6144
>> > --memory_limit_hard_bytes=34359738368
>> > --max_log_size=40
>> >
>> > // Table schema
>> > // _key is UUID for each msg
>> > // event_time is data time
>> > // Schema has only 15 columns
>> > // Single message does not exceed 100 bytes
>> >
>> > HASH (_key) PARTITIONS 3,
>> > RANGE (event_time) (
>> >     PARTITION 2019-10-31T16:00:00.000000Z <= VALUES < 2019-11-30T16:00:00.000000Z
>> > )
>> >
>> > I wrote a project to write data to kudu.
>> >
>> > Whether in manual or automatic flush mode, write speed is only 6MB/s.
>> >
>> > I think SSD should be more than this speed, and the network and memory
>> > have not reached the bottleneck.
>> >
>> > Is this the normal level of kudu writing? How to tune it?
>> >
>> >
>> > Thanks.
>> >
>
> --
> Mauricio Aristizabal
> Architect - Data Pipeline
> mauri...@impact.com | 323 309 4260
> https://impact.com
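P.S. On the primary key question at the top of this message, a small sketch of what "key order of incoming data" means in practice. It assumes messages arrive in event_time order, each with a random UUID, and it measures how often the next arrival sorts at or after the previous one under two hypothetical column orders. The row layout and the `in_order_fraction` helper are made up for illustration; this does not model hash partitioning or Kudu's compaction, only how sorted the stream looks relative to each key.

```python
import random
import uuid


def in_order_fraction(rows, key_columns):
    """Fraction of adjacent arrivals whose primary key does not go backwards."""
    keys = [tuple(row[col] for col in key_columns) for row in rows]
    in_order = sum(1 for prev, cur in zip(keys, keys[1:]) if prev <= cur)
    return in_order / (len(keys) - 1)


# Hypothetical stream: messages arrive in event_time order, each with a random UUID.
rng = random.Random(7)
rows = [
    {"_key": str(uuid.UUID(int=rng.getrandbits(128))), "event_time": t}
    for t in range(10_000)
]

key_first = in_order_fraction(rows, ("_key", "event_time"))
time_first = in_order_fraction(rows, ("event_time", "_key"))
print(f"PRIMARY KEY (_key, event_time): {key_first:.2f} of arrivals in key order")
print(f"PRIMARY KEY (event_time, _key): {time_first:.2f} of arrivals in key order")
```

With the UUID first, roughly half of the arrivals sort before their predecessor, so the stream is effectively random with respect to the key. With event_time first, the same stream arrives already sorted. This is a way to reason about the question, not a recommendation for a specific key.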