Hi all,

I'm running Kudu 1.6.0-cdh5.14.2. Looking into the tablet server logs, I find
that most compactions are compacting small rowsets (~40 MB each). For example:


I0615 07:22:42.637351 30614 tablet.cc:1661] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: Compaction: stage 1 complete, picked 4 rowsets to compact
I0615 07:22:42.637385 30614 compaction.cc:903] Selected 4 rowsets to compact:
I0615 07:22:42.637393 30614 compaction.cc:906] RowSet(343)(current size on disk: ~40666600 bytes)
I0615 07:22:42.637401 30614 compaction.cc:906] RowSet(1563)(current size on disk: ~34720852 bytes)
I0615 07:22:42.637408 30614 compaction.cc:906] RowSet(1645)(current size on disk: ~29914833 bytes)
I0615 07:22:42.637415 30614 compaction.cc:906] RowSet(1870)(current size on disk: ~29007249 bytes)
I0615 07:22:42.637428 30614 tablet.cc:1447] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: Compaction: entering phase 1 (flushing snapshot). Phase 1 snapshot: MvccSnapshot[committed={T|T < 6263071556616208384 or (T in {6263071556616208384})}]
I0615 07:22:42.641582 30614 multi_column_writer.cc:103] Opened CFile writers for 124 column(s)
I0615 07:22:43.875396 30614 multi_column_writer.cc:103] Opened CFile writers for 124 column(s)
I0615 07:22:44.418421 30614 multi_column_writer.cc:103] Opened CFile writers for 124 column(s)
I0615 07:22:45.114389 30614 multi_column_writer.cc:103] Opened CFile writers for 124 column(s)
I0615 07:22:54.762563 30614 tablet.cc:1532] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: Compaction: entering phase 2 (starting to duplicate updates in new rowsets)
I0615 07:22:54.773572 30614 tablet.cc:1587] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: Compaction Phase 2: carrying over any updates which arrived during Phase 1
I0615 07:22:54.773599 30614 tablet.cc:1589] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: Phase 2 snapshot: MvccSnapshot[committed={T|T < 6263071556616208384 or (T in {6263071556616208384})}]
I0615 07:22:55.189757 30614 tablet.cc:1631] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: Compaction successful on 82987 rows (123387929 bytes)
I0615 07:22:55.191426 30614 maintenance_manager.cc:491] Time spent running CompactRowSetsOp(6bdefb8c27764a0597dcf98ee1b450ba): real 12.628s user 1.460s sys 0.410s
I0615 07:22:55.191484 30614 maintenance_manager.cc:497] P 70f3e54fe0f3490cbf0371a6830a33a7: CompactRowSetsOp(6bdefb8c27764a0597dcf98ee1b450ba) metrics: {"cfile_cache_hit":812,"cfile_cache_hit_bytes":16840376,"cfile_cache_miss":2730,"cfile_cache_miss_bytes":251298442,"cfile_init":496,"data dirs.queue_time_us":6646,"data dirs.run_cpu_time_us":2188,"data dirs.run_wall_time_us":101717,"fdatasync":315,"fdatasync_us":9617174,"lbm_read_time_us":1288971,"lbm_reads_1-10_ms":32,"lbm_reads_10-100_ms":41,"lbm_reads_lt_1ms":4641,"lbm_write_time_us":122520,"lbm_writes_lt_1ms":2799,"mutex_wait_us":25,"spinlock_wait_cycles":155264,"tcmalloc_contention_cycles":768,"thread_start_us":677,"threads_started":14,"wal-append.queue_time_us":300}


flush_threshold_mb is set to its default value (1024). Shouldn't the flushed
rowsets then be ~1 GB each, rather than the ~30-40 MB seen above?
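
To quantify that, here's a quick sketch (plain Python, nothing Kudu-specific)
summing the four rowset sizes reported in the log above:

import re

log = """
RowSet(343)(current size on disk: ~40666600 bytes)
RowSet(1563)(current size on disk: ~34720852 bytes)
RowSet(1645)(current size on disk: ~29914833 bytes)
RowSet(1870)(current size on disk: ~29007249 bytes)
"""

# Pull out the "~N bytes" figures and total them.
sizes = [int(n) for n in re.findall(r"~(\d+) bytes", log)]
print(sizes)                       # [40666600, 34720852, 29914833, 29007249]
print(sum(sizes) / (1024 * 1024))  # ~128 MiB in total

So even all four compaction inputs together are only ~128 MiB, an order of
magnitude below flush_threshold_mb.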


I think increasing the initial RowSet size would reduce the number of
compactions, and thereby reduce their impact on other ongoing operations. It
might also improve flush performance. Is that right? If so, how can I
increase the RowSet size?
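
For what it's worth, my back-of-envelope reasoning is a toy model (NOT Kudu's
actual compaction policy; the fan-in of 4 is just taken from the log above):

import math

def merge_passes(initial_mb, target_mb, fan_in=4):
    # Toy model: each pass merges `fan_in` similarly sized rowsets, so
    # reaching the target takes roughly log_fan_in(target / initial) passes.
    return max(0, math.ceil(math.log(target_mb / initial_mb, fan_in)))

print(merge_passes(40, 1024))    # 3 passes starting from ~40 MB rowsets
print(merge_passes(256, 1024))   # 1 pass if flushes produced ~256 MB rowsets

If that intuition holds, starting from larger rowsets should cut the amount
of compaction work substantially.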


I'd be grateful if someone could clarify these points!


Thanks,
Quanlong
