Re:Re: Re: Re: Why RowSet size is much smaller than flush_threshold_mb

2018-08-02 Thread Quanlong Huang
No, I failed to tune other flags... That's why I started this thread... I understand it's a trade-off whether to expose the design docs. Not exposing them will make the document clearer. The downside is users may bother you guys more when they encounter problems since there're no answers they

Re: Re: Re: Why RowSet size is much smaller than flush_threshold_mb

2018-08-01 Thread Todd Lipcon
On Wed, Aug 1, 2018 at 4:52 PM, Quanlong Huang wrote:
> In my experience, when I find the performance is below my expectation,
> I'd like to tune the flags listed in
> https://kudu.apache.org/docs/configuration_reference.html, which needs a
> clear understanding of Kudu internals. Maybe we can

Re:Re: Re: Why RowSet size is much smaller than flush_threshold_mb

2018-08-01 Thread Quanlong Huang
In my experience, when I find the performance is below my expectation, I'd like to tune the flags listed in https://kudu.apache.org/docs/configuration_reference.html, which needs a clear understanding of Kudu internals. Maybe we can add the link there? At 2018-08-02 01:06:40,"Todd Lipcon"

Re: Re: Why RowSet size is much smaller than flush_threshold_mb

2018-08-01 Thread Todd Lipcon
On Wed, Aug 1, 2018 at 6:28 AM, Quanlong Huang wrote:
> Hi Todd and William,
>
> I really appreciate your help, and I'm sorry for my late reply. I was
> going to reply with some follow-up questions but was assigned to focus on
> some other work... Now I'm back to this work.
>
> The design docs

Re:Re: Why RowSet size is much smaller than flush_threshold_mb

2018-08-01 Thread Quanlong Huang
Hi Todd and William, I really appreciate your help, and I'm sorry for my late reply. I was going to reply with some follow-up questions but was assigned to focus on some other work... Now I'm back to this work. The design docs are really helpful. Now I understand the flush and compaction. I

Re: Why RowSet size is much smaller than flush_threshold_mb

2018-06-15 Thread William Berkeley
The op seen in the logs is a rowset compaction, which takes existing diskrowsets and rewrites them. It's not a flush, which writes data in memory to disk, so I don't think the flush_threshold_mb is relevant. Rowset compaction is done to reduce the amount of overlap of rowsets in primary key space,
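
To make "overlap of rowsets in primary key space" concrete, here is a minimal Python sketch. It is only an illustration, not Kudu's compaction algorithm: rowsets are modeled as hypothetical inclusive (min_key, max_key) integer ranges, and the helper names `max_overlap` and `merge_overlapping` are made up for this example. The first function counts how many rowsets a lookup for a single key might have to consult in the worst case; the second shows the idealized result of rewriting overlapping ranges into disjoint ones.

```python
from typing import List, Tuple

# Hypothetical model: a rowset is just an inclusive (min_key, max_key) range.
RowSet = Tuple[int, int]


def max_overlap(rowsets: List[RowSet]) -> int:
    """Largest number of rowsets whose key ranges cover the same key."""
    events = []
    for lo, hi in rowsets:
        events.append((lo, 0))  # range opens (0 sorts before 1, so touching ranges overlap)
        events.append((hi, 1))  # range closes
    events.sort()
    depth = best = 0
    for _, kind in events:
        if kind == 0:
            depth += 1
            best = max(best, depth)
        else:
            depth -= 1
    return best


def merge_overlapping(rowsets: List[RowSet]) -> List[RowSet]:
    """Idealized rewrite: merge overlapping ranges into disjoint ones."""
    merged: List[RowSet] = []
    for lo, hi in sorted(rowsets):
        if merged and lo <= merged[-1][1]:
            # Overlaps the previous range: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


before = [(0, 50), (10, 40), (30, 90), (95, 99)]
after = merge_overlapping(before)
print(max_overlap(before), max_overlap(after))
```

In this toy model a key such as 35 falls inside three input ranges but only one range after the rewrite, which is the kind of reduction the compaction is aiming for. Nothing here depends on flush_threshold_mb.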

Why RowSet size is much smaller than flush_threshold_mb

2018-06-15 Thread Quanlong Huang
Hi all, I'm running kudu 1.6.0-cdh5.14.2. When looking into the logs of tablet server, I find most of the compactions are compacting small files (~40MB for each). For example:
I0615 07:22:42.637351 30614 tablet.cc:1661] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: