Hi Todd,

This tablet has disappeared from the WAL path. I think it belonged to a time partition that we have already removed.
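For anyone who wants to repeat this kind of check on another tablet, here is a minimal Python sketch. It assumes the on-disk layout seen in the quoted thread (one directory per tablet holding `index.*` and `wal-*` files); the function name `summarize_wal_dir` is made up for illustration and is not part of Kudu. It reports the apparent size (what `du --apparent-size` sums) next to the allocated size (what plain `du` reports), which is where sparse index files show up.

```python
import os
import sys


def summarize_wal_dir(tablet_wal_dir):
    """Count files per kind and sum apparent vs. allocated bytes.

    Apparent size is st_size; allocated size is st_blocks * 512
    (st_blocks is available on Unix-like systems only).
    """
    summary = {}
    for entry in os.scandir(tablet_wal_dir):
        if not entry.is_file():
            continue
        # File-name prefixes follow the listings quoted in this thread.
        if entry.name.startswith("index."):
            kind = "index"
        elif entry.name.startswith("wal-"):
            kind = "segment"
        else:
            kind = "other"
        st = entry.stat()
        stats = summary.setdefault(
            kind, {"files": 0, "apparent": 0, "allocated": 0}
        )
        stats["files"] += 1
        stats["apparent"] += st.st_size
        stats["allocated"] += st.st_blocks * 512
    return summary


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python wal_summary.py <path to one tablet's WAL directory>
    for kind, stats in sorted(summarize_wal_dir(sys.argv[1]).items()):
        print(kind, stats)
```

If the tablet directory is gone, as it is here, `os.scandir` raises `FileNotFoundError`, which is itself the answer to the "did it get cleaned up?" question.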
Thu, Jun 27, 2019 at 08:58, Todd Lipcon <[email protected]>:
> Hey Pavel,
>
> I went back and looked at the source here. It appears that 24MB is the
> expected size for an index file -- each entry is 24 bytes and the index
> file should keep 1M entries.
>
> That said, for a "cold tablet" (in which you'd have only a small number of
> actual WAL files) I would expect only a single index file. The example you
> gave where you have 12 index files but only one WAL segment seems quite
> fishy to me. Having 12 index files indicates you have 12M separate WAL
> entries, but given you have only 8MB of WAL, that indicates each entry is
> less than one byte large, which doesn't make much sense at all.
>
> If you go back and look at that same tablet now, did it eventually GC
> those log index files?
>
> -Todd
>
> On Wed, Jun 19, 2019 at 1:53 AM Pavel Martynov <[email protected]> wrote:
>
>> > Try adding the '-p' flag here? That should show preallocated extents.
>> Would be interesting to run it on some index file which is larger than 1MB,
>> for example.
>>
>> # du -h --apparent-size index.000000108
>> 23M index.000000108
>>
>> # du -h index.000000108
>> 23M index.000000108
>>
>> # xfs_bmap -v -p index.000000108
>> index.000000108:
>> EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET          TOTAL FLAGS
>>   0: [0..2719]:       1175815920..1175818639  2 (3704560..3707279)  2720 00000
>>   1: [2720..5111]:    1175828904..1175831295  2 (3717544..3719935)  2392 00000
>>   2: [5112..7767]:    1175835592..1175838247  2 (3724232..3726887)  2656 00000
>>   3: [7768..10567]:   1175849896..1175852695  2 (3738536..3741335)  2800 00000
>>   4: [10568..15751]:  1175877808..1175882991  2 (3766448..3771631)  5184 00000
>>   5: [15752..18207]:  1175898864..1175901319  2 (3787504..3789959)  2456 00000
>>   6: [18208..20759]:  1175909192..1175911743  2 (3797832..3800383)  2552 00000
>>   7: [20760..23591]:  1175921616..1175924447  2 (3810256..3813087)  2832 00000
>>   8: [23592..26207]:  1175974872..1175977487  2 (3863512..3866127)  2616 00000
>>   9: [26208..28799]:  1175989496..1175992087  2 (3878136..3880727)  2592 00000
>>  10: [28800..31199]:  1175998552..1176000951  2 (3887192..3889591)  2400 00000
>>  11: [31200..33895]:  1176008336..1176011031  2 (3896976..3899671)  2696 00000
>>  12: [33896..36591]:  1176031696..1176034391  2 (3920336..3923031)  2696 00000
>>  13: [36592..39191]:  1176037440..1176040039  2 (3926080..3928679)  2600 00000
>>  14: [39192..41839]:  1176072008..1176074655  2 (3960648..3963295)  2648 00000
>>  15: [41840..44423]:  1176097752..1176100335  2 (3986392..3988975)  2584 00000
>>  16: [44424..46879]:  1176132144..1176134599  2 (4020784..4023239)  2456 00000
>>
>> Wed, Jun 19, 2019 at 10:56, Todd Lipcon <[email protected]>:
>>
>>> On Wed, Jun 19, 2019 at 12:49 AM Pavel Martynov <[email protected]>
>>> wrote:
>>>
>>>> Hi Todd, thanks for the answer!
>>>>
>>>> > Any chance you've done something like copy the files away and back
>>>> that might cause them to lose their sparseness?
>>>>
>>>> No, I don't think so.
>>>> Recently we experienced some stability problems with Kudu and ran
>>>> rebalance a couple of times, if that is related. But we never used fs
>>>> commands like cp/mv against Kudu dirs.
>>>>
>>>> I ran du on the all-WALs dir:
>>>> # du -sh /mnt/data01/kudu-tserver-wal/
>>>> 12G /mnt/data01/kudu-tserver-wal/
>>>>
>>>> # du -sh --apparent-size /mnt/data01/kudu-tserver-wal/
>>>> 25G /mnt/data01/kudu-tserver-wal/
>>>>
>>>> And on the WAL with many indexes:
>>>> # du -sh --apparent-size /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f
>>>> 306M /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f
>>>>
>>>> # du -sh /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f
>>>> 296M /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f
>>>>
>>>>
>>>> > Also, any chance you're using XFS here?
>>>>
>>>> Yes, exactly XFS. We use CentOS 7.6.
>>>>
>>>> What is interesting is that there are not many holes in the index files in
>>>> /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f (the WAL
>>>> dir that I mentioned before). There is only a single hole, in a single
>>>> index file (of 13 files):
>>>> # xfs_bmap -v index.000000120
>>>>
>>>
>>> Try adding the '-p' flag here? That should show preallocated extents.
>>> Would be interesting to run it on some index file which is larger than 1MB,
>>> for example.
>>>
>>>> index.000000120:
>>>> EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET          TOTAL
>>>>   0: [0..4231]:       1176541248..1176545479  2 (4429888..4434119)  4232
>>>>   1: [4232..9815]:    1176546592..1176552175  2 (4435232..4440815)  5584
>>>>   2: [9816..11583]:   1176552832..1176554599  2 (4441472..4443239)  1768
>>>>   3: [11584..13319]:  1176558672..1176560407  2 (4447312..4449047)  1736
>>>>   4: [13320..15239]:  1176565336..1176567255  2 (4453976..4455895)  1920
>>>>   5: [15240..17183]:  1176570776..1176572719  2 (4459416..4461359)  1944
>>>>   6: [17184..18999]:  1176575856..1176577671  2 (4464496..4466311)  1816
>>>>   7: [19000..20927]:  1176593552..1176595479  2 (4482192..4484119)  1928
>>>>   8: [20928..22703]:  1176599128..1176600903  2 (4487768..4489543)  1776
>>>>   9: [22704..24575]:  1176602704..1176604575  2 (4491344..4493215)  1872
>>>>  10: [24576..26495]:  1176611936..1176613855  2 (4500576..4502495)  1920
>>>>  11: [26496..26655]:  1176615040..1176615199  2 (4503680..4503839)   160
>>>>  12: [26656..46879]:  hole                                         20224
>>>>
>>>> But in some other WAL I see something like this:
>>>> # xfs_bmap -v /mnt/data01/kudu-tserver-wal/wals/508ecdfa89054bdb97a02078a91822af/index.000000000
>>>>
>>>> /mnt/data01/kudu-tserver-wal/wals/508ecdfa89054bdb97a02078a91822af/index.000000000:
>>>> EXT: FILE-OFFSET BLOCK-RANGE            AG AG-OFFSET        TOTAL
>>>>   0: [0..7]:     1758753776..1758753783  3 (586736..586743)     8
>>>>   1: [8..46879]: hole                                       46872
>>>>
>>>> It looks like only 8 blocks are actually used and all the other blocks
>>>> are a hole.
>>>>
>>>>
>>>> So it looks like I can use the formulas with confidence.
>>>> Normal case: 8 MB/segment * 80 max segments * 2000 tablets = 1,280,000
>>>> MB = ~1.3 TB (+ some minor index overhead)
>>>> Worse case: 8 MB/segment * 1 segment * 2000 tablets = 16,000 MB =
>>>> ~16 GB (+ some minor index overhead)
>>>>
>>>> Right?
>>>>
>>>>
>>>> Wed, Jun 19, 2019 at 09:35, Todd Lipcon <[email protected]>:
>>>>
>>>>> Hi Pavel,
>>>>>
>>>>> That's not quite expected. For example, on one of our test clusters
>>>>> here, we have about 65GB of WALs and about 1GB of index files. If I recall
>>>>> correctly, the index files store 8 bytes per WAL entry, so typically a
>>>>> couple orders of magnitude smaller than the WALs themselves.
>>>>>
>>>>> One thing is that the index files are sparse. Any chance you've done
>>>>> something like copy the files away and back that might cause them to lose
>>>>> their sparseness? If I use du --apparent-size on mine, it's total of about
>>>>> 180GB vs the 1GB of actual size.
>>>>>
>>>>> Also, any chance you're using XFS here? XFS sometimes likes to
>>>>> preallocate large amounts of data into files while they're open, and only
>>>>> frees it up if disk space is contended. I think you can use 'xfs_bmap' on
>>>>> an index file to see the allocation status, which might be interesting.
>>>>>
>>>>> -Todd
>>>>>
>>>>> On Tue, Jun 18, 2019 at 11:12 PM Pavel Martynov <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi guys!
>>>>>>
>>>>>> We want to buy SSDs for the TServer WALs in our cluster. I'm working on
>>>>>> a capacity estimate for these SSDs using the "Getting Started with Kudu"
>>>>>> book, Chapter 4, Write-Ahead Log (
>>>>>> https://www.oreilly.com/library/view/getting-started-with/9781491980248/ch04.html#idm139738927926240
>>>>>> ).
>>>>>>
>>>>>> NB: we use default Kudu WAL configuration settings.
>>>>>>
>>>>>> There is a formula for the worst case:
>>>>>> 8 MB/segment * 80 max segments * 2000 tablets = 1,280,000 MB = ~1.3 TB
>>>>>>
>>>>>> So, this formula takes into account only segment files. But in our
>>>>>> cluster, I see that every segment file has >= 1 corresponding index
>>>>>> files.
>>>>>> And every index file is actually larger than the segment file.
>>>>>>
>>>>>> Numbers from one of our nodes.
>>>>>> WALs count:
>>>>>> $ ls /mnt/data01/kudu-tserver-wal/wals/ | wc -l
>>>>>> 711
>>>>>>
>>>>>> Overall WAL size:
>>>>>> $ du -d 0 -h /mnt/data01/kudu-tserver-wal/
>>>>>> 13G /mnt/data01/kudu-tserver-wal/
>>>>>>
>>>>>> Size of all segment files:
>>>>>> $ find /mnt/data01/kudu-tserver-wal/ -type f -name 'wal-*' -exec du -ch {} + | grep total$
>>>>>> 6.1G total
>>>>>>
>>>>>> Size of all index files:
>>>>>> $ find /mnt/data01/kudu-tserver-wal/ -type f -name 'index*' -exec du -ch {} + | grep total$
>>>>>> 6.5G total
>>>>>>
>>>>>> So I have questions.
>>>>>>
>>>>>> 1. How can I estimate the size of index files?
>>>>>> It looks like in our cluster the size of the index files is approximately
>>>>>> equal to the size of the segment files.
>>>>>>
>>>>>> 2. There are some WALs with more than one index file. For example:
>>>>>> $ ls -lh /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f/
>>>>>> total 296M
>>>>>> -rw-r--r-- 1 root root  23M Jun 18 21:31 index.000000108
>>>>>> -rw-r--r-- 1 root root  23M Jun 18 21:41 index.000000109
>>>>>> -rw-r--r-- 1 root root  23M Jun 18 21:52 index.000000110
>>>>>> -rw-r--r-- 1 root root  23M Jun 18 22:10 index.000000111
>>>>>> -rw-r--r-- 1 root root  23M Jun 18 22:22 index.000000112
>>>>>> -rw-r--r-- 1 root root  23M Jun 18 22:35 index.000000113
>>>>>> -rw-r--r-- 1 root root  23M Jun 18 22:48 index.000000114
>>>>>> -rw-r--r-- 1 root root  23M Jun 18 23:01 index.000000115
>>>>>> -rw-r--r-- 1 root root  23M Jun 18 23:14 index.000000116
>>>>>> -rw-r--r-- 1 root root  23M Jun 18 23:27 index.000000117
>>>>>> -rw-r--r-- 1 root root  23M Jun 18 23:40 index.000000118
>>>>>> -rw-r--r-- 1 root root  23M Jun 18 23:52 index.000000119
>>>>>> -rw-r--r-- 1 root root  23M Jun 19 01:13 index.000000120
>>>>>> -rw-r--r-- 1 root root 8.0M Jun 19 01:13 wal-000007799
>>>>>>
>>>>>> Is this a normal situation?
>>>>>>
>>>>>> 3. Not a question. Please consider adding documentation about the
>>>>>> estimation of WAL storage.
>>>>>> Also, I couldn't find any mention of index files, except here:
>>>>>> https://kudu.apache.org/docs/scaling_guide.html#file_descriptors.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> --
>>>>>> with best regards, Pavel Martynov
>>>>>
>>>>> --
>>>>> Todd Lipcon
>>>>> Software Engineer, Cloudera

--
with best regards, Pavel Martynov
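The sizing arithmetic discussed in this thread can be sketched in a few lines of Python. This is only a rough sketch, not an official Kudu formula: the function names are invented for illustration, the 8 MB segment size, 80-segment maximum and 2000 tablets come from the book formula quoted above, and the 24 bytes per entry and 1M entries per index file are the figures Todd reports from reading the source. Treating "1M" as 10**6 is an assumption, though it matches the 46880 512-byte blocks (24,002,560 bytes) that xfs_bmap shows for a full index file.

```python
# Rough WAL sizing sketch; all names here are illustrative, not Kudu APIs.

SEGMENT_MB = 8                      # default WAL segment size cited in the thread
INDEX_ENTRY_BYTES = 24              # per Todd's reading of the source
ENTRIES_PER_INDEX_FILE = 1_000_000  # "1M entries", assumed to mean 10**6


def wal_segments_mb(tablets, segments_per_tablet, segment_mb=SEGMENT_MB):
    """Total size of WAL segment files in MB (index files not included)."""
    return tablets * segments_per_tablet * segment_mb


def full_index_file_bytes():
    """Apparent size of one completely filled index file, in bytes."""
    return INDEX_ENTRY_BYTES * ENTRIES_PER_INDEX_FILE


def implied_bytes_per_entry(wal_bytes, index_files):
    """Average WAL entry size implied if every index file were full."""
    return wal_bytes / (index_files * ENTRIES_PER_INDEX_FILE)


if __name__ == "__main__":
    print(wal_segments_mb(2000, 80))  # 1280000 MB, about 1.3 TB
    print(wal_segments_mb(2000, 1))   # 16000 MB, about 16 GB
    print(full_index_file_bytes())    # 24000000 bytes
    # 12 full index files against a single 8 MB segment: well below 1 byte
    # per entry, which is why that tablet looked suspicious.
    print(implied_bytes_per_entry(8 * 1024 * 1024, 12))
```

Because index files are sparse, these numbers bound the apparent size only; the space actually allocated on disk can be much smaller, as the `du` versus `du --apparent-size` figures in the thread show.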
