On 7/20/12 1:23 PM, "Billie J Rinaldi" <[email protected]> wrote:
>
>One thing you should think about is making it so that you only have one
>file per tablet, i.e. that you create a new split point for every new
>file that you import. This should be doable if your files are pretty
>large and you don't end up having too many tablets. If there is only one
>file per tablet, it won't compact unless you tell it to.

Awesome...that's exactly the case...I'll have one file per tablet, and all
the files should be more-or-less the same size (within 10% or so), on the
order of a gigabyte each. Thanks for the split point tip...I hadn't thought
of that. This should do exactly what I want.

Thanks!

Ed

>
>If you want to have multiple files per tablet, there are a number of
>parameters you should think about. However, you should make sure that
>you don't have too many files per tablet because 1) query performance
>will suffer and 2) there is a limit to the number of files that a tablet
>server will open. The limit to open files is adjustable. For scan, it
>defaults to 100 files for all the tablets, and for major compaction it
>defaults to 10 files per tablet (but the compaction can be performed in
>stages).
>
>To change the compaction criteria, adjust table.file.max and
>table.compaction.major.ratio. table.file.max is the maximum number of
>files that a tablet can have. If a tablet has more files than this, it
>will compact. table.compaction.major.ratio governs when compaction
>occurs when a tablet has fewer files than the maximum. It also governs
>which files are compacted together in either case. Raising the ratio
>will make compactions happen less. If table.file.max is larger than the
>number of files you expect to have per tablet, setting
>table.compaction.major.ratio to the same value as table.file.max should
>keep it from compacting unless there is high variation in your file
>sizes. A set of files is compacted into a single file if the size of the
>largest file times the ratio is <= the sum of the sizes of the files.
>
>Billie
>
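
For anyone who wants to see the ratio rule at the end of Billie's message in concrete terms, here is a minimal Python sketch. It only checks the stated inequality (largest file size times the ratio <= sum of the file sizes) for a single candidate set of files. The function name `should_compact`, the sample sizes, and the ratio values are all made up for illustration; this is not Accumulo's actual code and it ignores table.file.max and how the tablet server picks which files to consider.

```python
def should_compact(file_sizes, ratio):
    """Check the criterion quoted above for one candidate set of files:
    largest file size * ratio <= sum of all file sizes.

    file_sizes: iterable of sizes in any consistent unit (hypothetical input)
    ratio: stand-in for table.compaction.major.ratio
    """
    sizes = list(file_sizes)
    if not sizes:
        # Nothing to compact (assumption made for this sketch).
        return False
    return max(sizes) * ratio <= sum(sizes)


# Five files of roughly 1 GB each (sizes in MB, within about 10% of each other).
sizes_mb = [1000, 950, 1050, 1100, 900]

# With a ratio of 3: 1100 * 3 = 3300 <= 5000, so the criterion is met.
low_ratio_result = should_compact(sizes_mb, 3)

# With a ratio of 5: 1100 * 5 = 5500 > 5000, so the criterion is not met.
high_ratio_result = should_compact(sizes_mb, 5)
```

With similarly sized files, raising the ratio toward the number of files makes the inequality harder to satisfy, which matches the point above that raising the ratio makes compactions happen less.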
