Great! This is exactly what I needed. Thank you for taking the time to respond.

Ben
Timber.io - Blog - Github - Twitter
On Thu, Jan 19, 2017 5:57 PM, Gopal Vijayaraghavan [email protected] wrote:

> 1. If we merge these files using the above query, will the ordering be preserved?

The merge does not re-compress, but it will concatenate files (literally) & write a new footer index. So most characteristics of the original ordering will be maintained within each stripe, though the file-level indexes will change.

> 2. And will the file be structured in a way that it will take full advantage of the indexes?

Yes, well, as much as the original files did. The row-group index is not affected at all and will work exactly as before & the same goes for stripe elimination via PPD.

> 3. Or are we better served transitioning the data through a traditional INSERT query from a second table?

Not always, particularly if you're getting good performance right now. You can get much better compression with a reinsert, but that might be a disadvantage in some cases (since a compressed stripe forms an indivisible split, reducing the number of stripes might reduce the number of splits). More compression == fewer CPUs to process the same data. So that's a trade-off.

Cheers,
Gopal
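
To make the trade-off in answers 1 and 3 concrete, here is a rough toy model in Python. It is only a sketch, not how the ORC writer or the merge is actually implemented: a "file" is just a list of stripes, a stripe is just a list of rows, and the names `concatenate`, `reinsert`, `max_parallel_splits` and `rows_per_stripe` are all made up for illustration. It assumes one stripe maps to one split, which is a simplification of the point above.

```python
from typing import List

Stripe = List[int]      # rows of one stripe, in stored order (toy model)
OrcFile = List[Stripe]  # a file modeled as an ordered list of stripes


def concatenate(files: List[OrcFile]) -> OrcFile:
    """Toy stripe-level merge: stripes are copied unchanged, in file order."""
    return [list(stripe) for f in files for stripe in f]


def reinsert(files: List[OrcFile], rows_per_stripe: int) -> OrcFile:
    """Toy rewrite: rows are read back and repacked into new stripes.

    rows_per_stripe is a hypothetical knob standing in for whatever
    decides stripe size in a real writer.
    """
    rows = [row for f in files for stripe in f for row in stripe]
    return [rows[i:i + rows_per_stripe]
            for i in range(0, len(rows), rows_per_stripe)]


def max_parallel_splits(orc_file: OrcFile) -> int:
    """Assumption for this sketch: each stripe is one indivisible split."""
    return len(orc_file)


file_a: OrcFile = [[1, 2, 3], [4, 5, 6]]
file_b: OrcFile = [[2, 3, 4]]

merged = concatenate([file_a, file_b])
rewritten = reinsert([file_a, file_b], rows_per_stripe=9)

print("merged stripes:   ", merged, "splits:", max_parallel_splits(merged))
print("rewritten stripes:", rewritten, "splits:", max_parallel_splits(rewritten))
```

In this model the merged file keeps all three original stripes with their row order intact, so it can still be processed as three splits, while the rewritten file packs the same rows into a single larger stripe and offers only one.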
