On Mon, Jul 2, 2018 at 4:36 AM, Sergejs Andrejevs <[email protected]> wrote:
> Hi there,
>
> I am very glad Kudu is evolving so rapidly. Thanks for your contributions!
>
> I have faced a challenge: I need to upgrade (reinstall) prod servers,
> where 3 Kudu masters are running. What would be the best way to do it
> from a Kudu perspective?

Will you be changing the hostnames of the servers, or just reinstalling the
OS while otherwise keeping the same configuration? Have you partitioned the
servers with separate OS and data disks? If so, you can likely just
reinstall the OS without reformatting the data disks. Once the OS has been
reinstalled, simply install Kudu again, use the same configuration as
before to point at the existing data directories, and everything should be
fine.

> If it is not officially supported yet, could you advise a way which
> minimizes the risks?
>
> Environment/conditions:
> - Cloudera 5.14
> - Kudu 1.6
> - High-level procedure: remove 1 server from the cluster, upgrade it,
>   return it to the CM cluster, check, then proceed with the next server.
> - Some downtime is possible (let's say < 1h)

I can't give any particular advice on how this might interact with Cloudera
Manager; the Cloudera community forum is probably a more appropriate spot
for that. But from a Kudu-only perspective, it should be fine to have a
mixed-OS cluster where one master has been upgraded before the others. Just
keep the data around when you reinstall.

> Approach:
> On a test cluster I have already tried out the steps which were used to
> migrate from a single-master to a multi-master cluster (see the plan
> below). However, there was a remark not to use that procedure to add new
> nodes to a cluster of 3+ masters. So what could be an alternative way?
> If there are no alternatives, what extra checks should I pay attention
> to, to confirm the Kudu cluster is in good shape?
> Any comments/suggestions are extremely appreciated as well.
>
> Current plan:
>
> 0. Cluster check.
> 1. Stop all masters (let's call them master-1, master-2, master-3).
> 2. Remove one Kudu master from CM, e.g. master-3.
> 3. Update the Raft metadata by removing master-3 from the Kudu cluster
>    (to be able to restart Kudu):
>
>      sudo -u kudu kudu local_replica cmeta rewrite_raft_config \
>        00000000000000000000000000000000 \
>        1234567890:master-1:7051 0987654321:master-2:7051
>
>    By the way, do I understand right that tablet_id
>    00000000000000000000000000000000 is a special one, containing the
>    cluster meta info?
> 4. Start all masters. From now on, Kudu temporarily consists of 2 masters.

Why bother removing the master that's down? If you can keep its data
around, and it will come back with the same hostname, there's no need to
remove it. You could simply shut down the node and run with 2/3 servers up,
which gives you the same reliability as running 2/2, without the extra
steps.

> 5. Cluster check.
> 6. Upgrade the excluded server.
> 7. Stop all masters.
> 8. Prepare master-3 as a Kudu master:
>
>      sudo -u kudu kudu fs format --fs_wal_dir=… --fs_data_dirs=…
>      sudo -u kudu kudu fs dump uuid --fs_wal_dir=… --fs_data_dirs=… 2>/dev/null
>
>    Let's say the obtained id is 7777777777.
>    Add master-3 to CM.
> 9. Update the metadata on the existing masters, i.e. master-1 and
>    master-2:
>
>      sudo -u kudu kudu local_replica cmeta rewrite_raft_config \
>        00000000000000000000000000000000 \
>        1234567890:master-1:7051 0987654321:master-2:7051 7777777777:master-3:7051
>
> 10. Start one master, e.g. master-1, then copy the current cluster state
>     from master-1 to master-3:
>
>       sudo -u kudu kudu local_replica copy_from_remote --fs_wal_dir=… \
>         --fs_data_dirs=… 00000000000000000000000000000000 1234567890:master-1:7051
>
> 11. Start the remaining Kudu masters: master-2 and master-3.
> 12. Cluster check.
>
> * Optionally, 1 extra node may be added first (to increase the initial
>   number of Kudu masters from 3 to 4, so that after removal of 1 node
>   there is still HA with a quorum of 3 masters).
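For the "cluster check" steps, and for sanity-checking the state after each rewrite_raft_config, the kudu CLI ships two read-only tools: `kudu cluster ksck` and `kudu local_replica cmeta print_replica_uuids`. Below is a minimal sketch that only prints the commands to run; the hostnames, port, and filesystem directories are placeholders carried over from the plan above, not real paths:

```shell
# Placeholder values taken from the plan above; substitute your own.
MASTERS="master-1:7051,master-2:7051,master-3:7051"
WAL_DIR=/var/lib/kudu/master/wal      # assumed layout; use your --fs_wal_dir
DATA_DIRS=/var/lib/kudu/master/data   # assumed layout; use your --fs_data_dirs
SYS_CATALOG=00000000000000000000000000000000  # the masters' system catalog tablet

# 1) Overall health check ("cluster check" in the plan): run from any node.
echo "sudo -u kudu kudu cluster ksck $MASTERS"

# 2) After a cmeta rewrite, confirm every surviving master agrees on the
#    set of peer UUIDs (run on each master host while the masters are stopped):
for host in master-1 master-2; do
  echo "[$host] sudo -u kudu kudu local_replica cmeta print_replica_uuids \
--fs_wal_dir=$WAL_DIR --fs_data_dirs=$DATA_DIRS $SYS_CATALOG"
done
```

The sketch deliberately echoes the commands rather than executing them, since they are host-specific; on a live cluster you would run the printed commands directly and compare the UUID lists across masters.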
>   In this case, steps 7-12 should be repeated, and additionally a
>   HiveMetaStore update should be executed:
>
>     UPDATE hive_meta_store_database.TABLE_PARAMS
>     SET PARAM_VALUE = 'master-1,master-2,master-3'
>     WHERE PARAM_KEY = 'kudu.master_addresses'
>       AND PARAM_VALUE = 'master-1,master-2,master-3,master-4';
>
>   After the upgrades, the master-4 node is to be removed by running
>   steps 1-5.
>
> Thanks!
>
> Best regards,
> Sergejs Andrejevs
>
> Information about how we process personal data
> <http://www.intrum.com/privacy>

-- 
Todd Lipcon
Software Engineer, Cloudera
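As a follow-up to the optional HiveMetaStore UPDATE quoted above: after reverting kudu.master_addresses, it may be worth confirming that no table property still references master-4. A minimal sketch that prints a check query; the database name comes from the UPDATE statement, while the mysql client is an assumption to adjust for your metastore backend:

```shell
# Database name as used in the UPDATE statement above.
HMS_DB=hive_meta_store_database

# An empty result set means no table still lists master-4.
CHECK="SELECT TBL_ID, PARAM_VALUE FROM TABLE_PARAMS \
WHERE PARAM_KEY = 'kudu.master_addresses' AND PARAM_VALUE LIKE '%master-4%';"

# Print the command; run it against your metastore database.
echo "mysql $HMS_DB -e \"$CHECK\""
```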
