How much duplicate code are we talking about? Do these share common code to close/open the regions, or is it a completely different code path? (I know I could look myself, but it's easier to ask)
From a gut reaction it feels cleaner to keep track of the state of the alter in ZK.

________________________________
From: Jean-Daniel Cryans <[email protected]>
To: [email protected]
Sent: Monday, April 2, 2012 4:56 PM
Subject: Pull instant schema updating out?

Hi nifty devs,

After encountering HBASE-5702, I started playing with instant schema updating (HBASE-4213) a bit more and I must say that it's a bit rough, which makes me wonder... should we pull that code out?

We're in this "interesting" situation in 0.94 where we have two different ways to alter tables without disabling them and I don't trust either. I'm pretty sure most of the devs don't even know, without looking at the code, which one takes precedence over the other when both are enabled.

Well, right now hbase.online.schema.update.enable needs to be enabled in order to have hbase.instant.schema.alter.enabled working. If only the former is enabled the master handles the alter, else if both are enabled then it's going to be done via ZK, although the master still keeps track of it.

So the differences between both IIUC:

- "Online schema update" is a rolling close/open of all the regions so that they pick up the new HTD. It's handled by the master and has been in since 0.92. I've used it quite a bit when running other tests and it's ok as long as regions are not splitting and RS are not shutting down. We also enabled it on our clusters here since our regions don't tend to move that much.

- "Instant schema alter" is instant in the sense that all the regions are asked to close from the get-go, but effectively the region servers can only close one region at a time. The state of the alter is kept in ZK and the master has a bunch of watches and logs the progress. It's new since 0.94 and I'm not sure if anyone is using it. I've tested it a bit and at the moment I can say that the MonitoredTasks handling needs to be redone and that it writes way too much information to the log, but the few alters I ran worked... It's just a bit hard to know when they're done.

FWIW we could pull either or both out, but instant schema alter hasn't been in a released version yet, so it's unlikely that removing it will bother anyone, while the other is already in use (like here).

Opinions?

Thanks,

J-D
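
For anyone skimming the thread, the precedence between the two settings described above can be sketched roughly as below. This is only an illustration of the rule as stated in the mail, not the actual master code: the class, enum, and method names are hypothetical, a plain Map stands in for the HBase Configuration, and treating both keys as false when unset (so the table has to be disabled first) is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the flag precedence described in the mail.
public class AlterPathSketch {
    // Hypothetical names for the three possible outcomes.
    enum AlterPath { DISABLE_REQUIRED, MASTER_ROLLING, ZK_INSTANT }

    // The two configuration keys named in the mail.
    static final String ONLINE_KEY = "hbase.online.schema.update.enable";
    static final String INSTANT_KEY = "hbase.instant.schema.alter.enabled";

    static AlterPath choose(Map<String, String> conf) {
        // Assumption: an unset key is treated as false.
        boolean online = Boolean.parseBoolean(conf.getOrDefault(ONLINE_KEY, "false"));
        boolean instant = Boolean.parseBoolean(conf.getOrDefault(INSTANT_KEY, "false"));
        if (!online) {
            // The instant flag has no effect unless online update is enabled.
            return AlterPath.DISABLE_REQUIRED;
        }
        // Both enabled: the alter goes through ZK; only online: the master handles it.
        return instant ? AlterPath.ZK_INSTANT : AlterPath.MASTER_ROLLING;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ONLINE_KEY, "true");
        System.out.println("online only  -> " + choose(conf));
        conf.put(INSTANT_KEY, "true");
        System.out.println("both enabled -> " + choose(conf));
    }
}
```

The point of the sketch is that hbase.instant.schema.alter.enabled on its own selects nothing; it only changes which of the two online paths is taken once hbase.online.schema.update.enable is set.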
