----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6650/#review10764 -----------------------------------------------------------
conf/hive-default.xml.template <https://reviews.apache.org/r/6650/#comment23202> This comment has fallen out of sync with the value given. Please back this change out since it is no longer relevant to this patch. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23204> This comment should be repeated for each of the new methods added to ObjectStore. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23203> Please replace the println calls with LOG.info() calls. Generally speaking the only code that should use System.*.println to print directly to stdout/stderr is stuff that's located in an application's main() method, or in classes that contain a main() method. Also, please change the sentence to use the active tense, i.e. "Executing query: ..." metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23205> Seems like this stuff should only get printed out if the transaction succeeded. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23207> Formatting: please remove these blank lines. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23206> Same deal here. I know that commitTransaction is probably kind of broken, but we should only print this out if committed == true. Also, each of these methods should probably either return a boolean or throw an exception in order to indicate that transaction succeeded or failed. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23209> Doesn't look the 'committed' boolean variable is used. Also, this function should probably return a List<String> of hdfsRoot strings instead of printing them out. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23208> Formatting: '{' goes on the same line as the method parameter list. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23210> Please use a foreach loop instead of an explicit iterator. Same comment applies to the other methods when possible. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23211> s/data nucleus/DataNucleus/ metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23216> s/location/locations/ Also, it would be good to eliminate duplicate entries from the output. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23221> "Update HDFS root location..." instead of "Updates HDFS root location..." metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23219> Please change "HAUpgrade" to something that isn't HA specific. Also, all variable names (with the exception of final statics) should adhere to the camel-case naming convention, including starting with a lowercase letter (which I admit is a bit awkward in this case). metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23220> I'm a little worried that this operation is too generic, and that as a consequence we're probably missing some errors. Every location value stored in MDatabase and MStorageDescriptor should be a valid o.a.h.fs.Path object (http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/Path), which means that it should also be a valid URI (http://docs.oracle.com/javase/6/docs/api/java/net/URI.html?is-external=true). java.net.URI allows you to pick out individual components of the URI, and in this case I think we're mainly interested in updating the "host" (and possibly the "port") part of the URI. So I think metatool should give you option of specifying a scheme, host, and port to match (although only the host is required), as well as replacement values for the those three parts, although only the host is required. Another thing to point out is that all three of these things should be case-insensitive (which can't be said for "path" component of the URI), so everything should get converted to lowercase in ObjectStore before the matching logic is applied. I think what we want is something like the following: metatool --updateLocation [--oldScheme=hdfs] --oldHost=127.0.0.1 [--oldPort=xyz] --newHost=yoyodyne [--newPort=333] etc. Also, can you please add a --dryRun option that dumps logs the changes that would be made to stdout but doesn't actually make any of the changes? This way people can first experiment with the options without making any big mistakes. Finally, it's possible to run Hive on top of other filesystems besides HDFS (S3 is one example), so it's probably best to avoid using the term "HDFS" in this code. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23218> Please remove. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23213> main() should not throw any checked exceptions. Please catch them and print appropriate error messages. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23212> I don't think moving this logic to run() really accomplishes anything. Please move it back into main(). metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23223> It looks like run() has two ways of signaling an error, either by returning a non-zero value or by throwing an exception. Exceptions are semantically much richer than return codes (e.g. you can wrap return codes in exceptions if you really want to) and are the right way of handling error conditions in Java. Generally speaking the only method that should use an error return code is main(). metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23217> Options should start with a lowercase letter, e.g. "listHDFSRoots", "executeJDOQL", etc. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23215> Formatting: else should go on the same line as '}' metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23214> A couple things: * Throwing java.lang.Exception is never the right thing to do. If you have to throw an exception you need to throw a subclass of java.lang.Exception, or the appropriate runtime exception if it's not something that you expect another caller further down the stack will be able to handle. * If the command line args are incorrect or unrecognizable it's considerably more polite to echo the unrecognizable part to stderr and then print the help screen. Right now when I run metatool without an arguments I get an exception. It should print the help info instead. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23222> Please use e.getLocalizedMessage() instead. Also, it's generally a good idea to put the catch() block as close to the thrower as possible. Wrapping parser.parse() with it's own try/catch block would probably be cleaner. - Carl Steinbach On Aug. 25, 2012, 9 p.m., Shreepadma Venugopalan wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/6650/ > ----------------------------------------------------------- > > (Updated Aug. 25, 2012, 9 p.m.) > > > Review request for hive and Carl Steinbach. > > > Description > ------- > > This patch implement hive metatool which, > > * lets admins perform a HA upgrade by patching the location of the NN in > Hive's metastore > * allows JDOQL to be executed against the metastore. > > > This addresses bug HIVE-3056. > https://issues.apache.org/jira/browse/HIVE-3056 > > > Diffs > ----- > > bin/ext/metatool.sh PRE-CREATION > bin/metatool PRE-CREATION > build.xml 6712af9 > conf/hive-default.xml.template cc1565f > eclipse-templates/TestHiveMetaTool.launchtemplate PRE-CREATION > metastore/ivy.xml 3011d2f > metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java > 045b550 > metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java > PRE-CREATION > metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaTool.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/6650/diff/ > > > Testing > ------- > > A new JUnit test - TestHiveMetaTool - has been added to test the various > metatool options. > > > Thanks, > > Shreepadma Venugopalan > >