----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6650/#review10836 -----------------------------------------------------------
metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23341> Please remove this blank line. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23340> These methods don't know for a fact that they're getting called by HiveMetaTool, so I don't think the logging statements should reference HiveMetaTool. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23342> blank line. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23343> blank line. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23344> I think this should be refactored one of two ways: 1) Create a method "Set<MDatabase> getDatabases()" that returns all MDatabase records, and then iterate over the result to extract the locationUris. 2) Execute a JDOQL query that only extracts the locationUri field (see ObjectStore.getDatabases(String pattern) for an example of how to do this). Option (1) is more generic, but option (2) will be more efficient since it avoids extracting unnecessary information. In either case the name and signature of this method should be changed to Set<URI> getDatabaseLocationURIs() metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23345> This logic belongs in HiveMetaTool. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23339> Please be consistent with blank lines. I don't think having one here improves readability. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23348> This logic (or most of it) also needs to get moved up into HiveMetaTool. The tip-off that this is necessary is that we can't print to stdout/stderr here, and printing the dry-run info to the log stream isn't good enough because it requires folks to have logging enabled and redirected to the console in order to actually see it. I also tend to doubt that we want to do the entire update in a single transaction. Most administrators will probably find a best effort approach more useful since it means they can use at least some of their tables after an HA failover even some updates fail to commit. Also, one of the Java coding conventions is to give boolean variables names starting with "is", e.g. "isDryRun". metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23350> Regardless of where this logic goes, I think the method is too long. It needs to get split up into smaller pieces. I recommend splitting it into separate methods for the DB, StorageDescriptor, and SerdeParam updates. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23346> Formatting. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23347> Formatting. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23337> Looks like there's a TAB here. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23351> It looks like if we encounter this exception we're not going to be able to skip over the bad field and continue updating those that follow. Utilities like this should generally try to make a best effort to execute the user's command, and log any errors that they encounter to stderr. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java <https://reviews.apache.org/r/6650/#comment23349> Info like this needs to get dumped to stderr, which reinforces the point that this logic belongs in HiveMetaTool. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23352> You're working around commons-cli instead of with it. This should get split into three separate arguments. If they aren't relevant to the specific action requested by the user (e.g. when a query string is specified) you can either abort with an error or ignore them. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23353> Prefixing everything with "HiveMetaTool" is unnecessary. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23354> Generally speaking when a class has a main() method the main() method should only be responsible for parsing the command line args and using these values to initialize and control an instance of the class via the public methods that the class exposes. In other words, main() should not refer directly to ObjectStore -- it should be handled instead in HiveMetaTool's constructor or in its init() method. The advantage of structuring the code this way is that you end up with a class that can also be used productively by other code. The other reason why you need to structure the code this way is because Java's main method must have a void return type, which precludes you from returning non-zero values on errors. In order to do that you have to call System.exit(), and doing that makes it difficult for other code to cheat and invoke your main() method directly. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23355> blank line. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23356> We should print the help screen here and then exit immediately with a non-zero return value. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23358> Always assume that the user is going to use the command as part of a pipeline. Given that assumption it makes sense to only output the information they requested unless an error occurred. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23357> Under what circumstances would you expect this error to get triggered? metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23359> This does not look right. Please handle this with commons-cli instead. metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23360> This looks weird. updateLocations = objStore.updateFSRootLocations(...) if (isDryRun) { printUpdateLocations(updateLocations); } metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23361> Putting shudown code at the end of a long method like this is risky because it assumes that you'll always make it here. Maybe that's the case today, but you have to assume the next person who modifies this code maintain that invariant. Code in a finally() block always gets executed (ok, there are some exceptions to that rule), so putting your cleanup code in a finally block (even if you don't have any catch blocks) is usually a good idea. metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23362> Wow this looks painful. We really need to make MetaStoreClient easier to use. metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23363> Does anything here actually throw a java.lang.Error? If not then you should catch an Exception instead of a Throwable. metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23364> The test methods that JUnit calls should be declare to throw exceptions. If one is thrown JUnit will automatically log the exception along with additional debugging info. Squelching it here means that your test will always pass. metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaTool.java <https://reviews.apache.org/r/6650/#comment23338> Extra ';' - Carl Steinbach On Aug. 29, 2012, 3:24 a.m., Shreepadma Venugopalan wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/6650/ > ----------------------------------------------------------- > > (Updated Aug. 29, 2012, 3:24 a.m.) > > > Review request for hive and Carl Steinbach. > > > Description > ------- > > This patch implement hive metatool which, > > * lets admins perform a HA upgrade by patching the location of the NN in > Hive's metastore > * allows JDOQL to be executed against the metastore. > > > This addresses bug HIVE-3056. > https://issues.apache.org/jira/browse/HIVE-3056 > > > Diffs > ----- > > bin/ext/metatool.sh PRE-CREATION > bin/metatool PRE-CREATION > build.xml 6712af9 > eclipse-templates/TestHiveMetaTool.launchtemplate PRE-CREATION > metastore/ivy.xml 3011d2f > metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java > 045b550 > metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java > PRE-CREATION > metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaTool.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/6650/diff/ > > > Testing > ------- > > A new JUnit test - TestHiveMetaTool - has been added to test the various > metatool options. > > > Thanks, > > Shreepadma Venugopalan > >