Hi All! We are about to upgrade the Java version of our Hadoop cluster (Hadoop 2.2.0) and would like to ask about your recommendations and experience:
(A) Should we schedule downtime for the whole cluster and upgrade Java everywhere at once (all Hadoop components, e.g. HDFS, YARN, Pig, Hive, Sqoop), or (B) can we do a rolling upgrade to avoid downtime? Option (B) would mean temporarily running some nodes/components on Java 6 talking to nodes/components running on Java 7. According to our initial research, (A) is generally recommended, because some libraries can have incompatible APIs between their Java 6 and Java 7 versions (e.g. Google Guava), but we are curious whether there are any ways to make (B) work. Cheers! Adam
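For context, the kind of rolling procedure we have in mind would look roughly like the sketch below: drain and restart one worker node at a time with the new JVM, while the rest of the cluster keeps serving. All hostnames and paths here are hypothetical (NODES, NEW_JAVA, the sbin and conf locations), and the script assumes passwordless ssh, root access, and that JAVA_HOME is set in hadoop-env.sh — adjust for your layout. This is an operational sketch under those assumptions, not a tested procedure.

```shell
#!/bin/sh
# Hypothetical rolling Java upgrade, one worker node at a time (Hadoop 2.x).
NODES="dn1 dn2 dn3"                    # hypothetical worker hostnames
NEW_JAVA=/usr/lib/jvm/java-7-oracle    # assumed Java 7 install path
HADOOP_SBIN=/usr/lib/hadoop/sbin       # assumed Hadoop sbin location
HADOOP_ENV=/etc/hadoop/conf/hadoop-env.sh  # assumed config location

for node in $NODES; do
  ssh "$node" "
    # stop the worker daemons on this node only
    $HADOOP_SBIN/hadoop-daemon.sh stop datanode
    $HADOOP_SBIN/yarn-daemon.sh stop nodemanager

    # point Hadoop at the new JVM
    sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=$NEW_JAVA|' $HADOOP_ENV

    # bring the daemons back up under Java 7
    $HADOOP_SBIN/hadoop-daemon.sh start datanode
    $HADOOP_SBIN/yarn-daemon.sh start nodemanager
  "
  # crude pause so the DataNode can re-register with the NameNode
  # before the next node goes down; a real run should check
  # 'hdfs dfsadmin -report' instead of sleeping blindly
  sleep 60
done
```

The concern in the question still applies to this sketch: the masters (NameNode, ResourceManager) and client tools would be on the old JVM while workers migrate, so any wire-level or dependency incompatibility between the two Java versions would surface mid-rollout.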
