Hi Hayden,

Most of the recommendations look okay to me. Since there are many changes to be made, I think you should go ahead and create a main JIRA with multiple subtasks addressing all the changes. I am almost sure that you would run into similar issues if you ran other Java-based NoSQL distributions, e.g. HBase or Cassandra, on the IBM JDK. I personally had surprises with API calls related to ordering in my own application a long time ago. Your observations look reasonable to me.
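To make the ordering point concrete, here is a minimal sketch (the class name and the sample keys are made up for illustration, not taken from Accumulo). HashMap does not specify an iteration order, so the order can differ between JVM vendors and versions, while TreeMap always iterates in sorted key order:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; not part of Accumulo.
public class MapOrderingSketch {

    // Returns the keys in whatever order the given map iterates them.
    static List<String> keysInIterationOrder(Map<String, Integer> counts) {
        return new ArrayList<>(counts.keySet());
    }

    // Inserts the same made-up day keys, deliberately out of order.
    static void fill(Map<String, Integer> counts) {
        counts.put("20140619", 3);
        counts.put("20140617", 1);
        counts.put("20140618", 2);
    }

    public static void main(String[] args) {
        Map<String, Integer> hashed = new HashMap<>();
        Map<String, Integer> sorted = new TreeMap<>();
        fill(hashed);
        fill(sorted);

        // HashMap: order is unspecified and may vary across JVMs.
        System.out.println("HashMap: " + keysInIterationOrder(hashed));
        // TreeMap: keys always come back in ascending order.
        System.out.println("TreeMap: " + keysInIterationOrder(sorted));
    }
}
```

A test that asserts on the HashMap output can pass on one JDK and fail on another; asserting on the TreeMap output (or sorting before comparing) avoids that.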
Regards,
Vicky

On Thu, Jun 19, 2014 at 3:47 PM, Hayden Marchant <[email protected]> wrote:
> Hi there,
>
> I have been working on getting Accumulo running on the IBM JDK, in
> preparation for including Accumulo in an upcoming version of BigInsights
> (IBM's Hadoop distribution). I have come across a number of issues, for
> which I have made some local fixes in my own environment. Since I'm a
> newbie in Accumulo, I wanted to make sure that the approach I have taken
> to resolve these issues is aligned with the design intent of Accumulo.
>
> Some of the issues are real defects, and some are instances in which the
> assumption that the Sun/Oracle JDK is the JVM in use is hard-coded into
> the source code.
>
> I have grouped the issues into 2 sections - unit test failures and
> Sun-specific dependencies (though there is some overlap).
>
> 1. Unit test failures - should run consistently no matter which OS, Java
>    vendor/version etc.
>
>    a. org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate
>       This fails on the IBM JRE, since the test asserts the order of
>       elements in a HashMap. It consistently passes on Sun/Oracle and
>       consistently fails on IBM.
>       Proposal: Change ShardedTableDistributionFormatter.countsByDay to a
>       TreeMap.
>
>    b. org.apache.accumulo.core.security.crypto.BlockedIOStreamTest.testGiantWrite
>       This test assumes a max heap of about 1GB. It fails on the IBM JRE,
>       since the default max heap is not specified, and on the IBM JRE the
>       default depends on the OS (see
>       http://www-01.ibm.com/support/knowledgecenter/SSYKE2_6.0.0/com.ibm.java.doc.diagnostics.60/diag/appendixes/defaults.html?lang=en
>       ).
>       Proposal: Add -Xmx1g to the surefire maven plugin reference in the
>       parent maven pom.
>
>    c. Both org.apache.accumulo.core.security.crypto.CryptoTest and
>       org.apache.accumulo.core.file.rfile.RFileTest have lots of failures
>       due to calls to SecureRandom with the Random Number Generator
>       provider hard-coded as Sun.
>       The IBM JRE has its own built-in RNG provider called IBMJCE. There
>       are 2 issues: hard-coded calls to
>       SecureRandom.getInstance(<algo>,"SUN"), and the default value in the
>       Property class, which is also "SUN".
>       Proposal: Add a mechanism to override a default Property value
>       through a System property, via a new annotator in the Property
>       class. The only usage will be by Property.CRYPTO_SECURE_RNG_PROVIDER.
>
> 2. Environment/Configuration
>
>    a. The generated configuration files contain references to GC params
>       that are specific to the Sun JVM. In accumulo-env.sh,
>       ACCUMULO_TSERVER_OPTS contains -XX:NewSize and -XX:MaxNewSize, and
>       ACCUMULO_GENERAL_OPTS uses
>       -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75.
>
>    b. In bin/accumulo, I get a ClassNotFoundException due to the
>       specification of the JAXP Document Builder:
>       -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
>       The Sun implementation of the Document Builder Factory does not
>       exist in the IBM JDK, so a ClassNotFoundException is thrown when
>       running the accumulo script.
>
>    c. MiniAccumuloCluster - in MiniAccumuloClusterImpl, Sun-specific GC
>       params are passed as params to the java process (similar to
>       section a.).
>
>    Single proposal for solving all three issues above:
>    Enhance bootstrap_config.sh with a request to select the Java vendor.
>    Selecting this will set the correct values for the GC params (they
>    differ between IBM and Sun) and control inclusion/omission of the JAXP
>    setting. The MiniAccumuloClusterImpl can read the same env variable
>    that was set in code for the GC params, and use it in the exec command.
>
> So far, my work has been focused on getting unit tests working for all
> Java vendors in a clean manner. I have not yet run intensive testing of
> real clusters following these changes, and would be happy to get pointers
> to what else might need treatment.
>
> I would also like to hear if these changes make sense, and if so, should
> I go ahead and create some JIRAs, and attach my patches for commit
> approval?
>
> Looking forward to hearing feedback!
>
> Regards,
> Hayden Marchant
> Software Architect
> IBM BigInsights, IBM
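
On issue 1c, one way to avoid depending on a vendor-specific RNG provider can be sketched as below. This is not the Property-annotator mechanism proposed in the quoted mail, just a simpler fallback illustration. The class name and the system property name "example.rng.provider" are hypothetical, and "SHA1PRNG" is only an assumed algorithm name; none of these come from Accumulo.

```java
import java.security.NoSuchAlgorithmException;
import java.security.NoSuchProviderException;
import java.security.SecureRandom;

// Hypothetical sketch; not Accumulo code.
public class SecureRandomFallbackSketch {

    // Tries the requested provider first, then any provider that offers the
    // algorithm, and finally the JVM's default SecureRandom.
    static SecureRandom create(String algorithm, String provider) {
        if (provider != null && !provider.isEmpty()) {
            try {
                return SecureRandom.getInstance(algorithm, provider);
            } catch (NoSuchAlgorithmException | NoSuchProviderException e) {
                // Requested provider is not available on this JVM; fall through.
            }
        }
        try {
            return SecureRandom.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            // No provider offers this algorithm; use the platform default.
            return new SecureRandom();
        }
    }

    public static void main(String[] args) {
        // "example.rng.provider" is a made-up property name for this sketch.
        String provider = System.getProperty("example.rng.provider", "SUN");
        SecureRandom rng = create("SHA1PRNG", provider);

        byte[] bytes = new byte[16];
        rng.nextBytes(bytes);
        System.out.println("Provider in use: " + rng.getProvider().getName());
    }
}
```

Running with -Dexample.rng.provider=IBMJCE would request the IBM provider explicitly, while a JVM that lacks the requested provider still gets a working SecureRandom.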
