[ https://issues.apache.org/jira/browse/HBASE-20295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418264#comment-16418264 ]

Michael Jin commented on HBASE-20295:
-------------------------------------

{code:java}
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 1
[ERROR]         at org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:496)
[ERROR]         at org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:443)
[ERROR]         at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:295)
[ERROR]         at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:246)
[ERROR]         at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1124)
[ERROR]         at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:954)
[ERROR]         at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:832)
[ERROR]         at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:137)
[ERROR]         at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
[ERROR]         at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:154)
[ERROR]         at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:146)
[ERROR]         at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:117)
[ERROR]         at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:81)
[ERROR]         at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:56)
[ERROR]         at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
[ERROR]         at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:305)
[ERROR]         at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:192)
[ERROR]         at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:105)
[ERROR]         at org.apache.maven.cli.MavenCli.execute(MavenCli.java:956)
[ERROR]         at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:290)
[ERROR]         at org.apache.maven.cli.MavenCli.main(MavenCli.java:194)
[ERROR]         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[ERROR]         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[ERROR]         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[ERROR]         at java.lang.reflect.Method.invoke(Method.java:498)
[ERROR]         at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
[ERROR]         at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
[ERROR]         at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
[ERROR]         at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
[ERROR] Caused by: org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd /testptch/hbase/hbase-mapreduce && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -enableassertions -Dhbase.build.id=2018-03-28T23:51:24Z -Xmx2800m -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true -jar /testptch/hbase/hbase-mapreduce/target/surefire/surefirebooter8410814171354767862.jar /testptch/hbase/hbase-mapreduce/target/surefire 2018-03-28T23-51-40_553-jvmRun1 surefire7620982142840600885tmp surefire_521935780032859030481tmp
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 1
[ERROR]         at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:686)
[ERROR]         at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:535)
[ERROR]         at org.apache.maven.plugin.surefire.booterclient.ForkStarter.access$700(ForkStarter.java:116)
[ERROR]         at org.apache.maven.plugin.surefire.booterclient.ForkStarter$2.call(ForkStarter.java:431)
[ERROR]         at org.apache.maven.plugin.surefire.booterclient.ForkStarter$2.call(ForkStarter.java:408)
[ERROR]         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[ERROR]         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[ERROR]         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[ERROR]         at java.lang.Thread.run(Thread.java:748)
[ERROR]
{code}
This error does not seem to be caused by the code change; can anyone help?

> TableOutputFormat.checkOutputSpecs throw NullPointerException Exception
> -----------------------------------------------------------------------
>
>                 Key: HBASE-20295
>                 URL: https://issues.apache.org/jira/browse/HBASE-20295
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 1.4.0
>         Environment: Spark 2.2.1, HBase 1.4.0
>            Reporter: Michael Jin
>            Assignee: Michael Jin
>            Priority: Major
>         Attachments: HBASE-20295.branch-1.4.001.patch, 
> HBASE-20295.master.001.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I am using Spark to write data to HBase via the RDD.saveAsNewAPIHadoopDataset 
> function. It works fine with HBase 1.3.1, but after updating my HBase 
> dependency to 1.4.0 in pom.xml it throws a java.lang.NullPointerException, 
> caused by a logic error in the TableOutputFormat.checkOutputSpecs function. 
> Please check the details below.
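> For context, a write path along these lines reproduces the problem. This is a 
> hypothetical sketch, not the original job: the table, family, and qualifier 
> names are made up, and it assumes Spark 2.2.1 plus the HBase 1.4.0 artifacts 
> on the classpath and an hbase-site.xml pointing at a running cluster.
> {code:java}
> import java.util.Collections;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.client.Put;
> import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
> import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
> import org.apache.hadoop.hbase.util.Bytes;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.spark.api.java.JavaPairRDD;
> import org.apache.spark.api.java.JavaSparkContext;
> import scala.Tuple2;
>
> public class HBaseWriteRepro {
>   public static void main(String[] args) throws Exception {
>     JavaSparkContext sc = new JavaSparkContext("local[2]", "hbase-write-repro");
>
>     // TableOutputFormat reads the target table name from the job configuration.
>     Configuration conf = HBaseConfiguration.create();
>     conf.set(TableOutputFormat.OUTPUT_TABLE, "test_table"); // hypothetical table
>
>     Job job = Job.getInstance(conf);
>     job.setOutputFormatClass(TableOutputFormat.class);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(Put.class);
>
>     // One illustrative row.
>     Put put = new Put(Bytes.toBytes("row1"));
>     put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
>     JavaPairRDD<ImmutableBytesWritable, Put> puts = sc.parallelizePairs(
>         Collections.singletonList(
>             new Tuple2<>(new ImmutableBytesWritable(Bytes.toBytes("row1")), put)));
>
>     // Against HBase 1.4.0 this call fails in checkOutputSpecs with the NPE
>     // described below; against 1.3.1 it succeeds.
>     puts.saveAsNewAPIHadoopDataset(job.getConfiguration());
>   }
> }
> {code}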
> First, let's take a look at the SparkHadoopMapReduceWriter.write function in 
> SparkHadoopMapReduceWriter.scala:
> {code:java}
> // SparkHadoopMapReduceWriter.write (org.apache.spark.internal.io.SparkHadoopMapReduceWriter.scala)
> def write[K, V: ClassTag](
>     rdd: RDD[(K, V)],
>     hadoopConf: Configuration): Unit = {
>   // Extract context and configuration from RDD.
>   val sparkContext = rdd.context
>   val stageId = rdd.id
>   val sparkConf = rdd.conf
>   val conf = new SerializableConfiguration(hadoopConf)
>   // Set up a job.
>   val jobTrackerId = SparkHadoopWriterUtils.createJobTrackerID(new Date())
>   val jobAttemptId = new TaskAttemptID(jobTrackerId, stageId, TaskType.MAP, 0, 0)
>   val jobContext = new TaskAttemptContextImpl(conf.value, jobAttemptId)
>   val format = jobContext.getOutputFormatClass
>   if (SparkHadoopWriterUtils.isOutputSpecValidationEnabled(sparkConf)) {
>     // FileOutputFormat ignores the filesystem parameter
>     val jobFormat = format.newInstance
>     jobFormat.checkOutputSpecs(jobContext)
>   }
>   val committer = FileCommitProtocol.instantiate(
>     className = classOf[HadoopMapReduceCommitProtocol].getName,
>     jobId = stageId.toString,
>     outputPath = conf.value.get("mapreduce.output.fileoutputformat.outputdir"),
>     isAppend = false).asInstanceOf[HadoopMapReduceCommitProtocol]
>   committer.setupJob(jobContext)
>   ...
> {code}
> in "write" function if output spec validation is enabled, it will call 
> checkOutputSpec function in TableOutputFormat class, but the job format is 
> simply created by "vall jobFormat = format.newInstance", this will NOT 
> initialize "conf" member variable in TableOutputFormat class, let's continue 
> check checkOutputSpecs function in TableOutputFormat class
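> To see the consequence directly, here is a tiny hypothetical demo (the class 
> name is made up; it only assumes the HBase 1.4.0 mapreduce artifact on the 
> classpath):
> {code:java}
> import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
>
> public class NewInstanceDemo {
>   public static void main(String[] args) throws Exception {
>     // Mirrors Spark's "format.newInstance": reflection invokes only the
>     // no-arg constructor, so setConf(...) is never called on the format.
>     TableOutputFormat<?> format = TableOutputFormat.class.newInstance();
>     System.out.println(format.getConf()); // prints "null"
>   }
> }
> {code}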
>  
> {code:java}
> // TableOutputFormat.checkOutputSpecs (org.apache.hadoop.hbase.mapreduce.TableOutputFormat.java), HBase 1.4.0
> @Override
> public void checkOutputSpecs(JobContext context) throws IOException,
>     InterruptedException {
>   try (Admin admin = ConnectionFactory.createConnection(getConf()).getAdmin()) {
>     TableName tableName = TableName.valueOf(this.conf.get(OUTPUT_TABLE));
>     if (!admin.tableExists(tableName)) {
>       throw new TableNotFoundException("Can't write, table does not exist:" +
>           tableName.getNameAsString());
>     }
>     if (!admin.isTableEnabled(tableName)) {
>       throw new TableNotEnabledException("Can't write, table is not enabled: " +
>           tableName.getNameAsString());
>     }
>   }
> }
> {code}
>  
> "ConnectionFactory.createConnection(getConf())", as mentioned above "conf" 
> class member is not initialized, so getConf() will return null, so in the 
> next UserProvider create instance process, it throw the 
> NullPointException(Please part of stack trace at the end), it is a little 
> confused that, context passed by function parameter is actually been properly 
> constructed, and it contains Configuration object, why context is never used? 
> So I suggest to use below code to partly fix this issue:
>  
> {code:java}
> // Proposed TableOutputFormat.checkOutputSpecs: prefer the context's
> // Configuration, falling back to the member conf.
> @Override
> public void checkOutputSpecs(JobContext context) throws IOException,
>     InterruptedException {
>   Configuration hConf = context.getConfiguration();
>   if (hConf == null) {
>     hConf = this.conf;
>   }
>   try (Admin admin = ConnectionFactory.createConnection(hConf).getAdmin()) {
>     TableName tableName = TableName.valueOf(hConf.get(OUTPUT_TABLE));
>     if (!admin.tableExists(tableName)) {
>       throw new TableNotFoundException("Can't write, table does not exist:" +
>           tableName.getNameAsString());
>     }
>     if (!admin.isTableEnabled(tableName)) {
>       throw new TableNotEnabledException("Can't write, table is not enabled: " +
>           tableName.getNameAsString());
>     }
>   }
> }
> {code}
> In HBase 1.3.1, this issue does not exist because checkOutputSpecs has an 
> empty function body.
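> For reference, the 1.3.1 method is a no-op along these lines (paraphrased 
> from the description above rather than copied from the 1.3.1 sources):
> {code:java}
> // TableOutputFormat.checkOutputSpecs, HBase 1.3.1: the body is empty, so
> // the uninitialized conf member is never dereferenced here.
> @Override
> public void checkOutputSpecs(JobContext context) throws IOException,
>     InterruptedException {
> }
> {code}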
>  
>  
> Part of the stack trace:
> Exception in thread "main" java.lang.NullPointerException
>  at org.apache.hadoop.hbase.security.UserProvider.instantiate(UserProvider.java:122)
>  at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:214)
>  at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
>  at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.checkOutputSpecs(TableOutputFormat.java:177)
>  at org.apache.spark.internal.io.SparkHadoopMapReduceWriter$.write(SparkHadoopMapReduceWriter.scala:76)
>  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1085)
>  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1085)
>  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1085)
>  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>  at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
>  at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1084)
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
