[ https://issues.apache.org/jira/browse/HBASE-20295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418258#comment-16418258 ]

Hadoop QA commented on HBASE-20295:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m  1s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m  8s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 49s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 19m 55s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 59s{color} | {color:red} hbase-mapreduce in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m  9s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 41m 37s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20295 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12916702/HBASE-20295.master.001.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 9a690f291054 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / d8b550fabc |
| maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/12195/artifact/patchprocess/patch-unit-hbase-mapreduce.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12195/testReport/ |
| modules | C: hbase-mapreduce U: hbase-mapreduce |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/12195/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> TableOutputFormat.checkOutputSpecs throws NullPointerException
> ---------------------------------------------------------------
>
>                 Key: HBASE-20295
>                 URL: https://issues.apache.org/jira/browse/HBASE-20295
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 1.4.0
>         Environment: Spark 2.2.1, HBase 1.4.0
>            Reporter: Michael Jin
>            Assignee: Michael Jin
>            Priority: Major
>         Attachments: HBASE-20295.branch-1.4.001.patch, 
> HBASE-20295.master.001.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I am using Spark to write data to HBase through the RDD.saveAsNewAPIHadoopDataset function. It works fine with HBase 1.3.1, but after updating my HBase dependency to 1.4.0 in pom.xml, it throws java.lang.NullPointerException. This is caused by a logic error in the TableOutputFormat.checkOutputSpecs function; details below.
> First, let's take a look at the SparkHadoopMapReduceWriter.write function in SparkHadoopMapReduceWriter.scala:
> {code:java}
> // SparkHadoopMapReduceWriter.write (org.apache.spark.internal.io.SparkHadoopMapReduceWriter.scala)
> def write[K, V: ClassTag](
>     rdd: RDD[(K, V)],
>     hadoopConf: Configuration): Unit = {
>   // Extract context and configuration from RDD.
>   val sparkContext = rdd.context
>   val stageId = rdd.id
>   val sparkConf = rdd.conf
>   val conf = new SerializableConfiguration(hadoopConf)
>   // Set up a job.
>   val jobTrackerId = SparkHadoopWriterUtils.createJobTrackerID(new Date())
>   val jobAttemptId = new TaskAttemptID(jobTrackerId, stageId, TaskType.MAP, 0, 0)
>   val jobContext = new TaskAttemptContextImpl(conf.value, jobAttemptId)
>   val format = jobContext.getOutputFormatClass
>   if (SparkHadoopWriterUtils.isOutputSpecValidationEnabled(sparkConf)) {
>     // FileOutputFormat ignores the filesystem parameter
>     val jobFormat = format.newInstance
>     jobFormat.checkOutputSpecs(jobContext)
>   }
>   val committer = FileCommitProtocol.instantiate(
>     className = classOf[HadoopMapReduceCommitProtocol].getName,
>     jobId = stageId.toString,
>     outputPath = conf.value.get("mapreduce.output.fileoutputformat.outputdir"),
>     isAppend = false).asInstanceOf[HadoopMapReduceCommitProtocol]
>   committer.setupJob(jobContext)
> ...{code}
> in "write" function if output spec validation is enabled, it will call 
> checkOutputSpec function in TableOutputFormat class, but the job format is 
> simply created by "vall jobFormat = format.newInstance", this will NOT 
> initialize "conf" member variable in TableOutputFormat class, let's continue 
> check checkOutputSpecs function in TableOutputFormat class
>  
> {code:java}
> // TableOutputFormat.checkOutputSpecs (org.apache.hadoop.hbase.mapreduce.TableOutputFormat.java) HBASE 1.4.0
> @Override
> public void checkOutputSpecs(JobContext context) throws IOException,
>     InterruptedException {
>   try (Admin admin = ConnectionFactory.createConnection(getConf()).getAdmin()) {
>     TableName tableName = TableName.valueOf(this.conf.get(OUTPUT_TABLE));
>     if (!admin.tableExists(tableName)) {
>       throw new TableNotFoundException("Can't write, table does not exist:" +
>           tableName.getNameAsString());
>     }
>     if (!admin.isTableEnabled(tableName)) {
>       throw new TableNotEnabledException("Can't write, table is not enabled: " +
>           tableName.getNameAsString());
>     }
>   }
> }
> {code}
>  
> "ConnectionFactory.createConnection(getConf())", as mentioned above "conf" 
> class member is not initialized, so getConf() will return null, so in the 
> next UserProvider create instance process, it throw the 
> NullPointException(Please part of stack trace at the end), it is a little 
> confused that, context passed by function parameter is actually been properly 
> constructed, and it contains Configuration object, why context is never used? 
> So I suggest to use below code to partly fix this issue:
>  
> {code:java}
> // Proposed fix: prefer the Configuration carried by the JobContext,
> // falling back to this.conf only when the context has none.
> @Override
> public void checkOutputSpecs(JobContext context) throws IOException,
>     InterruptedException {
>   Configuration hConf = context.getConfiguration();
>   if (hConf == null) {
>     hConf = this.conf;
>   }
>   try (Admin admin = ConnectionFactory.createConnection(hConf).getAdmin()) {
>     TableName tableName = TableName.valueOf(hConf.get(OUTPUT_TABLE));
>     if (!admin.tableExists(tableName)) {
>       throw new TableNotFoundException("Can't write, table does not exist:" +
>           tableName.getNameAsString());
>     }
>     if (!admin.isTableEnabled(tableName)) {
>       throw new TableNotEnabledException("Can't write, table is not enabled: " +
>           tableName.getNameAsString());
>     }
>   }
> }
> {code}
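> For background on why "format.newInstance" leaves the field null: Hadoop normally instantiates Configurable classes through ReflectionUtils.newInstance, which injects the Configuration via setConf, while a bare Class.newInstance bypasses that hook. Below is a minimal, hypothetical stand-in (ConfInitDemo is not HBase or Spark code) illustrating the difference:
> {code:java}
> import org.apache.hadoop.conf.Configurable;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.util.ReflectionUtils;
>
> // Hypothetical stand-in for TableOutputFormat's Configurable behavior.
> public class ConfInitDemo implements Configurable {
>   private Configuration conf; // mirrors TableOutputFormat's "conf" field
>
>   @Override public void setConf(Configuration conf) { this.conf = conf; }
>   @Override public Configuration getConf() { return conf; }
>
>   public static void main(String[] args) throws Exception {
>     // Spark's write() path: bare reflection, setConf() is never called.
>     ConfInitDemo bare = ConfInitDemo.class.newInstance();
>     System.out.println(bare.getConf()); // null -> the NPE reported here
>
>     // Hadoop's usual path: ReflectionUtils injects the Configuration.
>     ConfInitDemo wired =
>         ReflectionUtils.newInstance(ConfInitDemo.class, new Configuration());
>     System.out.println(wired.getConf() != null); // true
>   }
> }
> {code}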
> In HBase 1.3.1 this issue does not exist, because checkOutputSpecs has a blank function body, as sketched below.
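> A sketch of that 1.3.x no-op (based only on the statement above, not checked against the 1.3.1 source):
> {code:java}
> // HBase 1.3.x (per the description above): effectively a no-op, so the
> // uninitialized "conf" field is never dereferenced and no NPE can occur here.
> @Override
> public void checkOutputSpecs(JobContext context) throws IOException,
>     InterruptedException {
> }
> {code}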
>  
>  
> Part of stack trace:
> Exception in thread "main" java.lang.NullPointerException
>  at org.apache.hadoop.hbase.security.UserProvider.instantiate(UserProvider.java:122)
>  at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:214)
>  at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
>  at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.checkOutputSpecs(TableOutputFormat.java:177)
>  at org.apache.spark.internal.io.SparkHadoopMapReduceWriter$.write(SparkHadoopMapReduceWriter.scala:76)
>  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1085)
>  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1085)
>  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1085)
>  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>  at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
>  at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1084)
>  
>  
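> A possible stopgap on the Spark side, assuming Spark's standard spark.hadoop.validateOutputSpecs switch (not verified against this exact Spark build), is to disable output spec validation so checkOutputSpecs is never invoked:
> {code:java}
> import org.apache.spark.SparkConf;
>
> public class WorkaroundConf {
>   public static void main(String[] args) {
>     // Assumed workaround: skip Spark's output-spec validation entirely so
>     // TableOutputFormat.checkOutputSpecs is never called during write().
>     SparkConf sparkConf = new SparkConf()
>         .setAppName("hbase-write")
>         .set("spark.hadoop.validateOutputSpecs", "false");
>     System.out.println(sparkConf.get("spark.hadoop.validateOutputSpecs"));
>   }
> }
> {code}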



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
