[ https://issues.apache.org/jira/browse/HBASE-22769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947242#comment-16947242 ]

Junko Urata edited comment on HBASE-22769 at 10/9/19 4:56 PM:
--------------------------------------------------------------

Hello, I'm hitting this issue too. I can read data from an HBase table fine as 
long as I don't use where() or filter(). As soon as I add a where/filter, I get 
the same exception. Below is the very simple code I used for testing; I didn't 
do anything fancy.
{code:scala}
import org.apache.hadoop.hbase.spark.datasources.HBaseTableCatalog
import spark.implicits._  // for the $"colName" column syntax

val hbaseDF = spark.read
  .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
  .format("org.apache.hadoop.hbase.spark")
  .load()
val lookedUp = hbaseDF.filter($"colName" === "some value")  // this line triggers the exception
{code}
The class in question is JavaBytesEncoder, which lives in the hbase-spark 
connector jar, and that jar is present in my Spark job (executors). The cause 
looks like an InvocationTargetException thrown via Scala reflection, and it is 
happening NOT in the Spark executor, but probably somewhere between ZooKeeper 
and HBase.
 CDH cluster version: 6.1.1
 hbase-spark library version: 2.1.0-cdh6.1.1
 Scala version: 2.11.8
 Spark version: 2.4.3
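
In case it helps others hitting this, below is one possible client-side workaround, assuming the root cause is that the RegionServer cannot load JavaBytesEncoder$ while deserializing the pushed-down filter. The option name "hbase.spark.pushdown.columnfilter" is taken from HBaseSparkConf in the hbase-spark source; treat this as a sketch, not a verified fix.
{code:scala}
// Sketch of a possible workaround: disable SQL filter pushdown so that
// SparkSQLPushDownFilter is never serialized to (and deserialized on)
// the RegionServer. Option name assumed from the hbase-spark source
// (HBaseSparkConf.PUSHDOWN_COLUMN_FILTER).
val hbaseDF = spark.read
  .options(Map(
    HBaseTableCatalog.tableCatalog -> catalog,
    "hbase.spark.pushdown.columnfilter" -> "false"))
  .format("org.apache.hadoop.hbase.spark")
  .load()
val lookedUp = hbaseDF.filter($"colName" === "some value")  // now evaluated in the executor
{code}
The trade-off is that the filter runs in the Spark executors instead of on the RegionServers, so more rows come back over the wire, but nothing on the server side needs to load JavaBytesEncoder$.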


> Runtime Error on join (with filter) when using hbase-spark connector
> --------------------------------------------------------------------
>
>                 Key: HBASE-22769
>                 URL: https://issues.apache.org/jira/browse/HBASE-22769
>             Project: HBase
>          Issue Type: Bug
>          Components: hbase-connectors
>    Affects Versions: connector-1.0.0
>         Environment: Built using maven scala plugin on intellij IDEA with 
> Maven 3.3.9. Ran on Azure HDInsight Spark cluster using Yarn. 
> Spark version: 2.4.0
> Scala version: 2.11.12
> hbase-spark version: 1.0.0
>            Reporter: Noah Banholzer
>            Priority: Blocker
>
> I am attempting to do a left outer join (though any join with a push down 
> filter causes this issue) between a Spark Structured Streaming DataFrame and 
> a DataFrame read from HBase. I get the following stack trace when running a 
> simple spark app that reads from a streaming source and attempts to left 
> outer join with a dataframe read from HBase:
> {noformat}
> 19/07/30 18:30:25 INFO DAGScheduler: ShuffleMapStage 1 (start at SparkAppTest.scala:88) failed in 3.575 s due to Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 10, wn5-edpspa.hnyo2upsdeau1bffc34wwrkgwc.ex.internal.cloudapp.net, executor 2): org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.reflect.InvocationTargetException
>     at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toFilter(ProtobufUtil.java:1609)
>     at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:1154)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:2967)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3301)
>     at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42002)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.GeneratedMethodAccessor15461.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toFilter(ProtobufUtil.java:1605)
>     ... 8 more
> Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/spark/datasources/JavaBytesEncoder$
>     at org.apache.hadoop.hbase.spark.datasources.JavaBytesEncoder.create(JavaBytesEncoder.scala)
>     at org.apache.hadoop.hbase.spark.SparkSQLPushDownFilter.parseFrom(SparkSQLPushDownFilter.java:196)
>     ... 12 more
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>     at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>     at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:359)
>     at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.handleRemoteException(ProtobufUtil.java:347)
>     at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:344)
>     at org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:242)
>     at org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:58)
>     at org.apache.hadoop.hbase.client.RegionServerCallable.call(RegionServerCallable.java:127)
>     at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:387)
>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:361)
>     at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
>     at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
>  
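> A minimal sketch of the failing pattern, with illustrative source and column names rather than my actual code (it assumes the HBase catalog maps "key" and "status" columns):
> {code:scala}
> import org.apache.hadoop.hbase.spark.datasources.HBaseTableCatalog
> import spark.implicits._
>
> // "rate" is a built-in stand-in for the real streaming source.
> val streamDF = spark.readStream.format("rate").load()
>   .select($"value".cast("string").as("key"))
> // Any join whose HBase side carries a push-down filter hits the error.
> val hbaseDF = spark.read
>   .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
>   .format("org.apache.hadoop.hbase.spark")
>   .load()
>   .filter($"status" === "active")  // this filter is pushed down to HBase
> val joined = streamDF.join(hbaseDF, Seq("key"), "left_outer")
> {code}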
> The trace shows a failed attempt to load a class file called 
> "JavaBytesEncoder$.class", resulting in a NoClassDefFoundError. Interestingly, 
> when I unzipped the jar I found that both "JavaBytesEncoder.class" and 
> "JavaBytesEncoder$.class" exist, but the latter is simply an empty file. This 
> might just be me misunderstanding how Java links classes at build time, 
> however.
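> For context on the trailing "$" (a general Scala fact, not anything specific to this connector): "JavaBytesEncoder$" is the class file Scala emits for a companion object, and Java callers reach the singleton through its MODULE$ field, so an empty or unloadable companion class fails exactly this way:
> {code:scala}
> // General illustration; Foo is a made-up name, not connector code.
> // Compiling this object yields two class files:
> //   Foo.class  - static forwarder methods for Java callers
> //   Foo$.class - the actual singleton, seen from Java as Foo$.MODULE$
> object Foo {
>   def create(code: Int): Int = code + 1
> }
> // If Foo$.class is missing or empty on the server classpath, a Java
> // call into Foo (like SparkSQLPushDownFilter.parseFrom calling
> // JavaBytesEncoder.create) throws NoClassDefFoundError: Foo$.
> {code}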


