wang-zhiang opened a new issue, #4455:
URL: https://github.com/apache/incubator-seatunnel/issues/4455

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22bug%22)
 and found no similar issues.
   
   
   ### What happened
   
   I want to import test data from MongoDB into HBase. I have already created the target table in HBase, but the job fails during execution and the error message is not informative. I suspect this is a bug and would appreciate some guidance.
   
   ### SeaTunnel Version
   
   2.1.2
   
   ### SeaTunnel Config
   
   ```conf
   #!/bin/bash
   
   env {
       execution.parallelism = 20
       spark.executor.cores = 1
       spark.executor.memory = "6g"
   }
   
   
   source {
     mongodb {
         readconfig.uri = 
"mongodb://smartpath:[email protected]:27017,192.168.5.102:27017,192.168.5.103:27017/admin"
         readconfig.database = "test2"
         readconfig.collection = ${sqlserver_table}
         readconfig.spark.mongodb.input.partitioner = 
"MongoPaginateBySizePartitioner"
         schema="{\"_id\": \"string\",\"name\": \"string\"}"
         result_table_name = "mongodb_result_table"
     }
   }
   
   
   transform {
   }
   
   sink {
    hbase {
       source_table_name = "mongodb_result_table"
       hbase.zookeeper.quorum = 
"hadoop104:2181,hadoop105:2181,hadoop106:2181,hadoop107:2181,hadoop108:2181,hadoop109:2181,hadoop110:2181"
       catalog ="{\"table\":{ \"namespace\":\"test1\", 
\"name\":\"test66\"},\"rowkey\":\"_id\",\"columns\":{\"_id\":{\"cf\":\"rowkey\",
 \"col\":\"_id\", \"type\":\"string\"},\"name\":{\"cf\":\"info\", 
\"col\":\"name\", \"type\":\"string\"}}}"
       staging_dir = "/hbase/test1/test66/"
       save_mode = "overwrite"
       hbase.bulkload.retries.number = "0"
    }
   }
   ```
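   Not a fix, but the escaped `catalog` string is easy to get subtly wrong. One way to sanity-check it is to parse the same JSON (the string below is the catalog from the config above with the backslash escapes removed — a minimal sketch, not part of the job):

```python
import json

# Catalog from the sink config above, with shell escaping removed.
catalog = (
    '{"table":{ "namespace":"test1", "name":"test66"},'
    '"rowkey":"_id",'
    '"columns":{"_id":{"cf":"rowkey", "col":"_id", "type":"string"},'
    '"name":{"cf":"info", "col":"name", "type":"string"}}}'
)

parsed = json.loads(catalog)
print(parsed["table"])          # {'namespace': 'test1', 'name': 'test66'}
print(list(parsed["columns"]))  # ['_id', 'name']
```

   If `json.loads` raises here, the escaped string in the config is malformed and would likely fail at runtime as well.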
   
   
   ### Running Command
   
   ```shell
   /opt/module/seatunnel-2.1.2/bin/start-seatunnel-spark.sh \
           --master spark://192.168.5.104:7077 \
           --deploy-mode client \
           --config 
/opt/module/seatunnel-2.1.2/script_spark/test/mongo-hbase-test.conf\
           --variable sqlserver_table="copy1"
   ```
   
   
   ### Error Exception
   
   ```log
   2023-03-30 05:48:44,701 INFO storage.BlockManagerInfo: Removed 
broadcast_5_piece0 on hadoop104:41444 in memory (size: 2.8 KB, free: 366.3 MB)
   2023-03-30 05:48:44,724 INFO storage.BlockManagerInfo: Removed 
broadcast_5_piece0 on 192.168.5.107:44381 in memory (size: 2.8 KB, free: 3.0 GB)
   2023-03-30 05:48:44,770 WARN tool.LoadIncrementalHFiles: Attempt to bulk 
load region containing  into table test1:test88 with files [family:info 
path:hdfs://mycluster/hbase/test1/test88/1680169709262/info/d73b5e5892e94c59ab162a55d233f8e2]
 failed.  This is recoverable and they will be retried.
   2023-03-30 05:48:44,777 INFO tool.LoadIncrementalHFiles: Split occurred 
while grouping HFiles, retry attempt 1 with 1 files remaining to group or split
   2023-03-30 05:48:44,778 INFO hfile.CacheConfig: Created cacheConfig: 
CacheConfig:disabled
   2023-03-30 05:48:44,786 INFO tool.LoadIncrementalHFiles: Trying to load 
hfile=hdfs://mycluster/hbase/test1/test88/1680169709262/info/d73b5e5892e94c59ab162a55d233f8e2
 first=Optional[62e8df0cb7020000830054b2] last=Optional[ewfefw]
   2023-03-30 05:48:44,801 WARN tool.LoadIncrementalHFiles: Attempt to bulk 
load region containing  into table test1:test88 with files [family:info 
path:hdfs://mycluster/hbase/test1/test88/1680169709262/info/d73b5e5892e94c59ab162a55d233f8e2]
 failed.  This is recoverable and they will be retried.
   2023-03-30 05:48:44,806 INFO tool.LoadIncrementalHFiles: Split occurred 
while grouping HFiles, retry attempt 2 with 1 files remaining to group or split
   2023-03-30 05:48:44,835 ERROR tool.LoadIncrementalHFiles: 
-------------------------------------------------
   Bulk load aborted with some files not yet loaded:
   -------------------------------------------------
     
hdfs://mycluster/hbase/test1/test88/1680169709262/info/d73b5e5892e94c59ab162a55d233f8e2
   
   2023-03-30 05:48:44,836 INFO client.ConnectionImplementation: Closing master 
protocol: MasterService
   2023-03-30 05:48:44,838 INFO zookeeper.ReadOnlyZKClient: Close zookeeper 
connection 0x44f23927 to 
hadoop104:2181,hadoop105:2181,hadoop106:2181,hadoop107:2181,hadoop108:2181,hadoop109:2181,hadoop110:2181
   2023-03-30 05:48:44,842 INFO zookeeper.ZooKeeper: Session: 0x70052749b6f004c 
closed
   2023-03-30 05:48:44,842 INFO zookeeper.ClientCnxn: EventThread shut down
   2023-03-30 05:48:44,955 ERROR base.Seatunnel: 
   
   
===============================================================================
   
   
   2023-03-30 05:48:44,956 ERROR base.Seatunnel: Fatal Error, 
   
   2023-03-30 05:48:44,956 ERROR base.Seatunnel: Please submit bug report in 
https://github.com/apache/incubator-seatunnel/issues
   
   2023-03-30 05:48:44,956 ERROR base.Seatunnel: Reason:Execute Spark task 
error 
   
   2023-03-30 05:48:44,960 ERROR base.Seatunnel: Exception 
StackTrace:java.lang.RuntimeException: Execute Spark task error
        at 
org.apache.seatunnel.core.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:79)
        at org.apache.seatunnel.core.base.Seatunnel.run(Seatunnel.java:39)
        at 
org.apache.seatunnel.core.spark.SeatunnelSpark.main(SeatunnelSpark.java:32)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:855)
        at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:930)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:939)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   Caused by: java.io.IOException: Retry attempted 2 times without completing, 
bailing out
        at 
org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.performBulkLoad(LoadIncrementalHFiles.java:420)
        at 
org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:343)
        at 
org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:256)
        at org.apache.seatunnel.spark.hbase.sink.Hbase.output(Hbase.scala:132)
        at org.apache.seatunnel.spark.hbase.sink.Hbase.output(Hbase.scala:41)
        at 
org.apache.seatunnel.spark.SparkEnvironment.sinkProcess(SparkEnvironment.java:179)
        at 
org.apache.seatunnel.spark.batch.SparkBatchExecution.start(SparkBatchExecution.java:54)
        at 
org.apache.seatunnel.core.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:76)
        ... 14 more
    
   2023-03-30 05:48:44,960 ERROR base.Seatunnel:
   ```
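   For context on the failure mode: `LoadIncrementalHFiles` assigns each HFile to the region containing the file's first key, and if the file's last key falls past that region's end key (for example because the table split mid-load, or the rowkeys span existing region boundaries), the file must be split and retried; the log above shows it bailing out after 2 attempts. A toy sketch of the containment check (the region boundaries are hypothetical; the first/last keys are taken from the log):

```python
def fits_one_region(first_key, last_key, regions):
    """True if [first_key, last_key] lies inside a single region.

    regions: sorted (start, end) pairs; "" as start means unbounded low,
    "" as end means unbounded high. Hypothetical layout, not read from HBase.
    """
    for start, end in regions:
        in_low = first_key >= start
        in_high = end == "" or last_key < end
        if in_low and in_high:
            return True
    return False

# Keys from the log: first=62e8df0cb7020000830054b2, last=ewfefw.
# Hypothetical two-region table split at "a": the range spans both regions,
# so the HFile would have to be split before it could be bulk loaded.
regions = [("", "a"), ("a", "")]
print(fits_one_region("62e8df0cb7020000830054b2", "ewfefw", regions))  # False
```

   The wide key range in the log (a hex ObjectId-like key through a plain string like `ewfefw`) suggests the rowkeys are not uniformly shaped, which makes spanning region boundaries more likely.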
   
   
   ### Flink or Spark Version
   
   spark2.4
   
   ### Java or Scala Version
   
   1.8
   
   ### Screenshots
   
   The import fails; no screenshot attached.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   

