Hi Leonard,
I want to emphasize that if I define an external table in Hive that reads a specified HDFS location, e.g.
"hive_external_table", as follows:
CREATE EXTERNAL TABLE `hive_external_table`(
`sid` int,
`sname` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'field.delim'=',')
LOCATION
'hdfs://nameservice1:8020/opt/user/hive/warehouse/external_table/s'
then I can indeed query it from Flink SQL:
Flink SQL> select * from hive_external_table ;
+-----+-------------+----------------------+
| +/- | sid | sname |
+-----+-------------+----------------------+
| + | 2 | monica |
| + | 1 | daluo |
+-----+-------------+----------------------+
Received a total of 2 rows
Flink SQL>
The problem now is that Flink SQL cannot use an HBase external table defined in Hive.
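For context, an HBase-backed Hive table like hive_hbase_t1 is typically created with the HBase storage handler, along these lines (the column names, column-family mapping, and table properties below are illustrative, not the exact DDL used here):

-- Illustrative only: a Hive external table backed by HBase via the
-- HBase storage handler; the real columns and mappings may differ.
CREATE EXTERNAL TABLE `hive_hbase_t1`(
`key` string,
`value` string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
'hbase.columns.mapping' = ':key,cf:value')
TBLPROPERTIES (
'hbase.table.name' = 'hbase_t1')

Unlike the HDFS-backed table above, such a table has no regular input format for Flink's Hive connector to instantiate, which appears to be where the query fails.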
Also, is there a plan for when HBaseTableSource will support SupportsFilterPushDown?
The exception log for "select * from hive_hbase_t1" is as follows.
Flink SQL> select * from hive_hbase_t1;
2020-08-28 13:20:19,985 WARN org.apache.hadoop.hive.conf.HiveConf [] - HiveConf of name hive.vectorized.use.checked.expressions does not exist
2020-08-28 13:20:19,985 WARN org.apache.hadoop.hive.conf.HiveConf [] - HiveConf of name hive.strict.checks.no.partition.filter does not exist
2020-08-28 13:20:19,985 WARN org.apache.hadoop.hive.conf.HiveConf [] - HiveConf of name hive.strict.checks.orderby.no.limit does not exist
2020-08-28 13:20:19,985 WARN org.apache.hadoop.hive.conf.HiveConf [] - HiveConf of name hive.vectorized.input.format.excludes does not exist
2020-08-28 13:20:19,986 WARN org.apache.hadoop.hive.conf.HiveConf [] - HiveConf of name hive.strict.checks.bucketing does not exist
[ERROR] Could not execute SQL statement. Reason:
org.apache.flink.runtime.rest.util.RestClientException: [Internal server error., <Exception on server side:
org.apache.flink.runtime.client.JobSubmissionException: Failed to submit job.
	at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$internalSubmitJob$3(Dispatcher.java:344)
	at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836)
	at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:811)
	at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
	at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
	at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not instantiate JobManager.
	at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$6(Dispatcher.java:398)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
	... 6 more
Caused by: org.apache.flink.runtime.JobException: Creating the input splits caused an error: Unable to instantiate the hadoop input format
	at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:272)
	at org.apache.flink.runtime.executiongraph.ExecutionGraph.attachJobGraph(ExecutionGraph.java:814)
	at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:228)
	at org.apache.flink.runtime.scheduler.SchedulerBase.createExecutionGraph(SchedulerBase.java:269)
	at org.apache.flink.runtime.scheduler.SchedulerBase.createAndRestoreExecutionGraph(SchedulerBase.java:242)
	at org.apache.flink.runtime.scheduler.SchedulerBase.<init>(SchedulerBase.java:229)
	at org.apache.flink.runtime.scheduler.DefaultScheduler.<init>(DefaultScheduler.java:119)
	at org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:103)
	at org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:284)
	at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:272)
	at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:98)
	at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:40)
	at org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl.<init>(JobManagerRunnerImpl.java:140)
	at org.apache.flink.runtime.dispatcher.DefaultJobManagerRunnerFactory.createJobManagerRunner(DefaultJobManagerRunnerFactory.java:84)
	at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$6(Dispatcher.java:388)
	... 7 more
Caused by: org.apache.flink.connectors.hive.FlinkHiveException: Unable to instantiate the hadoop input format
	at org.apache.flink.connectors.hive.read.HiveTableInputFormat.createInputSplits(HiveTableInputFormat.java:307)
	at org.apache.flink.connectors.hive.read.HiveTableInputFormat.createInputSplits(HiveTableInputFormat.java:282)
	at org.apache.flink.connectors.hive.read.HiveTableInputFormat.createInputSplits(HiveTableInputFormat.java:66)
	at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:258)
	... 21 more
Caused by: java.lang.NullPointerException
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.flink.connectors.hive.read.HiveTableInputFormat.createInputSplits(HiveTableInputFormat.java:305)
	... 24 more
End of exception on server side>]
Flink SQL>