Hi Till,
we are not using HBase at the moment. We managed to run the job successfully, but it was a pain to find the right combination of dependencies, library shading and HADOOP_CLASSPATH: the problem was the combination of parquet, jaxrs, hadoop and jackson. Moreover, we had to run the cluster with parent-first class loading to make it work.
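
For completeness, the failing sink is the usual DataSet + HadoopOutputFormat wiring. The sketch below is only illustrative, not our real code: it assumes a recent parquet-avro where AvroParquetOutputFormat is generic, and the schema, record building and output path are made up.

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.parquet.avro.AvroParquetOutputFormat;

    public class ParquetSinkSketch {

      // Illustrative Avro schema; the real job uses its own schemas.
      private static final String SCHEMA_JSON =
          "{\"type\":\"record\",\"name\":\"Row\",\"fields\":["
              + "{\"name\":\"id\",\"type\":\"string\"}]}";

      public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // The Hadoop Job only carries the output configuration; it is never
        // submitted as a MapReduce job.
        Job job = Job.getInstance();
        FileOutputFormat.setOutputPath(job, new Path("hdfs:///tmp/parquet-out")); // illustrative path
        AvroParquetOutputFormat.setSchema(job, new Schema.Parser().parse(SCHEMA_JSON));

        // Flink's HadoopOutputFormat wrapper is the object named in the
        // "unread block data" stack trace further down in this thread.
        HadoopOutputFormat<Void, GenericRecord> parquetSink =
            new HadoopOutputFormat<>(new AvroParquetOutputFormat<GenericRecord>(), job);

        DataSet<Tuple2<Void, GenericRecord>> records =
            env.fromElements("a", "b", "c").map(new ToAvroRecord());

        records.output(parquetSink);
        env.execute("parquet write sketch");
      }

      /** Builds one GenericRecord per input string; the schema is parsed on the workers. */
      public static class ToAvroRecord extends RichMapFunction<String, Tuple2<Void, GenericRecord>> {
        private transient Schema schema;

        @Override
        public void open(Configuration parameters) {
          schema = new Schema.Parser().parse(SCHEMA_JSON);
        }

        @Override
        public Tuple2<Void, GenericRecord> map(String id) {
          GenericRecord r = new GenericData.Record(schema);
          r.put("id", id);
          return Tuple2.of(null, r);
        }
      }
    }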
However, we still have the big problem of submitting jobs via the REST API: as I wrote in another thread, it seems there is no way to execute any code after env.execute() when using the REST API (a small sketch of the pattern is at the bottom of this mail).

Best,
Flavio

On Wed, Nov 7, 2018 at 6:15 PM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Flavio,
>
> I haven't seen this problem before. Are you using Flink's HBase connector?
> According to similar problems with Spark, one needs to make sure that the
> hbase jars are on the classpath [1, 2]. If not, then it might be a problem
> with the MR1 version 2.6.0-mr1-cdh5.11.2 which caused problems for CDH 5.2
> [2]. It could also be worthwhile to try it out with the latest CDH version.
>
> [1] https://stackoverflow.com/questions/34901331/spark-hbase-error-java-lang-illegalstateexception-unread-block-data
> [2] https://mapr.com/community/s/question/0D50L00006BIthGSAT/javalangillegalstateexception-unread-block-data-when-running-spark-with-yarn
> [3] https://issues.apache.org/jira/browse/SPARK-1867?focusedCommentId=14322647&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14322647
>
> Cheers,
> Till
>
> On Wed, Nov 7, 2018 at 12:05 PM Flavio Pompermaier <pomperma...@okkam.it> wrote:
>
>> I forgot to mention that I'm using Flink 1.6.2 compiled for Cloudera CDH 5.11.2:
>>
>> /opt/shared/devel/apache-maven-3.3.9/bin/mvn clean install
>>   -Dhadoop.version=2.6.0-cdh5.11.2 -Dhbase.version=1.2.0-cdh5.11.2
>>   -Dhadoop.core.version=2.6.0-mr1-cdh5.11.2 -DskipTests -Pvendor-repos
>>
>> On Wed, Nov 7, 2018 at 11:48 AM Flavio Pompermaier <pomperma...@okkam.it> wrote:
>>
>>> Hi to all,
>>> we tried to upgrade our jobs to Flink 1.6.2 but now we get the following
>>> error (we saw a similar issue with Spark that was caused by different Java
>>> versions on the cluster servers, so we checked them and they are all on
>>> the same version, oracle-8-191):
>>>
>>> Caused by: org.apache.flink.runtime.client.JobExecutionException: Cannot
>>> initialize task 'DataSink (Parquet write: hdfs:/rivela/1/1/0_staging/parquet)':
>>> Deserializing the OutputFormat
>>> (org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat@54a4c7c8)
>>> failed: unread block data
>>> at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:220)
>>> at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:100)
>>> at org.apache.flink.runtime.jobmaster.JobMaster.createExecutionGraph(JobMaster.java:1151)
>>> at org.apache.flink.runtime.jobmaster.JobMaster.createAndRestoreExecutionGraph(JobMaster.java:1131)
>>> at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:294)
>>> at org.apache.flink.runtime.jobmaster.JobManagerRunner.<init>(JobManagerRunner.java:157)
>>> ... 10 more
>>> Caused by: java.lang.Exception: Deserializing the OutputFormat
>>> (org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat@54a4c7c8)
>>> failed: unread block data
>>> at org.apache.flink.runtime.jobgraph.OutputFormatVertex.initializeOnMaster(OutputFormatVertex.java:63)
>>> at org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:216)
>>> ... 15 more
>>> Caused by: java.lang.IllegalStateException: unread block data
>>> at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2783)
>>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1605)
>>> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
>>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
>>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
>>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
>>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
>>> at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:502)
>>> at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:489)
>>> at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:477)
>>> at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:438)
>>> at org.apache.flink.runtime.operators.util.TaskConfig.getStubWrapper(TaskConfig.java:288)
>>> at org.apache.flink.runtime.jobgraph.OutputFormatVertex.initializeOnMaster(OutputFormatVertex.java:60)
>>> ... 16 more
>>>
>>> Has anyone faced this problem before? How can we try to solve it?
>>>
>>> Best,
>>> Flavio
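
To make the env.execute() remark at the top of this mail concrete: the sketch below shows the kind of pattern that works when we submit with the CLI, but whose last lines apparently never run when the jar goes through the REST API / web UI. The output path and the post-processing step are only illustrative.

    import org.apache.flink.api.common.JobExecutionResult;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class PostExecuteSketch {

      public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<String> input = env.fromElements("a", "b", "c");
        input.writeAsText("hdfs:///tmp/post-execute-sketch"); // illustrative path

        // Everything above only builds the plan; execute() actually runs the job.
        JobExecutionResult result = env.execute("post-execute sketch");

        // Submitted with the CLI, these lines run in the client once the job
        // has finished; submitted through the REST API / web UI they are
        // apparently never reached (the limitation described above).
        System.out.println("Job took " + result.getNetRuntime() + " ms");
        // ...e.g. move the staging output to its final location, notify other services, etc.
      }
    }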