This seems strange. When I pull (copyToLocal) the part file to the local FS, it has the same length as reported by the valid-length file. The FileStatus from Hadoop seems to have the wrong length, and this seems to be true for all of these discrepancies. Could it be that the block information did not get updated?
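
In case it is useful, this is roughly the check I am running. It is a minimal sketch: the part-file path below is a placeholder, and I am assuming the default Configuration picks up the cluster settings. It compares FileStatus.getLen() with the number of bytes actually readable from the stream:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class LengthCheck {
        public static void main(String[] args) throws Exception {
            // Placeholder path; substitute the real part file.
            Path part = new Path("hdfs://namenode:8020/path/to/dt=2019-03-07/part-9-0");
            FileSystem fs = FileSystem.get(new Configuration());

            // Length as the NameNode reports it (what FileStatus shows).
            long reported = fs.getFileStatus(part).getLen();

            // Length as actually readable from the DataNodes.
            long readable = 0;
            try (FSDataInputStream in = fs.open(part)) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    readable += n;
                }
            }
            System.out.println("reported=" + reported + " readable=" + readable);
        }
    }

If readable comes out larger than reported, that would point at stale block metadata on the NameNode rather than missing data.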
Either way, I am wondering whether the recovery (the one that does the truncate) needs to account for the length in the valid-length file or the length reported by the FileStatus? A sketch of the cleanup method I have in mind follows the quoted mail below.

On Thu, Mar 7, 2019 at 5:00 PM Vishal Santoshi <vishal.santo...@gmail.com> wrote:

> Hello folks,
> I have Flink 1.7.2 working with Hadoop 2.6, and because there is no
> built-in truncate (in Hadoop 2.6) I am writing a method to clean up
> (truncate) part files based on the length in the valid-length files
> dropped by Flink during restore. I see something very strange:
>
> hadoop fs -cat hdfs://n*********/*******/dt=2019-03-07/_part-9-0.valid-length
>
> *1765887805*
>
> hadoop fs -ls hdfs://nn-crunchy:8020/tmp/kafka-to-hdfs/ls_kraken_events/dt=2019-03-07/part-9-0
>
> -rw-r--r-- 3 root hadoop *1280845815* 2019-03-07 16:00 hdfs://**********/dt=2019-03-07/part-9-0
>
> I see the valid-length file reporting a larger length than the part file itself.
>
> Any clue why that would be the case?
>
> Regards.
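
For reference, this is a minimal sketch of the cleanup method, assuming the valid-length file contains the valid byte count as plain ASCII text (which is what the fs -cat output above shows); the class and method names are mine. Since Hadoop 2.6 has no FileSystem#truncate, it copies the first N valid bytes into a temporary file and swaps it in:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class TruncateToValidLength {
        public static void truncate(FileSystem fs, Path part, Path validLengthFile)
                throws Exception {
            // The valid-length file holds the valid byte count as ASCII text.
            long validLength;
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(fs.open(validLengthFile)))) {
                validLength = Long.parseLong(r.readLine().trim());
            }

            // Copy the first validLength bytes into a temporary file.
            Path tmp = new Path(part.getParent(), part.getName() + ".truncated");
            try (FSDataInputStream in = fs.open(part);
                 OutputStream out = fs.create(tmp, true)) {
                byte[] buf = new byte[8192];
                long remaining = validLength;
                while (remaining > 0) {
                    int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
                    if (n == -1) break; // part file shorter than valid-length
                    out.write(buf, 0, n);
                    remaining -= n;
                }
            }

            // Not atomic: drop the original, then rename the truncated copy.
            fs.delete(part, false);
            fs.rename(tmp, part);
        }
    }

The discrepancy above is exactly why the question matters: if FileStatus is stale but the bytes are there, the copy loop still reads up to validLength, whereas trusting FileStatus.getLen() would make the part file look shorter than the valid length.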