Can you see the pictures here? https://github.com/apache/hudi/issues/3755

Thanks! Let me read that article. I'm trying to create another MOR table with a Bloom index to see if the problem still exists.

On Wed, Oct 6, 2021 at 2:54 PM 管梓越 <[email protected]> wrote:

> Hi JianFeng,
>
> Something seems to be wrong with the image, so I am not able to see it
> on my side. I am happy to share some information about your first
> question.
>
> The name of a base file has the form {fileID}_{writeToken}_{instant}.
> For the write token, the method makeWriteToken in
> org.apache.hudi.common.fs.FSUtils shows how it is generated from three
> pieces of Spark task information. As far as I know, the write token is
> designed to distinguish files in the same file group that were
> generated by different task attempts.
>
> Let me share a scenario. In a Spark compaction job, speculation is
> allowed. Two task attempts try to generate the base file for the same
> file group, so only the file written by the successful task can finally
> be picked up by Hudi. We use the file name returned by the successful
> task to get the one we want. The reconcileAgainstMarkers method in
> class HoodieTable shows how this process works.
>
> I have no idea how this problem occurred; it should not happen with the
> default config and HDFS. I hope this information helps.
>
> By the way, there is a WeChat account that has shared some excellent
> articles in Chinese about Hudi. For those who read Chinese, the
> following article may provide more information. Many thanks to the
> author.
>
> https://mp.weixin.qq.com/s?__biz=MzIyMzQ0NjA0MQ==&mid=2247484306&idx=1&sn=1d853469159a600d82050c17e6a2a075&chksm=e81f56e4df68dff2da417109c4a971aef54f056bc0519558c58e23fe60b90dc6e4f8d7e92774&token=1688466117&lang=zh_CN#rd
>
> On Wed, Oct 6, 2021 at 1:35 PM Jian Feng <[email protected]> wrote:
>
> > When I run DeltaStreamer (version 0.9) to ingest data from Kafka into
> > an HBase-indexed MOR table, after a few commits I hit this error
> > while compaction is running:
> >
> > [image: image.png]
> >
> > In HDFS there is a file that has the same fileId and commit instant
> > but differs in the middle part:
> >
> > hdfs://tl5/projects/data_vite/mysql_ingestion/rti_vite/shopee_item_v4_db__item_v4_tab_newHbase/BR/2021-10/813800cd-1aaf-43ea-829f-4feef4a51cb3-0_19-2672-4427765_*20211006051032*.parquet
> >
> > Below is the content of 20211006051032.commit:
> >
> > [image: image.png]
> >
> > What do 2672-4427765 and 2657-4368242 mean, and how can I fix this
> > error?
> >
> > I tried recreating the table, and it happens again.
> >
> > --
> > *Jian Feng,冯健*
> > Shopee | Engineer | Data Infrastructure

--
*Jian Feng,冯健*
Shopee | Engineer | Data Infrastructure
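
For reference, the base file naming described in the quoted reply ({fileID}_{writeToken}_{instant}) can be illustrated with a small Python sketch. It is not Hudi code: the helper names parse_base_file_name and same_slice_different_attempt are hypothetical, and the reading of the write token as taskPartitionId-stageId-taskAttemptId is an assumption based on FSUtils.makeWriteToken that should be checked against the Hudi version in use.

```python
import re
from typing import NamedTuple


class BaseFileName(NamedTuple):
    file_id: str
    task_partition_id: int  # assumed first component of the write token
    stage_id: int           # assumed second component
    task_attempt_id: int    # assumed third component
    instant_time: str


# Assumed layout: {fileId}_{partition}-{stage}-{attempt}_{instant}.parquet
_BASE_FILE = re.compile(
    r"(?P<file_id>[^_]+)"
    r"_(?P<partition>\d+)-(?P<stage>\d+)-(?P<attempt>\d+)"
    r"_(?P<instant>\d+)\.parquet"
)


def parse_base_file_name(path: str) -> BaseFileName:
    """Split a base file name (or full path) into its assumed parts."""
    name = path.rsplit("/", 1)[-1]  # drop any directory prefix
    match = _BASE_FILE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a recognised base file name: {name!r}")
    return BaseFileName(
        file_id=match["file_id"],
        task_partition_id=int(match["partition"]),
        stage_id=int(match["stage"]),
        task_attempt_id=int(match["attempt"]),
        instant_time=match["instant"],
    )


def same_slice_different_attempt(a: BaseFileName, b: BaseFileName) -> bool:
    """True when two files share fileId and instant but not the write token."""
    same_slice = (a.file_id, a.instant_time) == (b.file_id, b.instant_time)
    return same_slice and a[1:4] != b[1:4]


if __name__ == "__main__":
    print(parse_base_file_name(
        "813800cd-1aaf-43ea-829f-4feef4a51cb3-0_19-2672-4427765"
        "_20211006051032.parquet"
    ))
```

Under that assumption, the 2672-4427765 in the file name above would be the stage ID and task attempt ID of the Spark task that wrote the file, and a second file with the same fileId and instant but a different token would come from a different task attempt.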
