Minji Kim created HBASE-27790:
---------------------------------
Summary: Data is deleted after the compaction and MOB cleaning
operation
Key: HBASE-27790
URL: https://issues.apache.org/jira/browse/HBASE-27790
Project: HBase
Issue Type: Bug
Affects Versions: 2.5.3
Environment: hbase-2.5.3
(revision=d385524561f771dcb402905c2bdcaeb4a8fecbdb)
hadoop 2.10.2 (revision=965fd380006fa78b2315668fbc7eb432e1d8200f)
Reporter: Minji Kim
Attachments: original.png
I'm trying to save image binaries in the MOB column.
The column option is as follows:
{INDEX_BLOCK_ENCODING => 'NONE', MOB_THRESHOLD => '102400', VERSIONS => '1',
KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER',
MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', IN_MEMORY
=> 'false', IS_MOB => 'true', COMPRESSION => 'NONE', BLOCKCACHE => 'true',
BLOCKSIZE => '65536 B (64KB)'}
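For reference, here is a rough sketch of how I understand the threshold above
to apply. It assumes MOB_THRESHOLD is given in bytes and that only values
strictly larger than the threshold are written as MOB cells; the function name
is purely illustrative and is not an HBase API.

```python
# Value taken from the column family descriptor above.
MOB_THRESHOLD_BYTES = 102400


def is_stored_as_mob(value_length_bytes: int,
                     threshold: int = MOB_THRESHOLD_BYTES) -> bool:
    """Assumption: values strictly larger than the threshold become MOB cells."""
    return value_length_bytes > threshold


# The threshold expressed in KiB.
threshold_kib = MOB_THRESHOLD_BYTES / 1024
print(threshold_kib)  # 100.0
```

Under that assumption, an image of exactly 102400 bytes would stay in the
regular store files, and anything larger would go to the MOB directory.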
I set `hbase.master.mob.cleaner.period` to 3600 (1h) and ran
`major_compact` in the shell
to test whether compaction works as I expected.
As a result, StorefileSize became 0 MB without any other operations being done
on the table.
Is it possible that the compaction job deletes all the data from the column?
About 150 TB of data was lost, and the cluster has 2000 region servers. (Data
in the other columns is fine.)
The file count in the MOB directory dropped from 525K to 78K after the
compaction and MOB cleaning.
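To put those numbers in perspective, a quick back-of-the-envelope calculation
(the counts are the rounded figures quoted above, and the per-server value is
only an average that assumes decimal units, 1 TB = 1000 GB):

```python
# Rounded figures from the report; variable names are illustrative only.
files_before = 525_000
files_after = 78_000

removed_files = files_before - files_after
removed_fraction = removed_files / files_before

lost_tb = 150
region_servers = 2000
avg_lost_gb_per_server = lost_tb * 1000 / region_servers

print(removed_files)                     # 447000
print(round(removed_fraction * 100, 1))  # 85.1
print(avg_lost_gb_per_server)            # 75.0
```

So roughly 85% of the MOB files went away, which averages out to about 75 GB
per region server.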
I could not find anything about deleting data in the HBase logs, and it is
unlikely that any other process deleted the data. (I wrote the data to a
newly built HBase cluster.)
I would like to know what could have caused all the data to disappear.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)