If a file is not referenced anywhere in the metadata table, then there's probably a bug, and I can easily imagine it going undetected.
There are brief windows in which a file has been written and is ready to be used, but the tablet server dies before a reference to it is recorded. That would leave an unreferenced file behind. As noted in ACCUMULO-2381 <https://issues.apache.org/jira/browse/ACCUMULO-2381>, the GC should look for these abandoned files periodically. Right now the GC only removes files that are referenced in the metadata table by delete markers.

-Eric

On Wed, May 21, 2014 at 9:00 PM, Dickson, Matt MR <[email protected]> wrote:

> *UNOFFICIAL*
> I've run a scan on HDFS under /accumulo/tables/<table_id> for all rfiles
> older than our ageoff filter on that table. When I then scan for these
> rfiles in the metadata table, most are not listed.
>
> Should all rfiles be referenced in the metadata table? My goal had been
> to get the rowid from the metadata and then force a compaction on that
> range. E.g., for row 4n;234234234 file:/fdi-2342/234234.rf, run a
> compaction for 234234234 to 234234234~
>
> Thanks in advance.
> Matt
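
For reference, the cross-check described in the quoted message (rfiles found in HDFS versus rfiles listed in the metadata table) could be sketched roughly as below. This is only an illustration, not anything Accumulo ships: it assumes both listings have already been exported to plain lists of strings, that a metadata file entry looks like the /fdi-2342/234234.rf qualifier in the example above, and that all files belong to one table. The function names `rfile_key` and `find_unreferenced` and the second sample file are made up for the example.

```python
def rfile_key(path):
    """Reduce a path to '<tablet_dir>/<file>' so that a full HDFS path and a
    metadata file entry for the same rfile compare equal (assumes one table)."""
    parts = path.rstrip("/").split("/")
    return "/".join(parts[-2:])


def find_unreferenced(hdfs_paths, metadata_refs):
    """Return the HDFS paths whose rfile has no matching metadata entry."""
    referenced = {rfile_key(ref) for ref in metadata_refs}
    return sorted(p for p in hdfs_paths if rfile_key(p) not in referenced)


# Hypothetical sample data: the first file mirrors the example in the
# quoted message, the second is invented to show an unreferenced file.
hdfs_paths = [
    "/accumulo/tables/4n/fdi-2342/234234.rf",
    "/accumulo/tables/4n/fdi-2342/999999.rf",
]
metadata_refs = ["/fdi-2342/234234.rf"]

for path in find_unreferenced(hdfs_paths, metadata_refs):
    print(path)
```

Anything this reports is only a candidate. A file written moments ago may simply not have its metadata entry yet, so it would be worth checking file age before treating it as abandoned.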
