Hi Luke,

As I understand it, those messages are produced when MasterGC removes files according to LiveFileTracker::m_live (the file list that gets inserted into METADATA's "Files" column family). The file list is updated on every compaction.

My question is: if GC runs many times between compactions, I would expect more and more 'remove failure' messages, because every file in LiveFileTracker::m_live would be removed more than once, and each removal of an already-deleted file would produce a complaint. But that does NOT seem to be what actually happens. Why?

I think GC is a periodic, routine operation, whereas compaction depends on many factors, such as the size of the CellCache, the number of CellStores, etc.

Thanks.

-- kuer

On Jul 16, 9:17 am, Luke <[email protected]> wrote:
> Did you delete a lot of cells recently (like delete * from sometable)?
>
> The algorithm is quite simple: it looks at the history of the
> Files:<access_group> column in the METADATA table and pushes the
> latest file list into a hash map with a reference count of 1, and the
> file lists of older cells with a reference count of 0. At the end of
> the scan, we delete anything with a reference count of 0. The older
> cells are deleted as well, so they should not show up on the delete
> list in the next scan. I think the reason you see these warnings is that
> AccessGroup.cc:505 (Global::dfs->remove(fname);) deletes cellstores
> that are empty as a result of merge compaction of deleted cells. The
> GC algorithm then tries to remove these files that were already
> deleted, hence the warnings.
>
> I think the whole else clause starting at AccessGroup.cc:501 should be
> removed, letting the GC algorithm do the job. Do you want to give it a
> try to see if it works for you?
>
> __Luke
>
> 2009/7/15 kuer <[email protected]>:
>
> > Thanks, Doug,
> >
> > I will try it.
> > Of course, I have to figure out which files fall into the file list
> > first.
> >
> > thanks
> >
> > -- kuer
> >
> > On Jul 16, 1:02 am, Doug Judd <[email protected]> wrote:
> >> Hi kuer,
> >>
> >> Thanks for the report. According to Luke (who wrote the code), these
> >> messages are innocuous. I agree that the code should be fixed so that it
> >> doesn't try to delete these files that don't exist.
> >> However, I think the code does not ever delete files that are in use.
> >> This code has been in place for about a year and we've never
> >> experienced that kind of problem.
> >>
> >> If you want to take a shot at fixing the non-existent file delete problem,
> >> go for it. We'll pull it in if it looks good.
> >>
> >> - Doug
> >>
> >> On Wed, Jul 15, 2009 at 2:15 AM, kuer <[email protected]> wrote:
> >>
> >> > Hi, all.
> >> >
> >> > After restarting hypertable + kfs on my work boxes, the following
> >> > messages are logged endlessly to Hypertable.Master.log:
> >> >
> >> > 2009-07-15 16:45:42,431 1350199616 Hypertable.Master [WARN] (Master/MasterGc.cc:197) Error removing DFS file: /hypertable/tables/storage_se/agPORT/4F852B7EB462D2CC311753C0/cs12
> >> > 2009-07-15 16:45:42,432 1350199616 Hypertable.Master [WARN] (Master/MasterGc.cc:197) Error removing DFS file: /hypertable/tables/storage_se/agPORT/4F852B7EB462D2CC311753C0/cs14
> >> > 2009-07-15 16:45:42,432 1350199616 Hypertable.Master [WARN] (Master/MasterGc.cc:197) Error removing DFS file: /hypertable/tables/storage_se/default/590FCA21F8FB70BB2A1F56A2/cs3
> >> > 2009-07-15 16:45:42,433 1350199616 Hypertable.Master [WARN] (Master/MasterGc.cc:197) Error removing DFS file: /hypertable/tables/storage_se/default/590FCA21F8FB70BB2A1F56A2/cs4
> >> > ...
> >> > ...
> >> > ...
> >> > 2009-07-15 16:45:42,433 1350199616 Hypertable.Master [WARN] (Master/MasterGc.cc:197) Error removing DFS file: /hypertable/tables/storage_se/default/590FCA21F8FB70BB2A1F56A2/cs8
> >> > 2009-07-15 16:45:42,434 1350199616 Hypertable.Master [WARN] (Master/MasterGc.cc:197) Error removing DFS file: /hypertable/tables/storage_se/default/13C3704B81B2919AE7F6A936/cs12
> >> >
> >> > Then, I check the dfs.log:
> >> >
> >> > 2009-07-15 16:55:44,060 1319373120 kosmosBroker [ERROR] (kosmos/KosmosBroker.cc:273) unlink failed: file='/hypertable/tables/storage_se/agPORT/4F852B7EB462D2CC311753C0/cs10' - No such file or directory
> >> > 2009-07-15 16:55:44,061 1308883264 kosmosBroker [ERROR] (kosmos/KosmosBroker.cc:273) unlink failed: file='/hypertable/tables/storage_se/agPORT/4F852B7EB462D2CC311753C0/cs12' - No such file or directory
> >> > 2009-07-15 16:55:44,061 1130555712 kosmosBroker [ERROR] (kosmos/KosmosBroker.cc:273) unlink failed: file='/hypertable/tables/storage_se/agPORT/4F852B7EB462D2CC311753C0/cs14' - No such file or directory
> >> > 2009-07-15 16:55:44,062 1151535424 kosmosBroker [ERROR] (kosmos/KosmosBroker.cc:273) unlink failed: file='/hypertable/tables/storage_se/default/590FCA21F8FB70BB2A1F56A2/cs3' - No such file or directory
> >> > 2009-07-15 16:55:44,062 1141045568 kosmosBroker [ERROR] (kosmos/KosmosBroker.cc:273) unlink failed: file='/hypertable/tables/storage_se/default/590FCA21F8FB70BB2A1F56A2/cs4' - No such file or directory
> >> > ...
> >> > ...
> >> > ...
> >> > 2009-07-15 16:55:44,063 1162025280 kosmosBroker [ERROR] (kosmos/KosmosBroker.cc:273) unlink failed: file='/hypertable/tables/storage_se/default/590FCA21F8FB70BB2A1F56A2/cs8' - No such file or directory
> >> > 2009-07-15 16:55:44,064 1172515136 kosmosBroker [ERROR] (kosmos/KosmosBroker.cc:273) unlink failed: file='/hypertable/tables/storage_se/default/13C3704B81B2919AE7F6A936/cs12' - No such file or directory
> >> >
> >> > Reading the source code in src/cc/Hypertable/Master/MasterGc.cc:
> >> >
> >> > 182   void
> >> > 183   reap(CountMap &files_map) {
> >> > 184     size_t nf = 0, nf_done = 0, nd = 0, nd_done = 0;
> >> > 185     CountMap dirs_map; // reap empty range directories as well
> >> > 186
> >> > 187     foreach (const CountMap::value_type &v, files_map) {
> >> > 188       if (!v.second) {
> >> > 189         HT_DEBUGF("MasterGc: removing file %s", v.first);
> >> > 190
> >> > 191         if (!m_dryrun) {
> >> > 192           try {
> >> > 193             m_fs->remove(v.first);
> >> > 194             ++nf_done;
> >> > 195           }
> >> > 196           catch (Exception &e) {
> >> > 197             HT_WARNF("%s", e.what());
> >> >       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ THIS LINE IS WHERE THE LOG MESSAGE COMES FROM
> >> > 198           }
> >> > 199         }
> >> > 200         ++nf;
> >> > 201       }
> >> > 202       char *p = strrchr(v.first, '/');
> >> > 203
> >> > 204       if (p) {
> >> > 205         string dir_name(v.first, p - v.first);
> >> > 206         insert_file(dirs_map, dir_name.c_str(), v.second);
> >> > 207       }
> >> > 208     }
> >> > 209     foreach (const CountMap::value_type &v, dirs_map) {
> >> > 210       if (!v.second) {
> >> > 211         HT_DEBUGF("MasterGc: removing directory %s", v.first);
> >> > 212
> >> >
> >> > I found that reap(files_map) just removes the specified files, but does
> >> > NOT record or report the deletion anywhere else. Tracing back,
> >> > `files_map` is built in the scan_metadata() function. Apparently,
> >> > scan_metadata() does NOT check whether a file still exists when
> >> > building the list of files to delete.
> >> >
> >> > Every time the Master runs GC, this reap operation is repeated. I have
> >> > not read all of the source code yet, and I do not understand the
> >> > rationale of scan_metadata(). But reap() does *delete* existing files,
> >> > so there is a risk that reap() will delete, by name, some WORKING
> >> > (IN-USE) file that should be kept alive!
> >> >
> >> > Can anyone help me out?
> >> >
> >> > Thanks a lot.
> >> >
> >> > -- kuer

You received this message because you are subscribed to the Google Groups "Hypertable Development" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/hypertable-dev?hl=en
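
For readers following the thread, the reference-count scan that Luke describes in his reply can be sketched roughly as below. This is only an illustration under stated assumptions, not the MasterGc.cc code: the History type, the in-memory dfs set, and the functions mark, sweep, drop_old_revisions and run_gc are all hypothetical names made up for this sketch. It shows why a file that was already removed elsewhere produces one failed removal, and why the next scan should not list it again once the older cells are gone.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the history of one Files:<access_group> column:
// history[0] is the latest file list, later entries are older cells.
using FileList = std::vector<std::string>;
using History  = std::vector<FileList>;
using CountMap = std::map<std::string, int>;

// Mark: files in the latest list get count 1; files that appear only in
// older cells are recorded with count 0.
void mark(const History &history, CountMap &counts) {
  for (size_t rev = 0; rev < history.size(); ++rev) {
    for (const std::string &f : history[rev]) {
      if (rev == 0)
        counts[f] = 1;        // live file, always wins
      else
        counts.emplace(f, 0); // does not overwrite an existing count of 1
    }
  }
}

// Sweep: try to remove every file whose count is 0 from a toy in-memory
// file system. Returns the names that were already gone, which corresponds
// to the "Error removing DFS file" warnings in the thread.
std::vector<std::string> sweep(const CountMap &counts,
                               std::set<std::string> &dfs) {
  std::vector<std::string> failed;
  for (const auto &kv : counts) {
    if (kv.second != 0) continue;
    if (dfs.erase(kv.first) == 0) failed.push_back(kv.first);
  }
  return failed;
}

// The older cells are deleted after the scan, so only the latest list is
// left for the next GC run.
void drop_old_revisions(History &history) {
  if (history.size() > 1) history.resize(1);
}

// One full GC pass over a single column history.
std::vector<std::string> run_gc(History &history, std::set<std::string> &dfs) {
  CountMap counts;
  mark(history, counts);
  std::vector<std::string> failed = sweep(counts, dfs);
  drop_old_revisions(history);
  assert(history.size() <= 1); // nothing stale remains for the next scan
  return failed;
}
```

With a history whose latest list is {cs3, cs4} and whose older cell lists {cs1, cs2}, and a file system where cs1 has already been removed by some other code path, the first run_gc call reports cs1 as a failed removal and deletes cs2; a second call finds no count-0 entries at all. If warnings keep repeating in a real cluster, that would suggest the older cells are not actually being dropped between scans, which this sketch simply assumes.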
