By "new metadata will be lost" I mean that any subsequent metadata operation on an existing table will be lost until the table's version catches up with the pre-restart version. I don't think any operation can recover it, because impalad only applies an update to its locally cached metadata when the incoming catalog version is larger than the cached one.
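The version check I am describing is, in essence, the following (a simplified, illustrative sketch only, not the actual Impala code; the class and member names here are made up):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of a version-gated metadata cache (NOT the real
// CatalogObjectCache): an incoming object is ignored unless its catalog
// version is strictly larger than the cached one. After catalogd
// restarts, its version counter starts small again, so updates for
// existing tables are silently dropped by this comparison.
class VersionedCache {
  static final class Entry {
    final String name;
    final long catalogVersion;
    Entry(String name, long catalogVersion) {
      this.name = name;
      this.catalogVersion = catalogVersion;
    }
  }

  private final Map<String, Entry> cache = new ConcurrentHashMap<>();

  // Returns true if the incoming entry was accepted into the cache.
  boolean add(Entry incoming) {
    Entry existing = cache.get(incoming.name);
    if (existing != null && existing.catalogVersion >= incoming.catalogVersion) {
      return false;  // post-restart (smaller) versions are dropped here
    }
    cache.put(incoming.name, incoming);
    return true;
  }
}
```

This is why REFRESH or ALTER TABLE cannot help: the restarted catalogd produces versions smaller than what the impalad already has cached.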
I tried some operations that trigger a table metadata reload and generate a new version, such as REFRESH and ALTER TABLE, but the new metadata is always lost until the catalog_version grows past the pre-restart one. For newly created catalog objects (CREATE TABLE, CREATE DATABASE, ...), the metadata is up to date. I think this is a bug: we need to keep the older catalogServiceId_ until the full new metadata (the non-delta update pushed by statestored) has been applied, even if every metadata operation fails with an EXCEPTION during that gap. Perhaps there are better solutions. Thanks a lot.

2017-06-07 0:58 GMT+08:00 Dimitris Tsirogiannis <[email protected]>:

> Hi,
>
> It could be that there is a looming bug here. Can you clarify what "new
> metadata will be lost" means? I suspect that in most cases you can recover
> by running either refresh (if only files were added) or recover partitions
> (if a new partition was dynamically created).
>
> Dimitris
>
> On Tue, Jun 6, 2017 at 5:04 AM, yu feng <[email protected]> wrote:
>
> > Hi impala community:
> >
> > I have been using Impala in our environment. Here is our cluster
> > deployment: 20+ impalad backends, 4 of which act as coordinators, plus
> > one catalogd and one statestored.
> >
> > I ran into a problem where one impalad's metadata is out of sync after
> > a catalogd restart. I found that a DML operation was executing while
> > catalogd was restarting. After analyzing the Impala source code, I
> > reproduced the problem. These are my steps and analysis:
> >
> > 1. Start the Impala cluster.
> > 2. The cluster runs for a long time with lots of metadata operations,
> >    so the current catalogVersion_ is large (say, bigger than 10000).
> > 3. Submit a DML query (such as 'insert into xx partition() select xxx')
> >    to one impalad; the query runs for about one minute.
> > 4. While the query is running, stop catalogd, then start it again just
> >    before the query executes QueryExecState->UpdateCatalog().
> > 5.
> > UpdateCatalog() sends an UpdateCatalog request to catalogd, which
> >    updates the table's metadata and responds with the newest metadata
> >    for the table.
> > 6. After catalogd responds, UpdateCatalog() updates the metadata cached
> >    in the impalad (it calls updateCatalogCache()) and then runs the
> >    following code:
> >
> >     if (!catalogServiceId_.equals(req.getCatalog_service_id())) {
> >       boolean firstRun = catalogServiceId_.equals(INITIAL_CATALOG_SERVICE_ID);
> >       catalogServiceId_ = req.getCatalog_service_id();
> >       if (!firstRun) {
> >         // Throw an exception which will trigger a full topic update request.
> >         throw new CatalogException("Detected catalog service ID change. Aborting " +
> >             "updateCatalog()");
> >       }
> >     }
> >
> >    The service ID here is the newly started catalogd's service ID and
> >    does not equal the impalad's catalogServiceId_, so the function
> >    throws CatalogException and the query fails with EXCEPTION. What is
> >    more, the impalad's catalogServiceId_ is set to the new one.
> >
> > 7. After catalogd starts successfully, it publishes all its metadata to
> >    statestored, which pushes it to the impalad. Because of step 6, the
> >    impalad's catalogServiceId_ already equals the catalogd's service
> >    ID, so no exception is thrown.
> >
> > 8. In the normal sequence, step 7 would throw the CatalogException, set
> >    from_version to 0, and statestored would send the full metadata to
> >    the impalad in the next UpdateState().
> >
> > 9. After all these steps, the impalad is out of sync: every new
> >    metadata operation is lost, because CatalogObjectCache.add()
> >    requires that a 'new item will only be added if it has a larger
> >    catalog version'.
> >
> > Please help confirm whether this analysis is correct. If not, is there
> > any other possible cause of the problem? If it is, then this may be a
> > bug; do you have any suggestions for avoiding it?
> >
> > Thanks a lot.
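To make my suggestion above concrete: keeping the older catalogServiceId_ until the first full (non-delta) topic update has been applied might look roughly like this. This is only a sketch of the idea, not a patch against the real ImpaladCatalog; every name except catalogServiceId_ is hypothetical, and the real code would need proper synchronization with the topic-update path.

```java
// Sketch of the proposed workaround: on a service ID change, remember the
// new ID but keep rejecting piecemeal updates until a full (non-delta)
// topic update from the restarted catalogd has been applied. All names
// other than catalogServiceId_ are hypothetical.
class ServiceIdTracker {
  private String catalogServiceId_ = "old-service-id";
  private String pendingServiceId_ = null;

  // Called from updateCatalog() when a response carries a different
  // service ID. Unlike the current code, catalogServiceId_ is NOT
  // overwritten here, so later delta updates keep failing (and keep
  // requesting a full topic update) until one actually arrives.
  void onServiceIdChange(String newServiceId) {
    pendingServiceId_ = newServiceId;
    throw new IllegalStateException(
        "Detected catalog service ID change; waiting for full topic update");
  }

  // Called once a full (from_version == 0) topic update has been applied;
  // only now does the impalad adopt the restarted catalogd's service ID.
  void onFullTopicUpdateApplied() {
    if (pendingServiceId_ != null) {
      catalogServiceId_ = pendingServiceId_;
      pendingServiceId_ = null;
    }
  }

  String currentServiceId() { return catalogServiceId_; }
}
```

With this shape, the window between steps 6 and 7 can no longer leave the impalad believing it is already in sync with the new catalogd.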
