Hi,

There may well be a bug lurking here. Can you clarify what "new metadata
will be lost" means? I suspect that in most cases you can recover by running
either REFRESH (if only files were added) or ALTER TABLE ... RECOVER
PARTITIONS (if a new partition was dynamically created).

Dimitris

On Tue, Jun 6, 2017 at 5:04 AM, yu feng <[email protected]> wrote:

> Hi Impala community,
>
> I have been using Impala in our environment. Here is our cluster deployment:
> 20+ impalad backends.
> 4 of the impalads act as coordinators.
> One catalogd and one statestored.
>
>
> I encountered a problem where one impalad's metadata went out of sync after
> a catalogd restart. I found that a DML operation was executing while
> catalogd was restarting.
> After analyzing the Impala source code, I reproduced the problem. These are
> my steps and analysis:
>
> 1. Start the Impala cluster.
> 2. Let the cluster run for a long time with many metadata operations, so
> that the current catalogVersion_ is large (for example, greater than 10000).
> 3. Submit a DML query (such as 'insert into xx partition() select xxx') to
> one impalad; the query runs for about 1 minute.
> 4. While the query is running, stop catalogd, then start it again just
> before the query executes QueryExecState->UpdateCatalog().
> 5. UpdateCatalog() sends an UpdateCatalog request to catalogd, which updates
> the table's metadata and responds with the table's newest metadata.
> 6. After catalogd responds, UpdateCatalog() updates the metadata cached in
> the impalad (by calling updateCatalogCache()), and then runs the following
> code:
>
>     if (!catalogServiceId_.equals(req.getCatalog_service_id())) {
>       boolean firstRun = catalogServiceId_.equals(INITIAL_CATALOG_SERVICE_ID);
>       catalogServiceId_ = req.getCatalog_service_id();
>       if (!firstRun) {
>         // Throw an exception which will trigger a full topic update request.
>         throw new CatalogException("Detected catalog service ID change. Aborting " +
>             "updateCatalog()");
>       }
>     }
>
> The service ID in the request belongs to the newly started catalogd and
> does not equal the impalad's catalogServiceId_, so the function throws a
> CatalogException and the query fails with an exception. What is more, the
> impalad's catalogServiceId_ has already been set to the new one.
>
> 7. After catalogd starts successfully, it publishes all metadata to
> statestored, which then pushes it to the impalad. Because of step 6, the
> impalad's catalogServiceId_ already equals the new catalogd's service ID, so
> no exception is thrown.
>
> 8. In the normal case, step 7 would throw the CatalogException and set
> from_version to 0, and statestored would send the full metadata to the
> impalad in the next UpdateState().
>
> 9. After all these steps, the impalad is out of sync. All new metadata
> operations will be lost because CatalogObjectCache.add() requires that a
> 'new item will only be added if it has a larger catalog version'.
>
> Please help confirm whether this analysis is correct. If not, is there
> another possible cause of the problem? If so, it may be a bug; do you have
> any suggestions for avoiding the problem?
>
> Thanks a lot.
>
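
To make the sequence in steps 6 to 9 easier to reason about, here is a minimal Java sketch of the two checks involved. It is a toy model, not Impala's actual code: the class CatalogSyncSketch, its methods, and the service ID strings are all hypothetical, and it assumes that a restarted catalogd hands out catalog versions starting again from a low value, which is what step 2 seems to imply.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the impalad-side state discussed in the thread.
// All names are hypothetical; this is not Impala's implementation.
class CatalogSyncSketch {
    static final String INITIAL_SERVICE_ID = "INITIAL";

    private String catalogServiceId = INITIAL_SERVICE_ID;
    private final Map<String, Long> cachedVersions = new HashMap<>();

    // Mirrors the quoted check. Returns true where the real code would throw
    // CatalogException (the event that triggers a full topic update).
    public boolean serviceIdChanged(String incomingServiceId) {
        if (catalogServiceId.equals(incomingServiceId)) {
            return false;
        }
        boolean firstRun = catalogServiceId.equals(INITIAL_SERVICE_ID);
        // The new ID is recorded before the exception would be thrown.
        catalogServiceId = incomingServiceId;
        return !firstRun;
    }

    // Version-gated add: an item is stored only if its version is larger
    // than the cached one. Returns true if the cache was updated.
    public boolean add(String name, long version) {
        Long existing = cachedVersions.get(name);
        if (existing != null && existing >= version) {
            return false;
        }
        cachedVersions.put(name, version);
        return true;
    }

    // Returns the cached version, or -1 if the object is not cached.
    public long cachedVersion(String name) {
        return cachedVersions.getOrDefault(name, -1L);
    }

    public static void main(String[] args) {
        CatalogSyncSketch impalad = new CatalogSyncSketch();

        // Normal operation against the original catalogd.
        impalad.serviceIdChanged("catalogd-A");
        impalad.add("tbl", 10000L);

        // Step 6: the DML path sees the restarted catalogd first.
        boolean dmlDetected = impalad.serviceIdChanged("catalogd-B");

        // Step 7: the statestore topic update no longer sees a change,
        // so no full topic update is requested.
        boolean topicDetected = impalad.serviceIdChanged("catalogd-B");

        // Step 9: the restarted catalogd's low version is rejected.
        boolean applied = impalad.add("tbl", 5L);

        System.out.println("DML path detected change: " + dmlDetected);
        System.out.println("Topic update detected change: " + topicDetected);
        System.out.println("New metadata applied: " + applied);
        System.out.println("Cached version of tbl: " + impalad.cachedVersion("tbl"));
    }
}
```

In this model the change of service ID is consumed by the DML path, so the later topic update is treated as an ordinary delta and the stale, higher-versioned entry stays in the cache. Whether the real code paths interleave exactly this way still needs to be confirmed against the source.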
