Hi,

I think swapping tables is indeed a common need, and not only for Kudu
tables. For this reason, the workaround was not particularly good in my
opinion, as it was Kudu-specific. Since Impala tables may have different
names than their corresponding tables in Kudu, this could be used to
provide the additional layer of indirection needed for swapping Kudu
tables, but not for other kinds of tables (e.g. Parquet ones).

I think at this point the most widely applicable "layer of indirection" is
using VIEWs, which works for any kind of table. If that is not optimal for
this use case for whatever reason, maybe some more lightweight alternative
could be considered for table aliasing in the future.
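
To illustrate the VIEW approach, here is a minimal sketch. It is only an
analogy: it uses Python's built-in sqlite3 module as a stand-in for Impala,
and the names swap_view, dim, dim_v1 and dim_v2 are hypothetical. SQLite has
no ALTER VIEW, so the sketch drops and recreates the view inside one
transaction; in Impala I would expect a single ALTER VIEW ... AS SELECT
statement to play the same role.

```python
import sqlite3


def swap_view(conn, view_name, new_table):
    """Repoint view_name at new_table.

    Both names are hypothetical and are assumed to be trusted identifiers,
    since they are interpolated directly into the SQL text.
    """
    # The connection is in autocommit mode, so open the transaction manually.
    conn.execute("BEGIN")
    try:
        conn.execute(f"DROP VIEW IF EXISTS {view_name}")
        conn.execute(f"CREATE VIEW {view_name} AS SELECT * FROM {new_table}")
        conn.execute("COMMIT")
    except Exception:
        # Leave the old view in place if anything goes wrong.
        conn.execute("ROLLBACK")
        raise


# isolation_level=None puts the connection in autocommit mode, so the only
# transaction is the explicit BEGIN/COMMIT inside swap_view.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Two versions of the same dimension table (stand-ins for Kudu/Parquet tables).
conn.execute("CREATE TABLE dim_v1 (id INTEGER, name TEXT)")
conn.execute("INSERT INTO dim_v1 VALUES (1, 'old')")
conn.execute("CREATE TABLE dim_v2 (id INTEGER, name TEXT)")
conn.execute("INSERT INTO dim_v2 VALUES (1, 'new')")

# Queries always go through the view "dim", never the versioned tables.
swap_view(conn, "dim", "dim_v1")
before = conn.execute("SELECT name FROM dim").fetchall()

# Load and validate dim_v2 offline, then repoint the view in one step.
swap_view(conn, "dim", "dim_v2")
after = conn.execute("SELECT name FROM dim").fetchall()
```

Queries only ever reference dim, so the swap needs no rename of any
underlying table, and the old table can be dropped once nothing reads it.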

Br,

Zoltan

On Mon, Jul 30, 2018 at 9:07 AM Gabor Kaszab <gaborkas...@cloudera.com>
wrote:

> Thanks Tim for answering this!
>
> One note for the very first mail in this thread:
> https://issues.apache.org/jira/browse/IMPALA-6375 won't fix this issue
> either. It will allow the user to make a managed Kudu table external and to
> modify the underlying kudu.table_name in one step. With the current
> implementation this has to be done in two steps. However, modifying
> kudu.table_name of a managed table still won't be feasible.
>
> Cheers,
> Gabor
>
>
> On Sat, Jul 28, 2018 at 2:56 AM, Boris Tyukin <bo...@boristyukin.com>
> wrote:
>
>> Thanks so much, Tim. I do feel much better now that you've explained the
>> reasons behind it.
>>
>> Using another client makes sense - will check that out. I did see a bunch
>> of methods in Kudu API but was hoping to use Impala all the way.
>>
>> It would be really cool if this Jira got traction, as swapping tables and
>> partitions is a very common technique with Hive and Impala on HDFS to
>> safely move large amounts of data into production tables:
>> https://jira.apache.org/jira/browse/KUDU-2327
>> Support atomic swap of tables or partitions
>>
>>
>>
>>
>> On Fri, Jul 27, 2018 at 7:24 PM Tim Armstrong <tarmstr...@cloudera.com>
>> wrote:
>>
>>> Hi,
>>>   Sorry you ran into this - we don't deliberately want to break
>>> workflows but it can be tricky if we accidentally expose implementation
>>> details. There was a previous CVE that resulted from creative use of this
>>> functionality
>>> https://lists.apache.org/thread.html/74a163df0cdefcd738c8d18821e69aa69eed2ba5384c0cc255d15c4b@%3Cannounce.apache.org%3E
>>> so part of the motivation was to simplify the table states and state
>>> transitions we need to deal with and make it easier to reason about and
>>> test thoroughly.
>>>
>>> We do have a "kudu" label in JIRA if you want to find Kudu-related
>>> Impala JIRAs. It would be unusual for one open-source project to have veto
>>> power over changes in another project or to create duplicate JIRAs in
>>> multiple apache projects for the same work. We do generally work closely
>>> with the Kudu project.
>>>
>>> I think the main workaround would be to use a different Kudu client to
>>> directly drop the tables. AFAIK the intent of external Kudu tables was
>>> generally that they would be dropped externally to Impala - I don't think
>>> we anticipated the "attach then drop" method for
>>>
>>> On Fri, Jul 27, 2018 at 8:27 AM, Boris Tyukin <bo...@boristyukin.com>
>>> wrote:
>>>
>>>> So the change was made by an Impala developer, but it is only relevant to
>>>> Kudu, and it takes away the only way to swap tables.
>>>>
>>>> I am curious whether this change was agreed with the Kudu devs, and whether
>>>> changes like that should be tracked in both Kudu and Impala JIRAs, since
>>>> Impala is the only way right now to work with Kudu besides the APIs, which
>>>> require coding.
>>>>
>>>> Is there someone who chairs this type of decision that impacts Impala /
>>>> Kudu users?
>>>>
>>>> This is important for me to understand before we invest in Kudu.
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jul 27, 2018 at 8:53 AM Boris Tyukin <bo...@boristyukin.com>
>>>> wrote:
>>>>
>>>>> oh no....why?? just why?? we are about to upgrade to 2.12...
>>>>>
>>>>> Todd, can this "improvement" get rolled back? This is a breaking change
>>>>> and does not contribute to making anything better. And now the only good
>>>>> way to swap Kudu tables is gone.
>>>>>
>>>>> I am really frustrated. IMPALA-5654
>>>>> <https://issues.apache.org/jira/browse/IMPALA-5654> should never have been
>>>>> approved without giving users a good alternative.
>>>>>
>>>>> Boris
>>>>>
>>>>> On Fri, Jul 27, 2018 at 7:10 AM Cliff Resnick <cre...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> We sometimes need to replace dimension tables in Kudu in a live
>>>>>> database. The technique is described here:
>>>>>>
>>>>>>
>>>>>> https://boristyukin.com/how-to-hot-swap-apache-kudu-tables-with-apache-impala/
>>>>>>
>>>>>> After 2.12 and IMPALA-5654
>>>>>> <https://issues.apache.org/jira/browse/IMPALA-5654> it seems there
>>>>>> is no longer a way to perform the final step, where the hot swap Kudu
>>>>>> target table is renamed back to the original. It looks like
>>>>>> IMPALA-6375 <https://issues.apache.org/jira/browse/IMPALA-6375> is
>>>>>> going to address this, but in the meantime is there another workaround we
>>>>>> can use?
>>>>>>
