Subject: Re: "broadcast" tablet replication for kudu?
Sorry to revive the old thread, but I'm curious whether there is a better
solution one year later. We have a few small tables (under 300k rows) which are
used with practically every single query and, to make things worse, joined mor
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Impala could definitely be smarter, just a matter of
>>>>>>>>>>>> programming Kudu-specific join strategi
>>>>>>>
>>>>>>>>>>> -Todd
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -Cliff
>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> I thought I had read that the Kudu client can configure a scan
>>>>>>>>>>>>> for CLOSEST_REPLICA and assumed this was a way to take advantage
>>>>
n.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yea, when a client uses CLOSEST_REPLICA it will read a local one
>>>>>>>>>>> if available. However, that doesn
To: "user@kudu.apache.org" <user@kudu.apache.org>
Subject: Re: "broadcast" tablet replication for kudu?
Impala 2.12. The external RPC protocol is still Thrift.
Todd
On Mon, Jul 23, 2018, 7:02 AM Clifford Resnick
mailto:cresn...@mediamath.c
> Date: Monday, July 23, 2018 at 9:46 AM
> To: "user@kudu.apache.org"
> Subject: Re: "broadcast" tablet replication for kudu?
>
> Are you on the latest release of Impala? It switched from using Thrift for
> RPC to a new implementation (actually bor
Date: Monday, July 23, 2018 at 9:46 AM
To: "user@kudu.apache.org" <user@kudu.apache.org>
Subject: Re: "broadcast" tablet replication for kudu?
Are you on the latest release of Impala? It switched from using Thrift for RPC
to a new implementation (actually borro
>>>>>>>>>> strategies. Given
>>>>>>>>>> statistics, it will choose to broadcast the small table, which means
>>>>>>>>>> that it will create a plan that looks like:
>>>>
strategy such as nested-loop join, with the inner
>>>> "loop" actually being a Kudu PK lookup, but that strategy isn't implemented
>>>> by Impala.
>>>>
>>>> -Todd
>>>>
>>>>
>>>>
>>>>> If th
s then how far out of context is my understanding of it?
>>>> Reading about HDFS cache replication, I do know that Impala will choose a
>>>> random replica there to more evenly distribute load. But especially
>>>> compared to Kudu upsert, managing mutable data using Parqu
>>> Reading about HDFS cache replication, I do know that Impala will choose a
>>> random replica there to more evenly distribute load. But especially
>>> compared to Kudu upsert, managing mutable data using Parquet is painful.
>>> So, perhaps to sum things up, if nearly 100% of my
really just
>> splitting hairs performance-wise between Kudu and HDFS-cached parquet?
>>
>> From: Todd Lipcon <t...@cloudera.com>
>> Reply-To: "user@kudu.apache.org" <user@kudu.apache.org>
>> Date: Friday, March 16, 2018 at 2:51 PM
>>
>> To:
Date: Friday, March 16, 2018 at 2:51 PM
To: "user@kudu.apache.org" <user@kudu.apache.org>
Subject: Re: "broadcast" tablet replication for kudu?
It's worth noting that, even if your table is replicated, Impal
you don't
need to worry about odd/even WRT number of tablet servers.
- Dan
>
> From: Dan Burkert <danburk...@apache.org>
> Reply-To: "user@kudu.apache.org" <user@kudu.apache.org>
> Date: Friday, March 16, 2018 at 2:09 PM
> To: "user@kudu.apache.org" <user@kudu.apache.org>
> Subject: Re: "broadcast" tablet replication for kudu?
The replication count is the number of tablet servers which Kudu will host
copies on. So if you set the replication level to 5, Kudu will put the
data on 5 separate tablet servers. There's no built-in broadcast table
feature; upping the replication factor is the closest thing. A couple of
I'm new to Kudu, but we are also going to use Impala mostly with Kudu. We
have a few tables that are small but used a lot. My plan is to replicate them
more than 3 times. When you create a Kudu table, you can specify the number of
replicated copies (3 by default) and I guess you can put there a number,
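
As a rough back-of-the-envelope sketch of what raising the replication
factor buys for a small table: assuming one Impala daemon runs on every
tablet server and the table is a single tablet whose replicas land on
separate servers (as described above), the share of nodes holding a local
copy is just replicas / servers. The function name and the co-location
assumption are illustrative only; nothing here is a Kudu or Impala API.

```python
from fractions import Fraction


def local_replica_fraction(num_tablet_servers: int, replication_factor: int) -> Fraction:
    """Fraction of tablet servers that hold a copy of a single-tablet table.

    Assumptions (hypothetical model, not Kudu's placement logic):
    - each replica lives on a separate tablet server;
    - every tablet server also runs a query daemon that could read locally.
    """
    if num_tablet_servers <= 0 or replication_factor <= 0:
        raise ValueError("both counts must be positive")
    if replication_factor > num_tablet_servers:
        # Replicas go on separate servers, so there cannot be more
        # replicas than servers in this model.
        raise ValueError("replication factor exceeds number of tablet servers")
    return Fraction(replication_factor, num_tablet_servers)


if __name__ == "__main__":
    # Default of 3 copies versus 5 copies on a hypothetical 20-node cluster.
    for replicas in (3, 5):
        share = local_replica_fraction(20, replicas)
        print(f"{replicas} replicas on 20 servers: {float(share):.0%} of nodes have a local copy")
```

Even at 5 copies, most nodes in a cluster of that size would still have to
read the small table over the network, which is why a higher replication
factor is only an approximation of a true broadcast table.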