Hi Julian Jaffe,
Thank you very much. I haven't tried it yet. Can you provide a more specific
example? In theory, adding indexes slows down insert and update operations.
In your scenario, how large is that performance loss, as a percentage?
Yes, for the bottleneck of
Hey Benedict,
Have you tried creating indices on your segments table? I’ve managed Druid
clusters with orders of magnitude more segments without this issue by indexing
key filter columns. (The coordinator is still a painful bottleneck, just not
due to query times to the metadata server.)
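As a rough illustration of the indexing suggestion above, here is a minimal sketch using Python's built-in sqlite3 module in place of a real MySQL or PostgreSQL metadata store. The table is a simplified stand-in for the segments table (only `id`, `dataSource`, `used`, and `payload`), and the index name and column choice are assumptions for the example, not a recommendation taken from the thread.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory, simplified stand-in for the segments table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE druid_segments ("
        " id TEXT PRIMARY KEY,"
        " dataSource TEXT NOT NULL,"
        " used INTEGER NOT NULL,"
        " payload BLOB)"
    )
    # Hypothetical index on the columns the polling query filters by.
    conn.execute(
        "CREATE INDEX idx_segments_used_datasource"
        " ON druid_segments (used, dataSource)"
    )
    return conn


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


conn = build_demo_db()
plan = query_plan(
    conn,
    "SELECT id, payload FROM druid_segments WHERE used = 1 AND dataSource = ?",
    ("wikipedia",),
)
print(plan)
```

The printed plan should mention the index rather than a full table scan. On a real cluster the equivalent check would be the database's own `EXPLAIN`, and the write-side cost of the extra index would still need to be measured.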
Hi Jihoon Son,
Yes, it does bring some compatibility issues. I was checking the latest
metadata information just now. At present, the total number of records in the
metadata table is five million, of which nearly half are marked as used, and
the physical resources of the machine where the
For this sort of issue, we should consider whether there is another
way to address the same problem without modifying the metadata table
schema,
because modifying the metadata table schema introduces compatibility
issues, such as the upgrade path for existing users.
Benedict, as Samarth and Lucas
Hi Ben Krug,
+1 for adding the is_deleted column, and then we can create a timing trigger to
clear these old records.
Regards,
Benedict Jin
On 2021/04/06 18:28:45, Ben Krug wrote:
> Oh, that's easier than tombstones. flag is_deleted and update timestamp
> (so it gets pulled again).
>
> On
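The is_deleted flag, the updated timestamp, and the periodic cleanup discussed in this message could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite table named `segments` with hypothetical columns `is_deleted` and `updated_at`, integer timestamps, and a plain dict standing in for the Coordinator's in-memory state. None of these names come from Druid itself.

```python
import sqlite3


def make_db():
    """In-memory stand-in for the metadata store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE segments ("
        " id TEXT PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " is_deleted INTEGER NOT NULL DEFAULT 0,"
        " updated_at INTEGER NOT NULL)"
    )
    return conn


def upsert(conn, seg_id, payload, now):
    """Insert or overwrite a segment row and stamp it with `now`."""
    conn.execute(
        "INSERT OR REPLACE INTO segments (id, payload, is_deleted, updated_at)"
        " VALUES (?, ?, 0, ?)",
        (seg_id, payload, now),
    )


def soft_delete(conn, seg_id, now):
    """Flag the row as deleted and bump its timestamp so pollers see it."""
    conn.execute(
        "UPDATE segments SET is_deleted = 1, updated_at = ? WHERE id = ?",
        (now, seg_id),
    )


def pull_changes(conn, cache, last_seen):
    """Apply rows changed after `last_seen` to `cache`; return the new mark."""
    rows = conn.execute(
        "SELECT id, payload, is_deleted, updated_at FROM segments"
        " WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    )
    for seg_id, payload, is_deleted, updated_at in rows:
        if is_deleted:
            cache.pop(seg_id, None)  # drop it from the in-memory state
        else:
            cache[seg_id] = payload
        last_seen = max(last_seen, updated_at)
    return last_seen


def purge_deleted(conn, older_than):
    """Periodic cleanup: physically remove old soft-deleted rows."""
    cur = conn.execute(
        "DELETE FROM segments WHERE is_deleted = 1 AND updated_at < ?",
        (older_than,),
    )
    return cur.rowcount


conn = make_db()
cache = {}
upsert(conn, "seg_a", "payload_a", now=1)
upsert(conn, "seg_b", "payload_b", now=2)
last_seen = pull_changes(conn, cache, last_seen=0)
soft_delete(conn, "seg_b", now=3)
last_seen = pull_changes(conn, cache, last_seen)
purged = purge_deleted(conn, older_than=10)
```

One caveat with this sketch: the strict `updated_at > last_seen` comparison can miss a row written with the same timestamp as the last one seen, and the purge must only remove rows that every poller has already observed, so a real design would need some overlap or grace period.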
Hi Samarth Jain,
Thanks. The main reason is the huge amount of metadata, which leads to a very
slow process of scanning the full table of metadata storage and deserializing
metadata. Yes, I have tried to clean up the metadata.
Regards,
Benedict Jin
On 2021/04/06 17:20:26, Samarth Jain wrote:
Hi Ben Krug,
Thank you very much for your ideas, but I also feel that the introduction of
Cassandra is too heavy. The tombstones feature in Cassandra you mentioned can
actually be supported by timed tasks in MySQL or PostgreSQL.
Regards,
Benedict Jin
On 2021/04/06 15:08:03, Ben Krug wrote:
Hi Abhishek Agarwal,
You made a very important point, thank you very much.
Regards,
Benedict Jin
On 2021/04/06 11:02:34, Abhishek Agarwal wrote:
> If an entry is deleted from the metadata, how is the coordinator going to
> update its own state?
>
> On Tue, Apr 6, 2021 at 3:38 PM Itai Yaffe
Hi Itai Yaffe,
Thank you very much for your support.
Regards,
Benedict Jin
On 2021/04/06 10:06:45, Itai Yaffe wrote:
> Hey,
> I'm not a Druid developer, so it's quite possible I'm missing many
> considerations here, but from a first glance, I like your offer, as it
> resembles the
Hey Benedict,
Adding on to what Samarth says in their reply, could you provide some more
context on this one to help the group understand more about your issue:
- Is this the area of the code that you are saying is non-performant?
Link
Oh, that's easier than tombstones. flag is_deleted and update timestamp
(so it gets pulled again).
On Tue, Apr 6, 2021 at 10:48 AM Tijo Thomas wrote:
> Abhishek,
> Good point. Do we need one more col for storing if it's deleted or not?
>
> On Tue, Apr 6, 2021 at 4:32 PM Abhishek Agarwal >
>
Abhishek,
Good point. Do we need one more col for storing if it's deleted or not?
On Tue, Apr 6, 2021 at 4:32 PM Abhishek Agarwal
wrote:
> If an entry is deleted from the metadata, how is the coordinator going to
> update its own state?
>
> On Tue, Apr 6, 2021 at 3:38 PM Itai Yaffe wrote:
>
>
Hi Benedict,
I am curious which Druid functionality you are seeing the
slowness in. Is it the coordinator work of assigning segments to
historicals that is slower or is it the querying of segment information
that is slower? Have you looked into CPU/network metrics for your
I suppose, if we were going down this path, something like tombstones in
Cassandra could be used.
But it would increase the complexity significantly.
I.e., a new row is inserted with a deletion marker and a timestamp,
indicating that the corresponding row is deleted.
Now, when anyone does scan
If an entry is deleted from the metadata, how is the coordinator going to
update its own state?
On Tue, Apr 6, 2021 at 3:38 PM Itai Yaffe wrote:
> Hey,
> I'm not a Druid developer, so it's quite possible I'm missing many
> considerations here, but from a first glance, I like your offer, as it
>
Hey,
I'm not a Druid developer, so it's quite possible I'm missing many
considerations here, but from a first glance, I like your offer, as it
resembles the *tsColumn* in JDBC lookups (
https://druid.apache.org/docs/latest/development/extensions-core/lookups-cached-global.html#jdbc-lookup
).
Hi all,
Recently, the Coordinator in our company's Druid cluster has hit a
performance bottleneck when pulling metadata. The main reason is the huge amount of
metadata, which leads to a very slow process of scanning the full table of
metadata storage and deserializing metadata. The size of the