Thanks for looking into it! We are creating tables in parallel, but it's still 
quite slow. Seems like we just have too many tables at the moment. I'll move 
further discussion to the ticket, thanks again!

Emilio Lahr-Vivaz
General Atomics, CCRi
________________________________
From: Dave Marion <[email protected]>
Sent: Thursday, June 20, 2024 10:57 AM
To: [email protected] <[email protected]>
Subject: Re: -EXT-Re: debugging slow table creation

WARNING:  This message is from an external source.  Evaluate the message 
carefully BEFORE clicking on links or opening attachments.

I was able to reproduce your issue and created 
https://github.com/apache/accumulo/issues/4684.
I'm not sure whether creating the tables using several threads would speed 
things up. If you are doing things serially now, you may want to give that a try.
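
For reference, a minimal sketch of what "creating the tables using several 
threads" could look like on the client side. The class name, the TableCreator 
interface, and the createAll method are hypothetical names made up for 
illustration; TableCreator stands in for whatever call actually creates a 
table (for example, a call to tableOperations().create(name) on an Accumulo 
client). This only shows the thread-pool pattern and makes no claim about how 
much it helps when the Manager itself is the bottleneck.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelTableCreate {

    /** Hypothetical stand-in for the real table-creation call. */
    @FunctionalInterface
    public interface TableCreator {
        void create(String tableName) throws Exception;
    }

    /**
     * Submits one create call per table name to a fixed-size pool, waits for
     * all of them to finish, and returns the names whose creation failed.
     */
    public static List<String> createAll(List<String> names, int threads, TableCreator creator)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> failed = new ArrayList<>();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (String name : names) {
                // The lambda returns a value, so it is submitted as a Callable
                // and may throw checked exceptions.
                futures.add(pool.submit(() -> {
                    creator.create(name);
                    return null;
                }));
            }
            // Futures are collected in submission order, so index i maps to names.get(i).
            for (int i = 0; i < futures.size(); i++) {
                try {
                    futures.get(i).get();
                } catch (ExecutionException e) {
                    failed.add(names.get(i));
                }
            }
        } finally {
            pool.shutdown();
        }
        return failed;
    }
}
```

The pool size is the main knob here. Each create is still a Fate operation 
that runs in the Manager, so extra client threads only help as long as the 
Manager can work on the operations concurrently.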

On 2024/06/19 01:00:09 "Lahr-Vivaz, Emilio" wrote:
> Hello again! Thanks for the tips.
> I tried checking for fate operations, but there didn't seem to be any hanging 
> around very long. I tried increasing the fate threads, but it didn't help. 
> After that, I tried profiling the manager process, and it seems like the bulk 
> of CPU time is spent talking to ZooKeeper. Initially, that was in the 
> TableLoadBalancer, but I changed to use the SimpleLoadBalancer and the CPU 
> time shifted to MetadataTableUtils.addTablet/TableZooHelper/ZooCache, where 
> it is populating table ids. Looking in ZooKeeper, it seems like all the 
> tables end up under /accumulo/<uuid>/tables, which keeps growing in size. CPU 
> doesn't seem particularly high, so I'm not entirely sure this is the culprit. 
> But it seems to me that it's taking an increasingly long time to populate the 
> table cache as the number of zk table nodes increases. Does that seem 
> feasible? Is there anything I can do to mitigate the issue?
>
> Thanks,
>
> Emilio Lahr-Vivaz
> General Atomics, CCRi
>
> ________________________________
> From: Dave Marion <[email protected]>
> Sent: Thursday, June 13, 2024 7:56 PM
> To: [email protected] <[email protected]>
> Subject: -EXT-Re: debugging slow table creation
>
> Emilio,
>
>   The create table operation is a Fate operation that runs in the Manager. My 
> immediate thought is that maybe the number of Fate operations that you are 
> creating for your other tables is making the create table operation wait for 
> an available thread. I don't have the code in front of me, but I believe 
> there are Fate commands in the shell and via the admin utility that will let 
> you see the status of the Fate operations. If your create operation is 
> sitting there in a submitted state, then it's waiting for a thread. There is 
> a property that you can modify to increase the number of Fate threads. If 
> it's in the running state for a long time, then stacking the Manager to 
> determine where it's spending its time would help us.
>
> Dave
>
> On Jun 13, 2024 7:23 PM, "Lahr-Vivaz, Emilio" <[email protected]> 
> wrote:
> Hello,
>
> We've noticed that creating a table in Accumulo 2.1 tends to get slower and 
> slower as the number of tables in the system increases, and once we have 
> several thousand tables, creating more really bogs down (on the order of 
> minutes). Does anyone have any tips on debugging this issue, or known 
> configurations that might help? Or is this not a use case that Accumulo was 
> designed for? I can provide more details on the cluster setup, if it would be 
> helpful.
>
> Thanks,
>
> Emilio Lahr-Vivaz
> General Atomics, CCRi
>
>
