[jira] [Updated] (HUDI-2116) sync 10w partitions to hive by using HiveSyncTool lead to the oom of hive MetaStore
[ https://issues.apache.org/jira/browse/HUDI-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Udit Mehrotra updated HUDI-2116:
    Status: In Progress  (was: Open)

> sync 10w partitions to hive by using HiveSyncTool lead to the oom of hive MetaStore
>
>                 Key: HUDI-2116
>                 URL: https://issues.apache.org/jira/browse/HUDI-2116
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Hive Integration
>    Affects Versions: 0.8.0
>         Environment: hive3.1.1
>                      hadoop 3.1.1
>            Reporter: tao meng
>            Assignee: tao meng
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
> Syncing 10w (100,000) partitions to Hive with HiveSyncTool causes the Hive Metastore to run out of memory (OOM).
>
> Here is a stress test for HiveSyncTool.
> env:
> hive metastore -Xms16G -Xmx16G
> hive.metastore.client.socket.timeout=10800
>
> ||partitionNum||time consumed||
> |100|37s|
> |1000|168s|
> |5000|1830s|
> |10,000 (1w)|timeout|
> |100,000 (10w)|hive metastore oom|
>
> HiveSyncTool syncs all partitions to the Hive Metastore at once. When the number of partitions is large, this puts a lot of pressure on the Hive Metastore. For large partition counts we should support batch sync.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
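The batch sync proposed in the issue could look roughly like the sketch below. This is not the HiveSyncTool implementation: `BatchSyncSketch`, `syncInBatches`, the `batchSize` parameter and the `syncBatch` callback are hypothetical names used only for illustration, and the callback stands in for whatever single metastore request would register one batch of partitions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchSyncSketch {

    /**
     * Splits the partition list into consecutive chunks of at most batchSize
     * entries and passes each chunk to syncBatch, so that no single request
     * carries the whole partition list. Returns the number of requests made.
     *
     * syncBatch is a hypothetical stand-in for one metastore call.
     */
    static int syncInBatches(List<String> partitions,
                             int batchSize,
                             Consumer<List<String>> syncBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int calls = 0;
        for (int start = 0; start < partitions.size(); start += batchSize) {
            int end = Math.min(start + batchSize, partitions.size());
            // Copy the view so the callback owns an independent list.
            syncBatch.accept(new ArrayList<>(partitions.subList(start, end)));
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        // Made-up partition paths, only to exercise the loop.
        List<String> partitions = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            partitions.add("dt=" + i);
        }
        int calls = syncInBatches(partitions, 1000,
                batch -> System.out.println("syncing " + batch.size() + " partitions"));
        System.out.println("requests sent: " + calls);
    }
}
```

With a bounded batch size, the memory needed per request on the metastore side depends on the batch size rather than on the total partition count. A suitable batch size would have to be found by measurement.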
[jira] [Updated] (HUDI-2116) sync 10w partitions to hive by using HiveSyncTool lead to the oom of hive MetaStore
[ https://issues.apache.org/jira/browse/HUDI-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-2116:
    Labels: pull-request-available  (was: )