[jira] [Commented] (HIVE-20282) HiveServer2 incorrect queue name when using Tez instead of MR
[ https://issues.apache.org/jira/browse/HIVE-20282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565879#comment-16565879 ] Steve Yeom commented on HIVE-20282: --- Hi [~prasanth_j], I have a patch for the Ambari views. Do you think you can look at this? Thanks, Steve. > HiveServer2 incorrect queue name when using Tez instead of MR > - > > Key: HIVE-20282 > URL: https://issues.apache.org/jira/browse/HIVE-20282 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Critical > Fix For: 4.0.0 > > Attachments: HIVE-20282.01.patch > > > The Ambari -> Tez view has > "Hive Queries" and "All DAGs" view pages. > The queue names from a query id and from its DAG id do not match in the Tez > engine context. > The one from the query is not correct. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20282) HiveServer2 incorrect queue name when using Tez instead of MR
[ https://issues.apache.org/jira/browse/HIVE-20282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-20282: -- Status: Patch Available (was: Open) > HiveServer2 incorrect queue name when using Tez instead of MR > - > > Key: HIVE-20282 > URL: https://issues.apache.org/jira/browse/HIVE-20282 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Critical > Fix For: 4.0.0 > > Attachments: HIVE-20282.01.patch > > > The Ambari -> Tez view has > "Hive Queries" and "All DAGs" view pages. > The queue names from a query id and from its DAG id do not match in the Tez > engine context. > The one from the query is not correct.
[jira] [Updated] (HIVE-20282) HiveServer2 incorrect queue name when using Tez instead of MR
[ https://issues.apache.org/jira/browse/HIVE-20282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-20282: -- Attachment: HIVE-20282.01.patch > HiveServer2 incorrect queue name when using Tez instead of MR > - > > Key: HIVE-20282 > URL: https://issues.apache.org/jira/browse/HIVE-20282 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Critical > Fix For: 4.0.0 > > Attachments: HIVE-20282.01.patch > > > The Ambari -> Tez view has > "Hive Queries" and "All DAGs" view pages. > The queue names from a query id and from its DAG id do not match in the Tez > engine context. > The one from the query is not correct.
[jira] [Updated] (HIVE-20282) HiveServer2 incorrect queue name when using Tez instead of MR
[ https://issues.apache.org/jira/browse/HIVE-20282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-20282: -- Description: The Ambari -> Tez view has "Hive Queries" and "All DAGs" view pages. The queue names from a query id and from its DAG id do not match in the Tez engine context. The one from the query is not correct. > HiveServer2 incorrect queue name when using Tez instead of MR > - > > Key: HIVE-20282 > URL: https://issues.apache.org/jira/browse/HIVE-20282 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Critical > Fix For: 4.0.0 > > > The Ambari -> Tez view has > "Hive Queries" and "All DAGs" view pages. > The queue names from a query id and from its DAG id do not match in the Tez > engine context. > The one from the query is not correct.
[jira] [Assigned] (HIVE-20282) HiveServer2 incorrect queue name when using Tez instead of MR
[ https://issues.apache.org/jira/browse/HIVE-20282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-20282: - > HiveServer2 incorrect queue name when using Tez instead of MR > - > > Key: HIVE-20282 > URL: https://issues.apache.org/jira/browse/HIVE-20282 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Critical > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-20186) Incorrect results when hive.tez.cartesian-product.enabled=false and vectorization enabled
[ https://issues.apache.org/jira/browse/HIVE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545600#comment-16545600 ] Steve Yeom edited comment on HIVE-20186 at 7/16/18 7:03 PM: This returns wrong results. You can compare when any one of the two variables has the other value. I.e., either hive.tez.cartesian-product.enabled=true or hive.vectorized.execution.enabled=false. was (Author: steveyeom2017): This returns wrong results. You can compare when any one of the two variables has the other value. > Incorrect results when hive.tez.cartesian-product.enabled=false and > vectorization enabled > -- > > Key: HIVE-20186 > URL: https://issues.apache.org/jira/browse/HIVE-20186 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > > select > CAST(180615145920496 AS BIGINT) AS IDP_AUDIT_ID > , CAST('2017-12-28' AS DATE) AS IDP_DATA_DATE > , fi.TEST_CODED > , inv.IDP_WAREHOUSE_ID > , inv.IDP_AUDIT_ID > , inv.baseline_is_current > , case when inv.baseline_is_current = '1' then 'Y' else 'N' end as > inv_baseline_is_current > , INV.PROJECT_KEY AS INV_KEY > from clarity__L3_SNAP_NUMBER snap, l3_monthly_dw_dimproject inv > LEFT OUTER JOIN L3_MONTHLY_FI_PROG_PROJ_NEWHIERARCHY hier > on INV.PROJECT_KEY = HIER.PROJECT_KEY AND HIER.IDP_DATA_DATE = '2017-12-28' > LEFT OUTER JOIN clarity__l3_monthly_dw_dimproject_fi fi > ON HIER.FUNDING_ITEM_KEY = FI.PROJECT_KEY AND FI.IDP_DATA_DATE = '2017-12-28' > INNER JOIN l3_dw_snapshot_control sc > ON 1=1 AND SC.IDP_DATA_DATE = '2017-12-28' AND SC.SNAPSHOT_ALIAS LIKE 'Month%' > WHERE INV.L3_SNAPSHOT_NUMBER= snap.L3_snapshot_number AND INV.IDP_DATA_DATE = > '2017-12-28' > order by idp_warehouse_id; -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20186) Incorrect results when hive.tez.cartesian-product.enabled=false and vectorization enabled
[ https://issues.apache.org/jira/browse/HIVE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-20186: -- Description: select CAST(180615145920496 AS BIGINT) AS IDP_AUDIT_ID , CAST('2017-12-28' AS DATE) AS IDP_DATA_DATE , fi.TEST_CODED , inv.IDP_WAREHOUSE_ID , inv.IDP_AUDIT_ID , inv.baseline_is_current , case when inv.baseline_is_current = '1' then 'Y' else 'N' end as inv_baseline_is_current , INV.PROJECT_KEY AS INV_KEY from clarity__L3_SNAP_NUMBER snap, l3_monthly_dw_dimproject inv LEFT OUTER JOIN L3_MONTHLY_FI_PROG_PROJ_NEWHIERARCHY hier on INV.PROJECT_KEY = HIER.PROJECT_KEY AND HIER.IDP_DATA_DATE = '2017-12-28' LEFT OUTER JOIN clarity__l3_monthly_dw_dimproject_fi fi ON HIER.FUNDING_ITEM_KEY = FI.PROJECT_KEY AND FI.IDP_DATA_DATE = '2017-12-28' INNER JOIN l3_dw_snapshot_control sc ON 1=1 AND SC.IDP_DATA_DATE = '2017-12-28' AND SC.SNAPSHOT_ALIAS LIKE 'Month%' WHERE INV.L3_SNAPSHOT_NUMBER= snap.L3_snapshot_number AND INV.IDP_DATA_DATE = '2017-12-28' order by idp_warehouse_id; was: select CAST(180615145920496 AS BIGINT) AS IDP_AUDIT_ID , CAST('2017-12-28' AS DATE) AS IDP_DATA_DATE , fi.BMO_CODED , inv.IDP_WAREHOUSE_ID , inv.IDP_AUDIT_ID , inv.baseline_is_current , case when inv.baseline_is_current = '1' then 'Y' else 'N' end as inv_baseline_is_current , INV.PROJECT_KEY AS INV_KEY from clarity__L3_SNAP_NUMBER snap, l3_monthly_dw_dimproject inv LEFT OUTER JOIN L3_MONTHLY_FI_PROG_PROJ_NEWHIERARCHY hier on INV.PROJECT_KEY = HIER.PROJECT_KEY AND HIER.IDP_DATA_DATE = '2017-12-28' LEFT OUTER JOIN clarity__l3_monthly_dw_dimproject_fi fi ON HIER.FUNDING_ITEM_KEY = FI.PROJECT_KEY AND FI.IDP_DATA_DATE = '2017-12-28' INNER JOIN l3_dw_snapshot_control sc ON 1=1 AND SC.IDP_DATA_DATE = '2017-12-28' AND SC.SNAPSHOT_ALIAS LIKE 'Month%' WHERE INV.L3_SNAPSHOT_NUMBER= snap.L3_snapshot_number AND INV.IDP_DATA_DATE = '2017-12-28' order by idp_warehouse_id; > Incorrect results when hive.tez.cartesian-product.enabled=false and > vectorization enabled > -- > > Key: HIVE-20186 > URL: https://issues.apache.org/jira/browse/HIVE-20186 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > > select > CAST(180615145920496 AS BIGINT) AS IDP_AUDIT_ID > , CAST('2017-12-28' AS DATE) AS IDP_DATA_DATE > , fi.TEST_CODED > , inv.IDP_WAREHOUSE_ID > , inv.IDP_AUDIT_ID > , inv.baseline_is_current > , case when inv.baseline_is_current = '1' then 'Y' else 'N' end as > inv_baseline_is_current > , INV.PROJECT_KEY AS INV_KEY > from clarity__L3_SNAP_NUMBER snap, l3_monthly_dw_dimproject inv > LEFT OUTER JOIN L3_MONTHLY_FI_PROG_PROJ_NEWHIERARCHY hier > on INV.PROJECT_KEY = HIER.PROJECT_KEY AND HIER.IDP_DATA_DATE = '2017-12-28' > LEFT OUTER JOIN clarity__l3_monthly_dw_dimproject_fi fi > ON HIER.FUNDING_ITEM_KEY = FI.PROJECT_KEY AND FI.IDP_DATA_DATE = '2017-12-28' > INNER JOIN l3_dw_snapshot_control sc > ON 1=1 AND SC.IDP_DATA_DATE = '2017-12-28' AND SC.SNAPSHOT_ALIAS LIKE 'Month%' > WHERE INV.L3_SNAPSHOT_NUMBER= snap.L3_snapshot_number AND INV.IDP_DATA_DATE = > '2017-12-28' > order by idp_warehouse_id; -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20186) Incorrect results when hive.tez.cartesian-product.enabled=false and vectorization enabled
[ https://issues.apache.org/jira/browse/HIVE-20186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545600#comment-16545600 ] Steve Yeom commented on HIVE-20186: --- This returns wrong results. You can compare when any one of the two variables has the other value. > Incorrect results when hive.tez.cartesian-product.enabled=false and > vectorization enabled > -- > > Key: HIVE-20186 > URL: https://issues.apache.org/jira/browse/HIVE-20186 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > > select > CAST(180615145920496 AS BIGINT) AS IDP_AUDIT_ID > , CAST('2017-12-28' AS DATE) AS IDP_DATA_DATE > , fi.BMO_CODED > , inv.IDP_WAREHOUSE_ID > , inv.IDP_AUDIT_ID > , inv.baseline_is_current > , case when inv.baseline_is_current = '1' then 'Y' else 'N' end as > inv_baseline_is_current > , INV.PROJECT_KEY AS INV_KEY > from clarity__L3_SNAP_NUMBER snap, l3_monthly_dw_dimproject inv > LEFT OUTER JOIN L3_MONTHLY_FI_PROG_PROJ_NEWHIERARCHY hier > on INV.PROJECT_KEY = HIER.PROJECT_KEY AND HIER.IDP_DATA_DATE = '2017-12-28' > LEFT OUTER JOIN clarity__l3_monthly_dw_dimproject_fi fi > ON HIER.FUNDING_ITEM_KEY = FI.PROJECT_KEY AND FI.IDP_DATA_DATE = '2017-12-28' > INNER JOIN l3_dw_snapshot_control sc > ON 1=1 AND SC.IDP_DATA_DATE = '2017-12-28' AND SC.SNAPSHOT_ALIAS LIKE 'Month%' > WHERE INV.L3_SNAPSHOT_NUMBER= snap.L3_snapshot_number AND INV.IDP_DATA_DATE = > '2017-12-28' > order by idp_warehouse_id; -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527244#comment-16527244 ] Steve Yeom edited comment on HIVE-19975 at 7/9/18 7:32 PM: --- Added a simple patch so that I can continue with the other jiras; otherwise, their test cases do not work. It looks like stats_part.q is working. Added this patch as HIVE-19975.00.nogen.patch for reference. was (Author: steveyeom2017): Added a simple patch so that I can continue with the other jiras; otherwise, their test cases do not work. It looks like stats_part.q is working. > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table > --- > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Sergey Shelukhin >Priority: Major > Fix For: txnstats > > Attachments: HIVE-19975.00.nogen.patch, HIVE-19975.01.patch, > HIVE-19975.patch, branch-19975.01.nogen.patch, branch-19975.02.nogen.patch, > branch-19975.nogen.patch > > > writeIdList is a per-table entity, but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has its own independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partition.
[jira] [Updated] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19975: -- Attachment: HIVE-19975.00.nogen.patch > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table > --- > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Sergey Shelukhin >Priority: Major > Fix For: txnstats > > Attachments: HIVE-19975.00.nogen.patch, HIVE-19975.01.patch, > HIVE-19975.patch, branch-19975.01.nogen.patch, branch-19975.02.nogen.patch, > branch-19975.nogen.patch > > > writeIdList is a per-table entity, but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has its own independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partition.
[jira] [Assigned] (HIVE-20115) acid_no_buckets.q fails
[ https://issues.apache.org/jira/browse/HIVE-20115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-20115: - Assignee: Steve Yeom > acid_no_buckets.q fails > --- > > Key: HIVE-20115 > URL: https://issues.apache.org/jira/browse/HIVE-20115 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20110) Bypass HMS CachedStore for transactional stats
[ https://issues.apache.org/jira/browse/HIVE-20110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535382#comment-16535382 ] Steve Yeom commented on HIVE-20110: --- The patch 04 of HIVE-19532 already has a fix. > Bypass HMS CachedStore for transactional stats > -- > > Key: HIVE-20110 > URL: https://issues.apache.org/jira/browse/HIVE-20110 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HIVE-20110) Bypass HMS CachedStore for transactional stats
[ https://issues.apache.org/jira/browse/HIVE-20110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom resolved HIVE-20110. --- Resolution: Fixed > Bypass HMS CachedStore for transactional stats > -- > > Key: HIVE-20110 > URL: https://issues.apache.org/jira/browse/HIVE-20110 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20110) Bypass HMS CachedStore for transactional stats
[ https://issues.apache.org/jira/browse/HIVE-20110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-20110: - Assignee: Steve Yeom > Bypass HMS CachedStore for transactional stats > -- > > Key: HIVE-20110 > URL: https://issues.apache.org/jira/browse/HIVE-20110 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20107) stats_part2.q fails
[ https://issues.apache.org/jira/browse/HIVE-20107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-20107: -- Description: https://builds.apache.org/job/PreCommit-HIVE-Build/12425/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_stats_part2_/ > stats_part2.q fails > --- > > Key: HIVE-20107 > URL: https://issues.apache.org/jira/browse/HIVE-20107 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > > https://builds.apache.org/job/PreCommit-HIVE-Build/12425/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_stats_part2_/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20005) acid_table_stats, acid_no_buckets, etc - query result change on the branch
[ https://issues.apache.org/jira/browse/HIVE-20005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-20005: -- Attachment: HIVE-20005.01.patch > acid_table_stats, acid_no_buckets, etc - query result change on the branch > -- > > Key: HIVE-20005 > URL: https://issues.apache.org/jira/browse/HIVE-20005 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Steve Yeom >Priority: Major > Attachments: HIVE-20005.01.patch > > > Queries in the following tests have explain changes, from running the query to > using stats (that is probably by design for most queries). > However, some of the queries also have result changes; the new results are > likely invalid. > TestCliDriver: acid_table_stats > TestMiniLlapLocalCliDriver: acid_no_buckets
[jira] [Updated] (HIVE-20005) acid_table_stats, acid_no_buckets, etc - query result change on the branch
[ https://issues.apache.org/jira/browse/HIVE-20005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-20005: -- Status: Patch Available (was: Open) > acid_table_stats, acid_no_buckets, etc - query result change on the branch > -- > > Key: HIVE-20005 > URL: https://issues.apache.org/jira/browse/HIVE-20005 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Steve Yeom >Priority: Major > Attachments: HIVE-20005.01.patch > > > Queries in the following tests have explain changes, from running the query to > using stats (that is probably by design for most queries). > However, some of the queries also have result changes; the new results are > likely invalid. > TestCliDriver: acid_table_stats > TestMiniLlapLocalCliDriver: acid_no_buckets
[jira] [Assigned] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-19975: - Assignee: Steve Yeom (was: Sergey Shelukhin) > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table > --- > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19975.01.patch > > > writeIdList is a per-table entity, but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has its own independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partition.
[jira] [Updated] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19975: -- Status: Patch Available (was: Open) Simple one to continue other jiras under HIVE-19416. > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table > --- > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19975.01.patch > > > writeIdList is a per-table entity, but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has its own independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partition.
[jira] [Commented] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527244#comment-16527244 ] Steve Yeom commented on HIVE-19975: --- Added a simple patch so that I can continue with the other jiras; otherwise, their test cases do not work. It looks like stats_part.q is working. > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table > --- > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19975.01.patch > > > writeIdList is a per-table entity, but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has its own independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partition.
[jira] [Updated] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19975: -- Attachment: HIVE-19975.01.patch > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table > --- > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19975.01.patch > > > writeIdList is a per-table entity, but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has its own independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partition.
[jira] [Commented] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526776#comment-16526776 ] Steve Yeom commented on HIVE-19975: --- [~sershe] I checked with "stats_part.q". Yes, it calls the HiveMetaStore::get_partitions_statistics_req method. But your additional change to AcidUtils.getTableSnapshot(), which disables fetching "validWriteIdList" through the HIVE_IN_TEST check: if (validWriteIdList == null && !HiveConf.getBoolVar(conf, ConfVars.HIVE_IN_TEST)) { makes it pass null. According to [~jcamachorodriguez], though, I think you understand that it is OK when we call this: validWriteIdList = getTableValidWriteIdListWithTxnList(..) I think you jumped to the conclusion that the multiple-inserts case for a partitioned table does not work in stats_part.q because validWriteIdList == null. But that is not the case. > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table > --- > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 4.0.0 > > > writeIdList is a per-table entity, but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has its own independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partition.
[jira] [Updated] (HIVE-19867) Test and verify Concurrent INSERTS
[ https://issues.apache.org/jira/browse/HIVE-19867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19867: -- Status: Patch Available (was: Open) > Test and verify Concurrent INSERTS > > > Key: HIVE-19867 > URL: https://issues.apache.org/jira/browse/HIVE-19867 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19867.01.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19867) Test and verify Concurrent INSERTS
[ https://issues.apache.org/jira/browse/HIVE-19867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526755#comment-16526755 ] Steve Yeom commented on HIVE-19867: --- patch 01 is based on commit 61c55a3f66a71b7ff513fc45d68b749c061fa820 Merge: 8f32832428 b5160e7441 Author: sergey Date: Mon Jun 25 21:02:25 2018 -0700 HIVE-19416 : merge master into branch (Sergey Shelukhin) It may fail due to conflicts > Test and verify Concurrent INSERTS > > > Key: HIVE-19867 > URL: https://issues.apache.org/jira/browse/HIVE-19867 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19867.01.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19867) Test and verify Concurrent INSERTS
[ https://issues.apache.org/jira/browse/HIVE-19867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19867: -- Attachment: HIVE-19867.01.patch > Test and verify Concurrent INSERTS > > > Key: HIVE-19867 > URL: https://issues.apache.org/jira/browse/HIVE-19867 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19867.01.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526615#comment-16526615 ] Steve Yeom commented on HIVE-19975: --- [~sershe] I think a quick way to verify is to run stats_part.q with your check in place. It will show whether it works or not. My code changes are based on debugger sessions, so they should work (I mean multiple inserts). Also, if a method like get_partitions_statistics_req does not propagate the txnId and write id list, then one of two things holds: 1. it is not used by StatsOptimizer for a SELECT. 2. it is not used in the execution path for a SELECT. > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table > --- > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 4.0.0 > > > writeIdList is a per-table entity, but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has its own independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partition.
[jira] [Commented] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526034#comment-16526034 ] Steve Yeom commented on HIVE-19975: --- Do you think you can check "stats_part.q" and "stats_part2.q" (which I added) to see whether these new files run the same tests as the ones you described above? > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table > --- > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 4.0.0 > > > writeIdList is a per-table entity, but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has its own independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partition.
[jira] [Commented] (HIVE-19916) master-txnstats branch - don't get write IDs from metastore when it's not safe
[ https://issues.apache.org/jira/browse/HIVE-19916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525734#comment-16525734 ] Steve Yeom commented on HIVE-19916: --- Hi [~sershe], as we discussed during the meeting, do you think you can send an email with a "test case" for this jira issue to [~jcamachorodriguez] and me? Thank you. Steve. > master-txnstats branch - don't get write IDs from metastore when it's not safe > -- > > Key: HIVE-19916 > URL: https://issues.apache.org/jira/browse/HIVE-19916 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19916.patch, branch-19916.patch > > > There's some code in original txn stats patch that may go to metastore to get > write Ids. This code should not go to metastore, it should fail instead. > HIVE-19382 should ensure that we have correct IDs already present during > optimizer - they are used by e.g. materialized view optimizer, so they > should be there; if they are not present, some integration might be needed so > that txn stats optimizations also have access to those write Ids. > cc [~jcamachorodriguez]
[jira] [Commented] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table.
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525659#comment-16525659 ] Steve Yeom commented on HIVE-19975: --- [~sershe] has added a test case for partitions here: " Consider two scenarios: 1) Serially, with no parallel txns: Write ID 1 inserts into partition k=1, and the partition stats table in the metastore has valid write ID list (1). Reader tries to read, table’s valid write ID list is (1), partition k=1 list in stats table is (1), they are equivalent, all good. Write ID 2 inserts into partition k=2, and the partition stats table in the metastore has valid write ID list (1,2). Reader tries to read, table’s valid write ID list is (1,2). For partition k=2, the list is (1,2), equivalent returns true, all good. But for partition k=1, the list is still (1), because writer 2 doesn’t touch it. Equivalent returns false, stats cannot be used. " My answer to these test case scenarios: for #1, I have already simulated all the possible cases regarding the reader’s starting point in time. It should work with the current patch. > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table. > > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 4.0.0 > > > writeIdList is a per-table entity, but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has its own independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partition.
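The serial scenario quoted in the comment above can be sketched as a small, self-contained model. This is only an illustration under stated assumptions, not Hive's metastore code: the class PartitionStatsWriteIdSketch and its methods insert and statsUsable are hypothetical names, write ID lists are modeled as plain sets of longs, and the "equivalent" check is modeled as simple set equality between the table-level list and the list recorded with a partition's stats.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of the quoted scenario; not Hive's actual API. */
public class PartitionStatsWriteIdSketch {
    // Assumed model: write IDs that are valid at the table level.
    private final Set<Long> tableWriteIds = new TreeSet<>();
    // Assumed model: the write ID list stored with each partition's stats.
    private final Map<String, Set<Long>> partitionStatsWriteIds = new HashMap<>();

    /** Simulates a committed, serial insert into a single partition. */
    public void insert(long writeId, String partition) {
        tableWriteIds.add(writeId);
        // Only the partition that was written gets its stats list refreshed.
        partitionStatsWriteIds.put(partition, new TreeSet<>(tableWriteIds));
    }

    /** Table-level check: stats are usable only if the two lists are equal. */
    public boolean statsUsable(String partition) {
        Set<Long> recorded = partitionStatsWriteIds.get(partition);
        return recorded != null && recorded.equals(tableWriteIds);
    }

    public static void main(String[] args) {
        PartitionStatsWriteIdSketch table = new PartitionStatsWriteIdSketch();

        table.insert(1L, "k=1");
        System.out.println(table.statsUsable("k=1")); // true: (1) vs (1)

        table.insert(2L, "k=2");
        System.out.println(table.statsUsable("k=2")); // true: (1,2) vs (1,2)
        System.out.println(table.statsUsable("k=1")); // false: (1) vs (1,2)
    }
}
```

In this model the last check fails even though partition k=1 was never modified by write ID 2, which is the per-table versus per-partition mismatch that the issue description points at.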
[jira] [Commented] (HIVE-19916) master-txnstats branch - don't get write IDs from metastore when it's not safe
[ https://issues.apache.org/jira/browse/HIVE-19916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525647#comment-16525647 ] Steve Yeom commented on HIVE-19916: --- Hey [~sershe] could you send me the patch you created for this jira? I would like to check the changes you made. > master-txnstats branch - don't get write IDs from metastore when it's not safe > -- > > Key: HIVE-19916 > URL: https://issues.apache.org/jira/browse/HIVE-19916 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19916.patch, branch-19916.patch > > > There's some code in original txn stats patch that may go to metastore to get > write Ids. This code should not go to metastore, it should fail instead. > HIVE-19382 should ensure that we have correct IDs already present during > optimizer - they are used by e.g. materialized view optimizer, so they > should be there; if they are not present, some integration might be needed so > that txn stats optimizations also have access to those write Ids. > cc [~jcamachorodriguez] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19532) fix tests for master-txnstats branch
[ https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525358#comment-16525358 ] Steve Yeom commented on HIVE-19532: --- I checked the 124 test failures in metastore.client. Those were tests inside TestListPartitions.java module. I tested in my laptop with patch 11 but it worked without a failure. Probably I may resubmit the same patch after a little more checking. > fix tests for master-txnstats branch > > > Key: HIVE-19532 > URL: https://issues.apache.org/jira/browse/HIVE-19532 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, > HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, > HIVE-19532.04.patch, HIVE-19532.05.patch, HIVE-19532.06.patch, > HIVE-19532.07.patch, HIVE-19532.08.patch, HIVE-19532.09.patch, > HIVE-19532.10.patch, HIVE-19532.11.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table.
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525228#comment-16525228 ] Steve Yeom commented on HIVE-19975: --- Hey [~sershe] What do you mean by " additional changes to db/checks to isValid from isEquivalent"? > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table. > > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 4.0.0 > > > writeIdList is per table entity but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has an independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partiton. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19939) Verify any other aggregation functions other than COUNT
[ https://issues.apache.org/jira/browse/HIVE-19939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524347#comment-16524347 ] Steve Yeom commented on HIVE-19939: --- Hi [~ashutoshc] Do you have any comments for this jira. I only added COUNT/MAX aggregation queries to my new stats_nonpart.q, stats_part.q, stats_part2.q. Thank you, Steve. > Verify any other aggregation functions other than COUNT > --- > > Key: HIVE-19939 > URL: https://issues.apache.org/jira/browse/HIVE-19939 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Priority: Major > > 1. More on MAX > I have added MAX query into stats_part.q and stats_nonpart.q but showed a > slightly different > explain.out which might be a bug. > 2. Other functions than MAX and COUNT. > Also I think we need to check other possible aggregation functions than MAX > and COUNT. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19914) master-txnstats branch - make sure SQL changes are in correct upgrade scripts
[ https://issues.apache.org/jira/browse/HIVE-19914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524230#comment-16524230 ] Steve Yeom commented on HIVE-19914: --- I think Vineet verified. > master-txnstats branch - make sure SQL changes are in correct upgrade scripts > - > > Key: HIVE-19914 > URL: https://issues.apache.org/jira/browse/HIVE-19914 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Steve Yeom >Priority: Major > > The initial commit changed multiple files e.g. > {noformat} > standalone-metastore/src/main/sql/mysql/hive-schema-3.0.0.mysql.sql > standalone-metastore/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql > standalone-metastore/src/main/sql/mysql/upgrade-2.3.0-to-3.0.0.mysql.sql > standalone-metastore/src/main/sql/mysql/upgrade-3.1.0-to-4.0.0.mysql.sql > {noformat} > The target version is currently 4.0 (or 3.1? cc [~hagleitn]), so all the > changes should be in the scripts upgrading to 4.0 > cc [~vgarg] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19914) master-txnstats branch - make sure SQL changes are in correct upgrade scripts
[ https://issues.apache.org/jira/browse/HIVE-19914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524229#comment-16524229 ] Steve Yeom commented on HIVE-19914: --- It is already done. So we can close this. > master-txnstats branch - make sure SQL changes are in correct upgrade scripts > - > > Key: HIVE-19914 > URL: https://issues.apache.org/jira/browse/HIVE-19914 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Steve Yeom >Priority: Major > > The initial commit changed multiple files e.g. > {noformat} > standalone-metastore/src/main/sql/mysql/hive-schema-3.0.0.mysql.sql > standalone-metastore/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql > standalone-metastore/src/main/sql/mysql/upgrade-2.3.0-to-3.0.0.mysql.sql > standalone-metastore/src/main/sql/mysql/upgrade-3.1.0-to-4.0.0.mysql.sql > {noformat} > The target version is currently 4.0 (or 3.1? cc [~hagleitn]), so all the > changes should be in the scripts upgrading to 4.0 > cc [~vgarg] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19820) add ACID stats support to background stats updater
[ https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523945#comment-16523945 ] Steve Yeom commented on HIVE-19820: --- [~sershe] I have not checked the details yet, but I think the current testTxnTable and testTxnPartitions in the test module use 0 as txnId, whereas txnId starts from 1. I also noticed that the way they check and get transactional stats may not be the right way to do it. I will follow up with more specifics. > add ACID stats support to background stats updater > -- > > Key: HIVE-19820 > URL: https://issues.apache.org/jira/browse/HIVE-19820 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19820.01-master-txnstats.patch > > > Follow-up from HIVE-19418. > Right now it checks whether stats are valid in an old-fashioned way... and > also gets ACID state, and discards it without using. > When ACID stats are implemented, ACID state needs to be used to do > version-aware valid stats checks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19532) fix tests for master-txnstats branch
[ https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523289#comment-16523289 ] Steve Yeom edited comment on HIVE-19532 at 6/26/18 6:36 AM: Hey [~sershe] Do you think you can send me the differences? I mean code changes you made? Thanks, Steve. was (Author: steveyeom2017): Hey [~sershe] Do you think you can send me the differences? Thanks, Steve. > fix tests for master-txnstats branch > > > Key: HIVE-19532 > URL: https://issues.apache.org/jira/browse/HIVE-19532 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, > HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, > HIVE-19532.04.patch, HIVE-19532.05.patch, HIVE-19532.06.patch, > HIVE-19532.07.patch, HIVE-19532.08.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19532) fix tests for master-txnstats branch
[ https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523289#comment-16523289 ] Steve Yeom commented on HIVE-19532: --- Hey [~sershe] Do you think you can send me the differences? Thanks, Steve. > fix tests for master-txnstats branch > > > Key: HIVE-19532 > URL: https://issues.apache.org/jira/browse/HIVE-19532 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, > HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, > HIVE-19532.04.patch, HIVE-19532.05.patch, HIVE-19532.06.patch, > HIVE-19532.07.patch, HIVE-19532.08.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19867) Test and verify Concurrent INSERTS
[ https://issues.apache.org/jira/browse/HIVE-19867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522959#comment-16522959 ] Steve Yeom edited comment on HIVE-19867 at 6/26/18 12:45 AM: - Eugene and I talked. What I missed above is that, when we have two concurrent INSERTs, only one can have the other's write id in its writeIdList. A possible solution, if atomicity is guaranteed, is to check whether either of these two conditions is true: 1. old stats' writeIdList in TBLS/PARTITIONS has the new updater's writeId 2. new updater's writeIdList has the old stats' writeId (to be saved in TBLS/PARTITIONS). If so, we can say we have concurrent INSERTs. But we have to make sure these two cases happen only for concurrent INSERTs and not in other cases, to prevent a miscalculation. was (Author: steveyeom2017): Eugene and I talked. What I missed from the above is that, when we have two concurrent INSERT only one can have the other's write id in its writeIdList. But a possible solution is, if atomicity is guaranteed, to check either of the two condition is true 1. old stats' writeIdList in TBLS/PARTITIONS has the new updater's writeId 2. new updater's writeIdList has the old stats' writeId (to be saved in TBLS/PARTITIONS). If then, we can say we have a concurrent INSERTs. > Test and verify Concurrent INSERTS > > > Key: HIVE-19867 > URL: https://issues.apache.org/jira/browse/HIVE-19867 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
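The two-condition check proposed in the comment above could look roughly like the following. This is a sketch, not the patch: the `Writer` class and `isConcurrent` method are hypothetical names, and "has a writeId in its writeIdList" is interpreted as "that writeId was listed as open when the snapshot was taken", which is an assumption based on the wording of the related comments on this issue.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ConcurrentInsertCheck {

    // Hypothetical snapshot of a stats updater: its own write ID plus the
    // write IDs it saw as open (the simplified "writeIdList" of the comment).
    static final class Writer {
        final long writeId;
        final Set<Long> openWriteIds;

        Writer(long writeId, Long... openWriteIds) {
            this.writeId = writeId;
            this.openWriteIds = new HashSet<>(Arrays.asList(openWriteIds));
        }
    }

    // Condition 1: the old stats' list (as saved in TBLS/PARTITIONS) has the new updater's writeId.
    // Condition 2: the new updater's list has the old stats' writeId.
    // Either one being true is treated as a concurrent INSERT.
    static boolean isConcurrent(Writer oldStats, Writer newUpdater) {
        boolean oldSawNew = oldStats.openWriteIds.contains(newUpdater.writeId);
        boolean newSawOld = newUpdater.openWriteIds.contains(oldStats.writeId);
        return oldSawNew || newSawOld;
    }

    public static void main(String[] args) {
        // Serial case: write 11 committed before write 12 took its snapshot.
        Writer w11 = new Writer(11L);
        Writer w12Serial = new Writer(12L);
        System.out.println("serial: " + isConcurrent(w11, w12Serial));

        // Concurrent case: writes 12 and 13 were both open when each snapshot was taken.
        Writer w12 = new Writer(12L, 12L, 13L);
        Writer w13 = new Writer(13L, 12L, 13L);
        System.out.println("concurrent: " + isConcurrent(w12, w13));
    }
}
```

When `isConcurrent` returns true, the updater would turn COLUMN_STATS_ACCURATE off for the table or partition instead of merging stats. Whether the two conditions can also hold in non-concurrent situations is the open question raised at the end of the comment, and this sketch does not settle it.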
[jira] [Comment Edited] (HIVE-19867) Test and verify Concurrent INSERTS
[ https://issues.apache.org/jira/browse/HIVE-19867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522959#comment-16522959 ] Steve Yeom edited comment on HIVE-19867 at 6/26/18 12:31 AM: - Eugene and I talked. What I missed from the above is that, when we have two concurrent INSERT only one can have the other's write id in its writeIdList. But a possible solution is, if atomicity is guaranteed, to check either of the two condition is true 1. old stats' writeIdList in TBLS/PARTITIONS has the new updater's writeId 2. new updater's writeIdList has the old stats' writeId (to be saved in TBLS/PARTITIONS). If then, we can say we have a concurrent INSERTs. was (Author: steveyeom2017): A simple idea is that 1. We save writeId of the stats updater into TBLS/PARTITIONS. 2. When we update stats, we check whether the new stats updater's writeId is in the old stats updater's writeIdList and check whether the old stats updater's writeId is in the current stats updater's writeIdList. If both are true it is concurrent update. Thus we turn to false the COLUMN_STATS_ACCURATE of the current table/partition. > Test and verify Concurrent INSERTS > > > Key: HIVE-19867 > URL: https://issues.apache.org/jira/browse/HIVE-19867 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19867) Test and verify Concurrent INSERTS
[ https://issues.apache.org/jira/browse/HIVE-19867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522959#comment-16522959 ] Steve Yeom commented on HIVE-19867: --- A simple idea is that 1. We save writeId of the stats updater into TBLS/PARTITIONS. 2. When we update stats, we check whether the new stats updater's writeId is in the old stats updater's writeIdList and check whether the old stats updater's writeId is in the current stats updater's writeIdList. If both are true it is concurrent update. Thus we turn to false the COLUMN_STATS_ACCURATE of the current table/partition. > Test and verify Concurrent INSERTS > > > Key: HIVE-19867 > URL: https://issues.apache.org/jira/browse/HIVE-19867 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table.
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522899#comment-16522899 ] Steve Yeom commented on HIVE-19975: --- Actually I am checking Sergey's new test for partitioned table at his stats updater patch to find a test case scenario where we have a hole. > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table. > > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > > writeIdList is per table entity but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has an independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partiton. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19532) fix tests for master-txnstats branch
[ https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522759#comment-16522759 ] Steve Yeom commented on HIVE-19532: --- OK. let me check. > fix tests for master-txnstats branch > > > Key: HIVE-19532 > URL: https://issues.apache.org/jira/browse/HIVE-19532 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, > HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, > HIVE-19532.04.patch, HIVE-19532.05.patch, HIVE-19532.06.patch, > HIVE-19532.07.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19532) fix tests for master-txnstats branch
[ https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522627#comment-16522627 ] Steve Yeom commented on HIVE-19532: --- Added patch 07 for test results. This is the same as patch 06. I have verified that there are no conflicts on the current code base. > fix tests for master-txnstats branch > > > Key: HIVE-19532 > URL: https://issues.apache.org/jira/browse/HIVE-19532 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, > HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, > HIVE-19532.04.patch, HIVE-19532.05.patch, HIVE-19532.06.patch, > HIVE-19532.07.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19532) fix tests for master-txnstats branch
[ https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19532: -- Attachment: HIVE-19532.07.patch > fix tests for master-txnstats branch > > > Key: HIVE-19532 > URL: https://issues.apache.org/jira/browse/HIVE-19532 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, > HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, > HIVE-19532.04.patch, HIVE-19532.05.patch, HIVE-19532.06.patch, > HIVE-19532.07.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table.
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520914#comment-16520914 ] Steve Yeom commented on HIVE-19975: --- A simple fix might be to filter out writes to other partitions in a checker method; for that we may need extra information, such as a part_id for each write. > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table. > > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > > writeIdList is per table entity but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partition. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table.
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520903#comment-16520903 ] Steve Yeom edited comment on HIVE-19975 at 6/22/18 11:47 PM: - [~sershe] found this issue while he is testing his stats updater patch. was (Author: steveyeom2017): Sergey found this issue while he is testing his stats updater patch. > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table. > > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > > writeIdList is per table entity but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has an independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partiton. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table.
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520903#comment-16520903 ] Steve Yeom commented on HIVE-19975: --- Sergey found this issue while he is testing his stats updater patch. > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table. > > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > > writeIdList is per table entity but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has an independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partiton. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-19975) Checking writeIdList per table may not check the commit level of a partition on a partitioned table.
[ https://issues.apache.org/jira/browse/HIVE-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-19975: - Assignee: Steve Yeom > Checking writeIdList per table may not check the commit level of a partition > on a partitioned table. > > > Key: HIVE-19975 > URL: https://issues.apache.org/jira/browse/HIVE-19975 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > > writeIdList is per table entity but stats for a partitioned table are per > partition. > I.e., each record in PARTITIONS has an independent stats. > So if we check the validity of a partition's stats, we need to check in the > context of > a partiton. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19865) Full ACID table stats has wrong rawDataSize
[ https://issues.apache.org/jira/browse/HIVE-19865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520833#comment-16520833 ] Steve Yeom commented on HIVE-19865: --- OK. OrcRecordUpdater.getStats() or a (possibly new) subsequent method is supposed to calculate rawDataSize if needed. OrcRecordUpdater.java has:
  @Override
  public SerDeStats getStats() {
    SerDeStats stats = new SerDeStats();
    stats.setRowCount(rowCountDelta);
    // Don't worry about setting raw data size diff. I have no idea how to calculate that
    // without finding the row we are updating or deleting, which would be a mess.
    return stats;
  }
> Full ACID table stats has wrong rawDataSize > --- > > Key: HIVE-19865 > URL: https://issues.apache.org/jira/browse/HIVE-19865 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19865) Full ACID table stats has wrong rawDataSize
[ https://issues.apache.org/jira/browse/HIVE-19865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520827#comment-16520827 ] Steve Yeom commented on HIVE-19865: --- Ashutosh replied No. > Full ACID table stats has wrong rawDataSize > --- > > Key: HIVE-19865 > URL: https://issues.apache.org/jira/browse/HIVE-19865 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HIVE-19865) Full ACID table stats has wrong rawDataSize
[ https://issues.apache.org/jira/browse/HIVE-19865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-19865 started by Steve Yeom. - > Full ACID table stats has wrong rawDataSize > --- > > Key: HIVE-19865 > URL: https://issues.apache.org/jira/browse/HIVE-19865 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19865) Full ACID table stats has wrong rawDataSize
[ https://issues.apache.org/jira/browse/HIVE-19865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520823#comment-16520823 ] Steve Yeom commented on HIVE-19865: --- Hi [~ashutoshc] Do we need rawDataSize for any aggregation query for an ACID table? Thanks, Steve. > Full ACID table stats has wrong rawDataSize > --- > > Key: HIVE-19865 > URL: https://issues.apache.org/jira/browse/HIVE-19865 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19865) Full ACID table stats has wrong rawDataSize
[ https://issues.apache.org/jira/browse/HIVE-19865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520821#comment-16520821 ] Steve Yeom commented on HIVE-19865: --- Found the location in FileSinkOperator.java where table stats, especially rawDataSize, are supposed to be set for an ACID table. The best option is of course to set an accurate value for it. But one question, before spending time on that, is whether rawDataSize is needed at all for ACID table aggregation queries in this transactional stats feature. > Full ACID table stats has wrong rawDataSize > --- > > Key: HIVE-19865 > URL: https://issues.apache.org/jira/browse/HIVE-19865 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19867) Test and verify Concurrent INSERTS
[ https://issues.apache.org/jira/browse/HIVE-19867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516053#comment-16516053 ] Steve Yeom edited comment on HIVE-19867 at 6/22/18 5:40 AM: I found that patch 03 does not address the case of concurrent writes to a table. If the writeId for transactional stats is saved into TBLS/PARTITIONS, it can be used to detect concurrent writes simply by checking whether it is in another write's writeIdList. For example, assume two concurrent INSERTs are committed one after the other and each updates the stats of, let's say, table1. The first INSERT's writeId is saved, so when the second INSERT comes in we can check whether it is in the second INSERT's writeIdList. Likewise, the second INSERT's writeId can be checked against the first INSERT's writeIdList, which is saved in TBLS/PARTITIONS. If each writeId is contained in the other's writeIdList, we can say it is a concurrent INSERT case and turn the COLUMN_STATS_ACCURATE flag off. was (Author: steveyeom2017): I found the patch 03 does not address the cases of concurrent writes to a table. If writeId for transactional stats is saved into TBLS/PARTITIONS, then it can be used to figure out concurrent writes by comparing a write with its writeIdList. Simply by checking it is in the list or not. > Test and verify Concurrent INSERTS > > > Key: HIVE-19867 > URL: https://issues.apache.org/jira/browse/HIVE-19867 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19867) Test and verify Concurrent INSERTS
[ https://issues.apache.org/jira/browse/HIVE-19867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519787#comment-16519787 ] Steve Yeom edited comment on HIVE-19867 at 6/22/18 5:33 AM: Sergey and I talked about this. He mentioned several cases from the perspective of Readers. If I summarize cases of both Stats updaters (esp., concurrent INSERTs) and readers, then, let's assume the txnId of a stats entity (table or partition) is 1 and its write id on table1 is 11. Then we may have case1: Suppose concurrent writes 12 (with txnId 2) and 13 (txnId 3) and concurrent reader 14 (txnId 4). Each write has 12, 13 in its writeIdList. Here concurrent reader 14 has 12,13 as open writes in its writeIdList. 1) Write 12 comes and updates the stats of table1 and its transaction is committed. So now txnid of the stats is 2. 2) Then write 13 comes in and, if we have the code for this jira, checks itself (number 13) from stats's writeIdList (written by txnId 2). 13 should be there in the writeIdList. Also, if we save 12 into TBLS/PARTITIONS, we can check whether 12 is in write 13's writeIdList. This detects the case of concurrent writes and can turn the flag (COLUMN_STATS_ACCURATE) off. 3) Now reader comes in and finds the stats is not valid by simply checking the flag. (the reader also can determine the stats' invalidity by comparing writeIdLists of itself and the stats. But the next case2 shows the reason why we need to turn the flag off.) case2: Suppose concurrent writes 12 and 13. But assume we have a reader 14 (txnId 4) that started its transaction after writes 12 and 13 are committed. If the flag is still on and the txnId in TBLS/PARTITIONS is 3, then reader 14 does not have a way to figure out the stats are invalid due to concurrent writes since its own writeIdList for table1 does not have 12, 13 as open writes and both are committed. was (Author: steveyeom2017): Sergey and I talked about this. 
He mentioned several cases from the perspective of Readers like: Let's assume the txnId of a stats entity(table or partition) is 1 and its write id on table1 is 11. Then we may have case1: Suppose concurrent writes 12 (with txnId 2) and 13 (txnId 3) and concurrent reader 14 (txnId 4). Here concurrent reader 14 has 12,13 as open writes in its writeIdList. 1) Write 12 comes and updates the stats of table1 and its transaction is committed. So now txnid of the stats is txnid is 2. 2) Then write 13 comes in and checks itself (number 13) from stats's writeIdList by txnId 2. 13 should be there in the writeIdList. So it detects concurrent writes and can turn the flag (COLUMN_STATS_ACCURATE) off. 3) Now reader comes in and finds the stats is not valid by simply checking the flag. (the reader also can determine the stats' validity by comparing writeIdLists of itself and the stats) case2: Suppose concurrent writes 12 and 13. But assume we have a reader 14 (txnId 4) that started its transaction after writes 12 and 13 are done. If the flag is still on and the txnId in TBLS/PARTITIONS is 3, then reader 14 does not have a way to figure out the stats are invalid due to concurrent writes since its own writeIdList for table1 does not have 12, 13 as open writes and both are committed. > Test and verify Concurrent INSERTS > > > Key: HIVE-19867 > URL: https://issues.apache.org/jira/browse/HIVE-19867 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
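The difference between case1 and case2 in the comment above can be made concrete with a small model. It is a sketch under assumptions: the reader-side check is reduced to "the stats' writeId is not among the reader's open write IDs", and `statsUsable` is a hypothetical name rather than a method from the patch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class LateReaderCheck {

    // Helper: the set of write IDs a reader sees as open in its snapshot.
    static Set<Long> open(Long... ids) {
        return new HashSet<>(Arrays.asList(ids));
    }

    // Hypothetical reader-side decision: stats are used only if the persisted
    // COLUMN_STATS_ACCURATE flag is on and the write that produced the stats
    // is not open from the reader's point of view.
    static boolean statsUsable(boolean accurateFlag, long statsWriteId, Set<Long> readerOpenWriteIds) {
        return accurateFlag && !readerOpenWriteIds.contains(statsWriteId);
    }

    public static void main(String[] args) {
        long statsWriteId = 13L; // stats last updated by write 13, concurrent with write 12

        // case1: reader 14 started while 12 and 13 were open, so its own list exposes the problem.
        System.out.println("case1, flag on:  " + statsUsable(true, statsWriteId, open(12L, 13L)));

        // case2: reader 14 started after both commits; its list has no open writes.
        // With the flag left on, nothing tells the reader that the stats are invalid.
        System.out.println("case2, flag on:  " + statsUsable(true, statsWriteId, open()));

        // case2 with the flag turned off by the writer that detected the concurrency.
        System.out.println("case2, flag off: " + statsUsable(false, statsWriteId, open()));
    }
}
```

Only the persisted flag protects the late reader in this model, which is the argument the comment makes for turning COLUMN_STATS_ACCURATE off at write time.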
[jira] [Commented] (HIVE-19963) metadata_only_queries.q fails
[ https://issues.apache.org/jira/browse/HIVE-19963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519832#comment-16519832 ] Steve Yeom commented on HIVE-19963: --- Hi [~sershe] please look at the patch for this jira. Thanks, Steve. > metadata_only_queries.q fails > - > > Key: HIVE-19963 > URL: https://issues.apache.org/jira/browse/HIVE-19963 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19963.01.master-txnstats.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19963) metadata_only_queries.q fails
[ https://issues.apache.org/jira/browse/HIVE-19963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19963: -- Attachment: HIVE-19963.01.master-txnstats.patch > metadata_only_queries.q fails > - > > Key: HIVE-19963 > URL: https://issues.apache.org/jira/browse/HIVE-19963 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19963.01.master-txnstats.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19963) metadata_only_queries.q fails
[ https://issues.apache.org/jira/browse/HIVE-19963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19963: -- Status: Patch Available (was: In Progress) > metadata_only_queries.q fails > - > > Key: HIVE-19963 > URL: https://issues.apache.org/jira/browse/HIVE-19963 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19963.01.master-txnstats.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HIVE-19963) metadata_only_queries.q fails
[ https://issues.apache.org/jira/browse/HIVE-19963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-19963 started by Steve Yeom. - > metadata_only_queries.q fails > - > > Key: HIVE-19963 > URL: https://issues.apache.org/jira/browse/HIVE-19963 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19867) Test and verify Concurrent INSERTS
[ https://issues.apache.org/jira/browse/HIVE-19867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519787#comment-16519787 ] Steve Yeom commented on HIVE-19867: --- Sergey and I talked about this. He mentioned several cases from the readers' perspective. Assume the txnId of a stats entity (table or partition) is 1 and its write ID on table1 is 11. Case 1: Suppose there are concurrent writes 12 (txnId 2) and 13 (txnId 3) and a concurrent reader 14 (txnId 4). Reader 14 has 12 and 13 as open writes in its writeIdList. 1) Write 12 updates the stats of table1 and its transaction commits, so the txnId of the stats is now 2. 2) Write 13 then looks for its own write ID (13) in the writeIdList stored with the stats by txnId 2. 13 should be in that list, so write 13 detects the concurrent write and can turn the flag (COLUMN_STATS_ACCURATE) off. 3) The reader then finds that the stats are not valid simply by checking the flag. (The reader can also determine the stats' validity by comparing its own writeIdList with that of the stats.) Case 2: Suppose again concurrent writes 12 and 13, but with a reader 14 (txnId 4) that started its transaction after writes 12 and 13 finished. If the flag is still on and the txnId in TBLS/PARTITIONS is 3, reader 14 has no way to tell that the stats are invalid due to concurrent writes, because both writes are committed and its own writeIdList for table1 does not list 12 or 13 as open. > Test and verify Concurrent INSERTS > > > Key: HIVE-19867 > URL: https://issues.apache.org/jira/browse/HIVE-19867 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
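The writer-side check in case 1 of the comment above can be sketched roughly as follows. This is a minimal model and not Hive code: the names `ConcurrentStatsSketch`, `StatsState`, `isConcurrentWrite` and `applyWrite` are hypothetical, and a plain set of open write IDs stands in for the writeIdList stored with the stats.

```java
import java.util.Set;

/** Hypothetical model of the cases above; none of these names come from Hive. */
final class ConcurrentStatsSketch {

    /** Stats state stored with a table or partition (assumed shape). */
    static final class StatsState {
        final long txnId;              // transaction that last wrote the stats
        final Set<Long> openWriteIds;  // write IDs that were open in that writer's writeIdList
        final boolean accurate;        // stands in for the COLUMN_STATS_ACCURATE flag

        StatsState(long txnId, Set<Long> openWriteIds, boolean accurate) {
            this.txnId = txnId;
            this.openWriteIds = openWriteIds;
            this.accurate = accurate;
        }
    }

    /** A write raced with the stats writer if the stats writer saw its write ID as open. */
    static boolean isConcurrentWrite(StatsState stats, long writeId) {
        return stats.openWriteIds.contains(writeId);
    }

    /** Writer side: record the new writer and clear the flag when a race is detected. */
    static StatsState applyWrite(StatsState stats, long txnId, long writeId,
                                 Set<Long> writerOpenWriteIds) {
        boolean accurate = stats.accurate && !isConcurrentWrite(stats, writeId);
        return new StatsState(txnId, writerOpenWriteIds, accurate);
    }
}
```

In this model the late reader of case 2 has only the `accurate` flag to go by, which is why the flag has to be cleared by the writer at the moment the race is detected.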
[jira] [Assigned] (HIVE-19963) metadata_only_queries.q fails
[ https://issues.apache.org/jira/browse/HIVE-19963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-19963: - Assignee: Steve Yeom > metadata_only_queries.q fails > - > > Key: HIVE-19963 > URL: https://issues.apache.org/jira/browse/HIVE-19963 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HIVE-19953) query9.q fails
[ https://issues.apache.org/jira/browse/HIVE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom resolved HIVE-19953. --- Resolution: Fixed > query9.q fails > --- > > Key: HIVE-19953 > URL: https://issues.apache.org/jira/browse/HIVE-19953 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19931) stats_nonpart.q test run shows possibly wrong results.
[ https://issues.apache.org/jira/browse/HIVE-19931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518748#comment-16518748 ] Steve Yeom commented on HIVE-19931: --- Hi [~sershe] this patch includes NPE fix, query9 fix, and the last comments from the reviews. Thanks, Steve. > stats_nonpart.q test run shows possibly wrong results. > -- > > Key: HIVE-19931 > URL: https://issues.apache.org/jira/browse/HIVE-19931 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19931.01.master-txnstats.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19931) stats_nonpart.q test run shows possibly wrong results.
[ https://issues.apache.org/jira/browse/HIVE-19931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19931: -- Status: Patch Available (was: Reopened) > stats_nonpart.q test run shows possibly wrong results. > -- > > Key: HIVE-19931 > URL: https://issues.apache.org/jira/browse/HIVE-19931 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19931.01.master-txnstats.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19931) stats_nonpart.q test run shows possibly wrong results.
[ https://issues.apache.org/jira/browse/HIVE-19931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19931: -- Attachment: HIVE-19931.01.master-txnstats.patch > stats_nonpart.q test run shows possibly wrong results. > -- > > Key: HIVE-19931 > URL: https://issues.apache.org/jira/browse/HIVE-19931 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19931.01.master-txnstats.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HIVE-19931) stats_nonpart.q test run shows possibly wrong results.
[ https://issues.apache.org/jira/browse/HIVE-19931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reopened HIVE-19931: --- This is 19931. > stats_nonpart.q test run shows possibly wrong results. > -- > > Key: HIVE-19931 > URL: https://issues.apache.org/jira/browse/HIVE-19931 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19954) TestTxnCommands2#testNonAcidToAcidConversion1 fails
[ https://issues.apache.org/jira/browse/HIVE-19954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518744#comment-16518744 ] Steve Yeom commented on HIVE-19954: --- When a table is converted from non-ACID to ACID, TBLS has no writeIdList, which caused an NPE. I have a patch that adds a null check in ObjectStore.isCurrentStatsValidForTheQuery() and have tested it. It will be merged into the patch for HIVE-19931. > TestTxnCommands2#testNonAcidToAcidConversion1 fails > --- > > Key: HIVE-19954 > URL: https://issues.apache.org/jira/browse/HIVE-19954 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
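The kind of null guard described in the comment above might look like the sketch below. It is not the actual ObjectStore code: `StatsValiditySketch` and `isStatsValid` are hypothetical names, the string comparison is only a placeholder for the real validity logic, and returning false for a missing list is an assumption, since the comment does not say what the patched method returns in that case.

```java
/** Hypothetical stand-in for a stats validity check with a null guard. */
final class StatsValiditySketch {

    /**
     * storedWriteIdList models the value saved with the table; it is null for a
     * table that was just converted from non-ACID to ACID.
     */
    static boolean isStatsValid(String storedWriteIdList, String queryWriteIdList) {
        if (storedWriteIdList == null || queryWriteIdList == null) {
            // Assumption: with nothing to compare, report "not valid" instead of throwing an NPE.
            return false;
        }
        // Placeholder comparison only; the real check is more involved.
        return storedWriteIdList.equals(queryWriteIdList);
    }
}
```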
[jira] [Resolved] (HIVE-19954) TestTxnCommands2#testNonAcidToAcidConversion1 fails
[ https://issues.apache.org/jira/browse/HIVE-19954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom resolved HIVE-19954. --- Resolution: Fixed > TestTxnCommands2#testNonAcidToAcidConversion1 fails > --- > > Key: HIVE-19954 > URL: https://issues.apache.org/jira/browse/HIVE-19954 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-19954) TestTxnCommands2#testNonAcidToAcidConversion1 fails
[ https://issues.apache.org/jira/browse/HIVE-19954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-19954: - Assignee: Steve Yeom > TestTxnCommands2#testNonAcidToAcidConversion1 fails > --- > > Key: HIVE-19954 > URL: https://issues.apache.org/jira/browse/HIVE-19954 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HIVE-19931) stats_nonpart.q test run shows possibly wrong results.
[ https://issues.apache.org/jira/browse/HIVE-19931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom resolved HIVE-19931. --- Resolution: Fixed > stats_nonpart.q test run shows possibly wrong results. > -- > > Key: HIVE-19931 > URL: https://issues.apache.org/jira/browse/HIVE-19931 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19931) stats_nonpart.q test run shows possibly wrong results.
[ https://issues.apache.org/jira/browse/HIVE-19931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518741#comment-16518741 ] Steve Yeom commented on HIVE-19931: --- A one-line error in the code caused the wrong results: +++ standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java @@ -4094,7 +4094,7 @@ public void alterTable(String catName, String dbname, String name, Table newTabl if (newTable.getValidWriteIdList() != null && TxnUtils.isTransactionalTable(newTable)) { // Check concurrent INSERT case and set false to the flag. -if (isCurrentStatsValidForTheQuery(oldt, newt.getTxnId(), newt.getWriteIdList(), +if (!isCurrentStatsValidForTheQuery(oldt, newt.getTxnId(), newt.getWriteIdList(), I will add this fix to the patch for HIVE-19931. > stats_nonpart.q test run shows possibly wrong results. > -- > > Key: HIVE-19931 > URL: https://issues.apache.org/jira/browse/HIVE-19931 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
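The effect of the added negation in the diff above can be illustrated with a small sketch. The class and method names here are hypothetical and the two booleans only model the situation; this is not the alterTable() code itself.

```java
/** Hypothetical illustration of the corrected condition; not taken from ObjectStore. */
final class AlterTableFlagSketch {

    /**
     * Returns the accuracy flag to store after an alter.
     * The flag is cleared only when the current stats are NOT valid for the query,
     * which is the concurrent INSERT case named in the code comment of the diff.
     */
    static boolean nextAccurateFlag(boolean currentFlag, boolean statsValidForQuery) {
        if (!statsValidForQuery) {
            return false;
        }
        return currentFlag;
    }
}
```

Without the negation the branch is taken in the opposite situation, so valid stats lose the flag while stats invalidated by a concurrent INSERT keep it.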
[jira] [Comment Edited] (HIVE-19953) query9.q fails
[ https://issues.apache.org/jira/browse/HIVE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518687#comment-16518687 ] Steve Yeom edited comment on HIVE-19953 at 6/20/18 10:53 PM: - An unneeded TXN_ID in the TAB_COL_STATS table is causing the pre-commit unit test failure of query9.q. This may be representative of a set of test failures in the "master-txnstats" branch base patch. was (Author: steveyeom2017): Unneeded TXN_ID into TAB_COL_STATS table is causing a failure. This may represent a set of test failures of "master-txnstats" branch base patch. > query9.q fails > --- > > Key: HIVE-19953 > URL: https://issues.apache.org/jira/browse/HIVE-19953 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19953) query9.q fails
[ https://issues.apache.org/jira/browse/HIVE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518690#comment-16518690 ] Steve Yeom commented on HIVE-19953: --- A patch for this jira will be merged into the HIVE-19931 patch. > query9.q fails > --- > > Key: HIVE-19953 > URL: https://issues.apache.org/jira/browse/HIVE-19953 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HIVE-19953) query9.q fails
[ https://issues.apache.org/jira/browse/HIVE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-19953 started by Steve Yeom. - > query9.q fails > --- > > Key: HIVE-19953 > URL: https://issues.apache.org/jira/browse/HIVE-19953 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-19953) query9.q fails
[ https://issues.apache.org/jira/browse/HIVE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-19953: - Assignee: Steve Yeom > query9.q fails > --- > > Key: HIVE-19953 > URL: https://issues.apache.org/jira/browse/HIVE-19953 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19953) query9.q fails
[ https://issues.apache.org/jira/browse/HIVE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518687#comment-16518687 ] Steve Yeom commented on HIVE-19953: --- An unneeded TXN_ID in the TAB_COL_STATS table is causing a failure. This may be representative of a set of test failures in the "master-txnstats" branch base patch. > query9.q fails > --- > > Key: HIVE-19953 > URL: https://issues.apache.org/jira/browse/HIVE-19953 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HIVE-19931) stats_nonpart.q test run shows possibly wrong results.
[ https://issues.apache.org/jira/browse/HIVE-19931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-19931 started by Steve Yeom. - > stats_nonpart.q test run shows possibly wrong results. > -- > > Key: HIVE-19931 > URL: https://issues.apache.org/jira/browse/HIVE-19931 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19532) fix tests for master-txnstats branch
[ https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518479#comment-16518479 ] Steve Yeom commented on HIVE-19532: --- Regarding the test failures on patch 04: 1. org.apache.hadoop.hive.cli: 198 failures. 1.1 Golden file differences from enabling stats-using queries on ACID tables with accurate table stats: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_nullscan] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] 1.2 Apparently unrelated to the patch: org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query9] > fix tests for master-txnstats branch > > > Key: HIVE-19532 > URL: https://issues.apache.org/jira/browse/HIVE-19532 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, > HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, > HIVE-19532.04.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19532) fix tests for master-txnstats branch
[ https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516503#comment-16516503 ] Steve Yeom edited comment on HIVE-19532 at 6/19/18 1:03 AM: 1. The pre-commit test results for patch 04 are not yet available, so we have to wait. 2. I checked several tests that failed for patch 03. All of the sampled tests whose failure was not an explain.out difference ran successfully in my environment. The unwanted MetastoreConf.java changes for CachedStore testing may also have caused metadata-related tests, and possibly others, to fail. The following tests ran cleanly in my environment: mvn test -Dtest=TestSparkPerfCliDriver -Dqfile=query18.q mvn test -Dtest=TestReplicationScenarios#testConcatenatePartitionedTable mvn test -Dtest=TestTezPerfCliDriver -Dqfile=query14.q mvn test -Dtest=TestUpdateDeleteSemanticAnalyzer#testInsertValuesPartitioned was (Author: steveyeom2017): 1. pre-commit tests run results for patch 04 is not yet available. We have to wait. 2. I have checked several tests that failed for patch 03 but most (all out of sampled tests) of the sampled tests of non-explain.out-difference ran successfully in my environment. So maybe the unwanted MetastoreConf.java changes for CachedStore testing may be related. > fix tests for master-txnstats branch > > > Key: HIVE-19532 > URL: https://issues.apache.org/jira/browse/HIVE-19532 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, > HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, > HIVE-19532.04.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19532) fix tests for master-txnstats branch
[ https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516503#comment-16516503 ] Steve Yeom commented on HIVE-19532: --- 1. The pre-commit test results for patch 04 are not yet available, so we have to wait. 2. I checked several tests that failed for patch 03. All of the sampled tests whose failure was not an explain.out difference ran successfully in my environment, so the unwanted MetastoreConf.java changes for CachedStore testing may be related. > fix tests for master-txnstats branch > > > Key: HIVE-19532 > URL: https://issues.apache.org/jira/browse/HIVE-19532 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, > HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, > HIVE-19532.04.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19916) master-txnstats branch - integrate with HIVE-19382
[ https://issues.apache.org/jira/browse/HIVE-19916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516355#comment-16516355 ] Steve Yeom commented on HIVE-19916: --- Got your point. If I remember correctly, the materialized view code retrieves the write ID list during compilation, and I did the same for transactional stats. Let me check the details. > master-txnstats branch - integrate with HIVE-19382 > -- > > Key: HIVE-19916 > URL: https://issues.apache.org/jira/browse/HIVE-19916 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Steve Yeom >Priority: Major > > There's some code in original txn stats patch that may go to metastore to get > write Ids. This code should not go to metastore, it should fail instead. > HIVE-19382 should ensure that we have correct IDs already present during > optimizer - they are using by e.g. materialized view optimizer, so they > should be there; if they are not present, some integration might be needed so > that txn stats optimizations also have access to those write Ids. > cc [~jcamachorodriguez] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-19931) stats_nonpart.q test run shows possibly wrong results.
[ https://issues.apache.org/jira/browse/HIVE-19931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-19931: - Assignee: Steve Yeom > stats_nonpart.q test run shows possibly wrong results. > -- > > Key: HIVE-19931 > URL: https://issues.apache.org/jira/browse/HIVE-19931 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19865) Full ACID table stats has wrong rawDataSize
[ https://issues.apache.org/jira/browse/HIVE-19865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516057#comment-16516057 ] Steve Yeom commented on HIVE-19865: --- Not yet fixed. I have added "stats_sizebug.q" to address this issue. > Full ACID table stats has wrong rawDataSize > --- > > Key: HIVE-19865 > URL: https://issues.apache.org/jira/browse/HIVE-19865 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19867) Test and verify Concurrent INSERTS
[ https://issues.apache.org/jira/browse/HIVE-19867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516053#comment-16516053 ] Steve Yeom commented on HIVE-19867: --- I found that patch 03 does not address concurrent writes to a table. If the writeId for transactional stats is saved in TBLS/PARTITIONS, it can be used to detect concurrent writes by comparing a write with the writeIdList, simply by checking whether the write is in the list. > Test and verify Concurrent INSERTS > > > Key: HIVE-19867 > URL: https://issues.apache.org/jira/browse/HIVE-19867 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19532) fix tests for master-txnstats branch
[ https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516028#comment-16516028 ] Steve Yeom commented on HIVE-19532: --- Hi [~sershe], I think patch 04 can serve as the base of a new project branch if we need one. I definitely need to check the results of the pre-commit test suite run. > fix tests for master-txnstats branch > > > Key: HIVE-19532 > URL: https://issues.apache.org/jira/browse/HIVE-19532 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, > HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, > HIVE-19532.04.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19532) fix tests for master-txnstats branch
[ https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19532: -- Attachment: HIVE-19532.04.patch > fix tests for master-txnstats branch > > > Key: HIVE-19532 > URL: https://issues.apache.org/jira/browse/HIVE-19532 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, > HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, > HIVE-19532.04.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19903) Disable temporary insert-only transactional table
[ https://issues.apache.org/jira/browse/HIVE-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515022#comment-16515022 ] Steve Yeom edited comment on HIVE-19903 at 6/17/18 9:03 AM: Per my conversation with Jason: Hi [~jdere], can you review patch 03? It covers the QA bug test case, and the test runs came back clean for that case and for the following test cases: TestJdbcWithMiniLlapArrow, TestJdbcWithMiniLlapRow. Thanks, Steve. was (Author: steveyeom2017): Hi [~EricWohlstadter] can you review the patch 03? This will disable temporary table MM related properties. Thanks, Steve. > Disable temporary insert-only transactional table > - > > Key: HIVE-19903 > URL: https://issues.apache.org/jira/browse/HIVE-19903 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19903.01.patch, HIVE-19903.02.patch, > HIVE-19903.03.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19903) Disable temporary insert-only transactional table
[ https://issues.apache.org/jira/browse/HIVE-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19903: -- Attachment: HIVE-19903.03.patch > Disable temporary insert-only transactional table > - > > Key: HIVE-19903 > URL: https://issues.apache.org/jira/browse/HIVE-19903 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19903.01.patch, HIVE-19903.02.patch, > HIVE-19903.03.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19903) Disable temporary insert-only transactional table
[ https://issues.apache.org/jira/browse/HIVE-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19903: -- Attachment: (was: HIVE-19903.03.patch) > Disable temporary insert-only transactional table > - > > Key: HIVE-19903 > URL: https://issues.apache.org/jira/browse/HIVE-19903 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19903.01.patch, HIVE-19903.02.patch, > HIVE-19903.03.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19903) Disable temporary insert-only transactional table
[ https://issues.apache.org/jira/browse/HIVE-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19903: -- Attachment: (was: HIVE-19903.03.patch) > Disable temporary insert-only transactional table > - > > Key: HIVE-19903 > URL: https://issues.apache.org/jira/browse/HIVE-19903 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19903.01.patch, HIVE-19903.02.patch, > HIVE-19903.03.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19903) Disable temporary insert-only transactional table
[ https://issues.apache.org/jira/browse/HIVE-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19903: -- Attachment: HIVE-19903.03.patch > Disable temporary insert-only transactional table > - > > Key: HIVE-19903 > URL: https://issues.apache.org/jira/browse/HIVE-19903 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19903.01.patch, HIVE-19903.02.patch, > HIVE-19903.03.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19903) Disable temporary insert-only transactional table
[ https://issues.apache.org/jira/browse/HIVE-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515022#comment-16515022 ] Steve Yeom commented on HIVE-19903: --- Hi [~EricWohlstadter] can you review the patch 03? This will disable temporary table MM related properties. Thanks, Steve. > Disable temporary insert-only transactional table > - > > Key: HIVE-19903 > URL: https://issues.apache.org/jira/browse/HIVE-19903 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19903.01.patch, HIVE-19903.02.patch, > HIVE-19903.03.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19903) Disable temporary insert-only transactional table
[ https://issues.apache.org/jira/browse/HIVE-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19903: -- Attachment: HIVE-19903.03.patch > Disable temporary insert-only transactional table > - > > Key: HIVE-19903 > URL: https://issues.apache.org/jira/browse/HIVE-19903 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19903.01.patch, HIVE-19903.02.patch, > HIVE-19903.03.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19903) Disable temporary insert-only transactional table
[ https://issues.apache.org/jira/browse/HIVE-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514925#comment-16514925 ] Steve Yeom commented on HIVE-19903: --- Hi [~jdere], please look at patch 02. I tested the CREATE TABLE LIKE case; it is already fine, so we do not need a test for it. I also ran the two tests mentioned by Eric without failures: 23:05:10.393 [summary] :TestJdbcWithMiniLlapArrow.testComplexQuery 23:05:10.393 [summary] :TestJdbcWithMiniLlapRow.testComplexQuery > Disable temporary insert-only transactional table > - > > Key: HIVE-19903 > URL: https://issues.apache.org/jira/browse/HIVE-19903 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19903.01.patch, HIVE-19903.02.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19903) Disable temporary insert-only transactional table
[ https://issues.apache.org/jira/browse/HIVE-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19903: -- Attachment: HIVE-19903.02.patch > Disable temporary insert-only transactional table > - > > Key: HIVE-19903 > URL: https://issues.apache.org/jira/browse/HIVE-19903 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19903.01.patch, HIVE-19903.02.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19916) master-txnstats branch - integrate with HIVE-19382
[ https://issues.apache.org/jira/browse/HIVE-19916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514679#comment-16514679 ] Steve Yeom edited comment on HIVE-19916 at 6/16/18 7:43 AM: I think I did not explain this clearly. Sorry, and thank you for your great help, Sergey. 1. The write id lists for the query are generated and saved into "conf" at acquireLocks(), right at the start of execution. 2. Sergey may be right in saying that the current patch may access the Metastore. I will double check. 3. For stats updaters, the current (original) patch gets a writeId list from "conf". The reason I take the "conf" version of the writeId list for a write entity table is that at acquireLocks() we get a new write id (after +1) for a write entity table of the query. 4. But in the case of StatsOptimizer, I need to get a writeId list for a read entity table and use it, and "conf" may not have a writeIdList for the read entity table. So I think I have obtained a writeId list for a read entity table based on the global transaction database snapshot that is already taken at the start of query plan optimization. I think this can be the final version for the transaction, since I think we can determine the commit level of a table write with this version. I.e., with this version, we can distinguish an open->committed write instance, an open->aborted instance, and a new open write instance when the equivalence method for the patch is invoked. But I will double check. P.S.: I thought the current patch of HIVE-19382 is for checking and regenerating the write id lists of write entity tables, I guess for UPDATE/DELETE conflict resolution at TxnHandler.commitTxns(). Again, that is only for write id lists of write entity tables, so it does not affect read entity tables. Hey [~jcamachorodriguez], please let me know if my understanding of the current temporary patch is not correct. I may add more after checking. was (Author: steveyeom2017): I think I did not explain this clearly.
Sorry, and thank you for your great help, Sergey. 1. The write id lists for the query are generated and saved into "conf" at acquireLocks(), right at the start of execution. 2. Sergey may be right in saying that the current patch may access the Metastore. I will double check. 3. For stats updaters, the current (original) patch gets a writeId list from "conf". The reason I take the "conf" version of the writeId list for a write entity table is that at acquireLocks() we get a new write id (after +1) for a write entity table of the query. 4. But in the case of StatsOptimizer, I need to get a writeId list for a read entity table and use it, and "conf" may not have a writeIdList for the read entity table. So I think I have obtained a writeId list for a read entity table based on the global transaction database snapshot that is already taken at the start of query plan optimization. I think this can be the final version for the transaction, since I think we can determine the commit level of a table write with this version. I.e., with this version, we can distinguish an open->committed write instance, an open->aborted instance, and a new open write instance when the equivalence method for the patch is invoked. But I will double check. P.S.: I thought the current patch of HIVE-19382 is for checking and regenerating the write id lists of write entity tables, I guess for UPDATE/DELETE conflict resolution at TxnHandler.commitTxns(). Again, that is only for write id lists of write entity tables, so it does not affect read entity tables. I may add more after checking. > master-txnstats branch - integrate with HIVE-19382 > -- > > Key: HIVE-19916 > URL: https://issues.apache.org/jira/browse/HIVE-19916 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Steve Yeom >Priority: Major > > There's some code in original txn stats patch that may go to metastore to get > write Ids. This code should not go to metastore, it should fail instead.
> HIVE-19382 should ensure that we have correct IDs already present during > optimizer - they are used by e.g. the materialized view optimizer, so they > should be there; if they are not present, some integration might be needed so > that txn stats optimizations also have access to those write Ids. > cc [~jcamachorodriguez] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19916) master-txnstats branch - integrate with HIVE-19382
[ https://issues.apache.org/jira/browse/HIVE-19916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514679#comment-16514679 ] Steve Yeom edited comment on HIVE-19916 at 6/16/18 7:35 AM: I think I did not explain this clearly. Sorry, and thank you for your great help, Sergey. 1. The write id lists for the query are generated and saved into "conf" at acquireLocks(), right at the start of execution. 2. Sergey may be right in saying that the current patch may access the Metastore. I will double check. 3. For stats updaters, the current (original) patch gets a writeId list from "conf". The reason I take the "conf" version of the writeId list for a write entity table is that at acquireLocks() we get a new write id (after +1) for a write entity table of the query. 4. But in the case of StatsOptimizer, I need to get a writeId list for a read entity table and use it, and "conf" may not have a writeIdList for the read entity table. So I think I have obtained a writeId list for a read entity table based on the global transaction database snapshot that is already taken at the start of query plan optimization. I think this can be the final version for the transaction, since I think we can determine the commit level of a table write with this version. I.e., with this version, we can distinguish an open->committed write instance, an open->aborted instance, and a new open write instance when the equivalence method for the patch is invoked. But I will double check. P.S.: I thought the current patch of HIVE-19382 is for checking and regenerating the write id lists of write entity tables, I guess for UPDATE/DELETE conflict resolution at TxnHandler.commitTxns(). Again, that is only for write id lists of write entity tables, so it does not affect read entity tables. I may add more after checking. was (Author: steveyeom2017): I think I did not explain this clearly. Sorry, Sergey. The current (original) patch gets the "conf" version of the writeId list.
In other words, it uses the global transaction snapshot already taken before query optimization starts and gets the writeIdList based on that snapshot, as is done for MV and cached result invalidation. The same way! If needed, I can provide more details. Thank you, Sergey, for your great help! > master-txnstats branch - integrate with HIVE-19382 > -- > > Key: HIVE-19916 > URL: https://issues.apache.org/jira/browse/HIVE-19916 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Steve Yeom >Priority: Major > > There's some code in original txn stats patch that may go to metastore to get > write Ids. This code should not go to metastore, it should fail instead. > HIVE-19382 should ensure that we have correct IDs already present during > optimizer - they are used by e.g. the materialized view optimizer, so they > should be there; if they are not present, some integration might be needed so > that txn stats optimizations also have access to those write Ids. > cc [~jcamachorodriguez] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19916) master-txnstats branch - integrate with HIVE-19382
[ https://issues.apache.org/jira/browse/HIVE-19916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514679#comment-16514679 ] Steve Yeom commented on HIVE-19916: --- I think I did not explain this clearly. Sorry, Sergey. The current (original) patch gets the "conf" version of the writeId list. In other words, it uses the global transaction snapshot already taken before query optimization starts and gets the writeIdList based on that snapshot, as is done for MV and cached result invalidation. The same way! If needed, I can provide more details. Thank you, Sergey, for your great help! > master-txnstats branch - integrate with HIVE-19382 > -- > > Key: HIVE-19916 > URL: https://issues.apache.org/jira/browse/HIVE-19916 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Steve Yeom >Priority: Major > > There's some code in original txn stats patch that may go to metastore to get > write Ids. This code should not go to metastore, it should fail instead. > HIVE-19382 should ensure that we have correct IDs already present during > optimizer - they are used by e.g. the materialized view optimizer, so they > should be there; if they are not present, some integration might be needed so > that txn stats optimizations also have access to those write Ids. > cc [~jcamachorodriguez] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
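The idea in the HIVE-19916 comments above, deriving a per-table write id list from an already-taken global transaction snapshot and then telling apart open->committed, open->aborted, and newly opened writes, can be illustrated with a toy model. This is only a sketch under stated assumptions and is not Hive code: `WriteIdSnapshot`, `snapshot_for_table`, and `changes_between` are hypothetical names, and the dictionaries stand in for whatever the transaction snapshot and write id allocations look like in the real patch.

```python
from dataclasses import dataclass
from enum import Enum


class TxnState(Enum):
    OPEN = "open"
    COMMITTED = "committed"
    ABORTED = "aborted"


@dataclass(frozen=True)
class WriteIdSnapshot:
    """Toy per-table write id list (hypothetical; not Hive's own class)."""
    table: str
    high_watermark: int       # highest write id allocated when the snapshot was taken
    open_ids: frozenset       # write ids whose transactions were still open
    aborted_ids: frozenset    # write ids whose transactions were aborted

    def state_of(self, write_id):
        """Return the state of a write id, or None if it was not yet allocated."""
        if write_id > self.high_watermark:
            return None
        if write_id in self.open_ids:
            return TxnState.OPEN
        if write_id in self.aborted_ids:
            return TxnState.ABORTED
        return TxnState.COMMITTED


def snapshot_for_table(table, txn_states, allocations):
    """Build a write id list for one table from a global transaction snapshot.

    txn_states:  {txn_id: TxnState}              (the global snapshot; assumed shape)
    allocations: {table: {txn_id: write_id}}     (assumed shape)
    No lookup outside these in-memory inputs is performed.
    """
    open_ids, aborted_ids, high_watermark = set(), set(), 0
    for txn_id, write_id in allocations.get(table, {}).items():
        high_watermark = max(high_watermark, write_id)
        state = txn_states[txn_id]
        if state is TxnState.OPEN:
            open_ids.add(write_id)
        elif state is TxnState.ABORTED:
            aborted_ids.add(write_id)
    return WriteIdSnapshot(table, high_watermark,
                           frozenset(open_ids), frozenset(aborted_ids))


def changes_between(old, new):
    """Map each write id whose state differs to its (old_state, new_state) pair."""
    top = max(old.high_watermark, new.high_watermark)
    return {
        w: (old.state_of(w), new.state_of(w))
        for w in range(1, top + 1)
        if old.state_of(w) != new.state_of(w)
    }
```

Comparing two such snapshots of the same table yields exactly the three cases named in the comment: a write id that moved from open to committed, one that moved from open to aborted, and one that appears only in the newer snapshot as a new open write.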
[jira] [Commented] (HIVE-19903) Disable temporary insert-only transactional table
[ https://issues.apache.org/jira/browse/HIVE-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514547#comment-16514547 ] Steve Yeom commented on HIVE-19903: --- Hi [~jdere], can you look at the patch? Thanks, Steve. > Disable temporary insert-only transactional table > - > > Key: HIVE-19903 > URL: https://issues.apache.org/jira/browse/HIVE-19903 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19903.01.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)