[jira] [Commented] (HIVE-21877) Change HCatTableInfo to not be transient in PartInfo
    [ https://issues.apache.org/jira/browse/HIVE-21877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864458#comment-16864458 ]

Mithun Radhakrishnan commented on HIVE-21877:
---------------------------------------------

No worries, mate. Cheers.

> Change HCatTableInfo to not be transient in PartInfo
> ----------------------------------------------------
>
>                 Key: HIVE-21877
>                 URL: https://issues.apache.org/jira/browse/HIVE-21877
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Ankit Jhalaria
>            Assignee: Ankit Jhalaria
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since HCatTableInfo is serializable, removing the transient annotation from it. We were running into NPE during serialization while using HCatalogIO with Beam.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HIVE-21877) Change HCatTableInfo to not be transient in PartInfo
    [ https://issues.apache.org/jira/browse/HIVE-21877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864433#comment-16864433 ]

Ankit Jhalaria commented on HIVE-21877:
---------------------------------------

Thanks [~mithun] for your explanation. I will close my PR.
[jira] [Commented] (HIVE-21877) Change HCatTableInfo to not be transient in PartInfo
    [ https://issues.apache.org/jira/browse/HIVE-21877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864420#comment-16864420 ]

Mithun Radhakrishnan commented on HIVE-21877:
---------------------------------------------

Pasting your question from the PR here:
{quote}
While using Hcatalog with Apache Beam, we ran into an issue with HCatTableInfo being null during serialization. I don't see a reason why it should be transient. However, there might be use-cases that I may not be aware of and might require it to be transient. Would love to hear some feedback regardless.
{quote}
This has to do with HIVE-9845. It would not be a good idea to make {{HCatTableInfo}} non-transient. Doing so would make Pig/{{HCatLoader}}, as well as {{HCatInputFormat}}, inefficient for large partition sets.

{{HCatTableInfo}} contains table information that is static for all partitions within a partition set for a given table. {{PartInfo}} is the variable part. Serializing {{HCatTableInfo}} once per partition in a partition set increases the split meta-info for a Hadoop job to unreasonable lengths.

I would advise perusing the HCat code to see how {{HCatTableInfo}} is restored after deserialization.
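
For readers following the thread, the pattern described in the comment above (keep the static table metadata transient in each partition record, write it once in an enclosing object, and re-attach it on read) can be sketched in plain Java. This is only an illustration under stated assumptions: {{TableInfo}}, {{PartitionInfo}}, {{JobInfo}} and {{roundTrip}} are hypothetical stand-ins and are not the HCatalog classes or their actual restore logic. It also shows one plausible way a null could appear: a partition record serialized on its own, outside the enclosing object, comes back without its table metadata.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only, NOT the real HCatalog classes.
public class TransientRestoreSketch {

    // Table-level metadata: identical for every partition of the table.
    static class TableInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final String tableName;
        TableInfo(String tableName) { this.tableName = tableName; }
    }

    // Per-partition metadata: the part that varies.
    static class PartitionInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final String location;
        // transient: skipped by Java serialization, so it is null after a
        // round trip unless something re-attaches it.
        private transient TableInfo tableInfo;

        PartitionInfo(String location, TableInfo tableInfo) {
            this.location = location;
            this.tableInfo = tableInfo;
        }
        TableInfo getTableInfo() { return tableInfo; }
        void setTableInfo(TableInfo tableInfo) { this.tableInfo = tableInfo; }
    }

    // Enclosing object: writes the table metadata once for all partitions.
    static class JobInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final TableInfo tableInfo;
        final List<PartitionInfo> partitions = new ArrayList<>();

        JobInfo(TableInfo tableInfo) { this.tableInfo = tableInfo; }

        void addPartition(String location) {
            partitions.add(new PartitionInfo(location, tableInfo));
        }

        // Called by Java serialization on read: restore the shared reference.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            for (PartitionInfo p : partitions) {
                p.setTableInfo(tableInfo);
            }
        }
    }

    // Serializes a value to memory and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        JobInfo job = new JobInfo(new TableInfo("db.events"));
        job.addPartition("/warehouse/events/ds=2019-06-01");
        job.addPartition("/warehouse/events/ds=2019-06-02");

        // Serialized on its own, the partition loses its table metadata.
        PartitionInfo alone = roundTrip(job.partitions.get(0));
        System.out.println("alone, tableInfo is null: " + (alone.getTableInfo() == null));

        // Serialized through the enclosing object, the reference is restored.
        JobInfo restored = roundTrip(job);
        System.out.println("via JobInfo, table: "
                + restored.partitions.get(0).getTableInfo().tableName);
    }
}
```

The trade-off in the sketch matches the one described in the comment: the table metadata is written once no matter how many partitions there are, at the cost that any code serializing a partition record by itself has to re-attach the table metadata afterwards.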