[
https://issues.apache.org/jira/browse/SPARK-44058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aman Raj updated SPARK-44058:
-----------------------------
Description:
Spark's HiveShim.scala invokes the following Hive method via reflection:
    createPartitionMethod.invoke(
      hive,
      table,
      spec,
      location,
      params, // partParams
      null, // inputFormat
      null, // outputFormat
      -1: JInteger, // numBuckets
      null, // cols
      null, // serializationLib
      null, // serdeParams
      null, // bucketCols
      null) // sortCols
  }
Hive no longer has a createPartition overload with this signature. The only
definition it has is:
  public Partition createPartition(Table tbl, Map<String, String> partSpec)
      throws HiveException {
    try {
      org.apache.hadoop.hive.metastore.api.Partition part =
          Partition.createMetaPartitionObject(tbl, partSpec, null);
      AcidUtils.TableSnapshot tableSnapshot =
          AcidUtils.getTableSnapshot(conf, tbl);
      part.setWriteId(tableSnapshot != null ? tableSnapshot.getWriteId() : 0);
      return new Partition(tbl, getMSC().add_partition(part));
    } catch (Exception e) {
      LOG.error(StringUtils.stringifyException(e));
      throw new HiveException(e);
    }
  }
*The 12-parameter implementation was removed in HIVE-5951.*
The 12-parameter createPartition overload was added in Hive 0.12 and removed
in Hive 0.13. The SPARK-15334 commit added the call to this overload while
Spark was using Hive 0.12. After Hive moved to the newer API, the call in
Spark OSS was never updated, which looks to us like a bug on the Spark side.
We need to migrate to the current Hive createPartition method; otherwise this
code path can break.
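To illustrate how a reflective call like the one above can break, here is a minimal sketch, not the actual Spark or Hive code. It assumes a hypothetical stand-in class `FakeHive` that, like the Hive definition quoted above, only offers the two-argument createPartition, and a hypothetical helper `hasMethod`. The twelve `Object` parameter types are placeholders, not the real Hive signature.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Map;

public class CreatePartitionLookupSketch {

    // Hypothetical stand-in for a Hive client that only has the
    // two-argument createPartition overload.
    public static class FakeHive {
        public String createPartition(String table, Map<String, String> partSpec) {
            return table + partSpec;
        }
    }

    // Returns true if cls has a public method with exactly this name
    // and parameter list, false if the lookup throws NoSuchMethodException.
    public static boolean hasMethod(Class<?> cls, String name, Class<?>... paramTypes) {
        try {
            Method m = cls.getMethod(name, paramTypes);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Twelve placeholder parameter types standing in for the removed overload.
        Class<?>[] twelveParams = new Class<?>[12];
        Arrays.fill(twelveParams, Object.class);

        System.out.println("2-parameter overload found: "
            + hasMethod(FakeHive.class, "createPartition", String.class, Map.class));
        System.out.println("12-parameter overload found: "
            + hasMethod(FakeHive.class, "createPartition", twelveParams));
    }
}
```

A lookup by signature succeeds only if that exact overload exists on the class that is loaded at runtime, so a shim can probe for the signature it needs before calling invoke rather than assuming it is present.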
> Remove deprecated API usage in HiveShim.scala
> ---------------------------------------------
>
> Key: SPARK-44058
> URL: https://issues.apache.org/jira/browse/SPARK-44058
> Project: Spark
> Issue Type: Bug
> Components: Spark Submit
> Affects Versions: 3.4.0
> Reporter: Aman Raj
> Priority: Major