armitage420 commented on code in PR #5539: URL: https://github.com/apache/hive/pull/5539#discussion_r1915361539
##########
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:
##########
@@ -4466,6 +4526,71 @@ public List<Partition> getPartitionsByFilter(Table tbl, String filter)
     return convertFromMetastore(tbl, tParts);
   }
 
+  public List<Partition> getPartitionsWithSpecs(Table tbl, GetPartitionsRequest request)
+      throws HiveException, TException {
+
+    if (!tbl.isPartitioned()) {
+      throw new HiveException(ErrorMsg.TABLE_NOT_PARTITIONED, tbl.getTableName());
+    }
+    int batchSize= MetastoreConf.getIntVar(Hive.get().getConf(), MetastoreConf.ConfVars.BATCH_RETRIEVE_MAX);
+    if(batchSize > 0){
+      return new ArrayList<>(getAllPartitionsWithSpecsInBatches(tbl, batchSize, DEFAULT_BATCH_DECAYING_FACTOR, MetastoreConf.getIntVar(
+          Hive.get().getConf(), MetastoreConf.ConfVars.GETPARTITIONS_BATCH_MAX_RETRIES), request));
+    }else{
+      return getPartitionsWithSpecsInternal(tbl, request);
+    }
+  }
+
+  public List<Partition> getPartitionsWithSpecsInternal(Table tbl, GetPartitionsRequest request)
+      throws HiveException, TException {
+
+    if (!tbl.isPartitioned()) {
+      throw new HiveException(ErrorMsg.TABLE_NOT_PARTITIONED, tbl.getTableName());
+    }
+    GetPartitionsResponse response = getMSC().getPartitionsWithSpecs(request);
+    List<org.apache.hadoop.hive.metastore.api.PartitionSpec> partitionSpecs = response.getPartitionSpec();
+    List<Partition> partitions = new ArrayList<>();
+    partitions.addAll(convertFromPartSpec(partitionSpecs.iterator(), tbl));
+
+    return partitions;
+  }
+
+  List<Partition> getPartitionsWithSpecsByNames(Table tbl, List<String> partNames, GetPartitionsRequest request)

Review Comment:
This case is meant for tables with a huge number of partitions, where our Thrift transport would hit the 2 GB data threshold. We might need to change the value of METASTORE_BATCH_RETRIEVE_MAX in order to benefit from this method. We will have to pick a fairly approximate value for it, though, because I have not been able to come up with a way to calculate the required maximum batch size at runtime.
The data size is also going to be quite dynamic now, because the projections requested for the partitions are themselves dynamic.
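To make the trade-off above concrete, here is a minimal, self-contained sketch of the two ideas in play: picking an approximate batch size from a response-size budget, and shrinking the batch when a fetch fails. It is not the code from this PR. The class `DecayingBatchFetch`, both method names, the per-partition size estimate, and the `fetcher` callback are all hypothetical stand-ins, and the retry semantics (a total failure budget, divide the batch size by the decaying factor on each failure) are an assumption about how such a loop could behave.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; none of these names exist in Hive.
final class DecayingBatchFetch {

  // Rough starting point: how many partitions fit into a response budget,
  // given a guessed average serialized size per partition. Both inputs are
  // estimates supplied by the caller, so the result is only approximate.
  static int estimateBatchSize(long responseBudgetBytes, long estimatedBytesPerPartition, int maxBatchSize) {
    if (responseBudgetBytes <= 0 || estimatedBytesPerPartition <= 0 || maxBatchSize <= 0) {
      throw new IllegalArgumentException("all arguments must be positive");
    }
    long fit = responseBudgetBytes / estimatedBytesPerPartition;
    return (int) Math.max(1, Math.min(fit, maxBatchSize));
  }

  // Fetches items for the given names in batches. When a batch fails, the
  // batch size is divided by decayingFactor and the same offset is retried.
  // Gives up once more than maxRetries failures have occurred, or when a
  // batch of size 1 still fails.
  static <T> List<T> fetchInBatches(List<String> names, int initialBatchSize, int decayingFactor,
      int maxRetries, Function<List<String>, List<T>> fetcher) {
    if (initialBatchSize <= 0 || decayingFactor < 2 || maxRetries < 0) {
      throw new IllegalArgumentException("invalid batch parameters");
    }
    List<T> result = new ArrayList<>();
    int batchSize = initialBatchSize;
    int failures = 0;
    int offset = 0;
    while (offset < names.size()) {
      int end = Math.min(offset + batchSize, names.size());
      try {
        result.addAll(fetcher.apply(names.subList(offset, end)));
        offset = end; // advance only after a successful batch
      } catch (RuntimeException e) {
        failures++;
        if (failures > maxRetries || batchSize == 1) {
          throw e; // retry budget exhausted or cannot shrink further
        }
        batchSize = Math.max(1, batchSize / decayingFactor);
      }
    }
    return result;
  }
}
```

With dynamic projections the per-partition size estimate will drift, which is why the decaying retry matters more than the initial guess: a configured value that is too large for a wide projection is corrected on failure, at the cost of a few wasted calls.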