[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4107: [CARBONDATA-4149] Query with SI after add partition based on location on partition table gives incorrect results

2021-03-19 Thread GitBox


ShreelekhyaG commented on a change in pull request #4107:
URL: https://github.com/apache/carbondata/pull/4107#discussion_r597429391



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonTableCompactor.scala
##
@@ -276,7 +294,40 @@ class CarbonTableCompactor(
   segmentMetaDataAccumulator)
   } else {
 if (mergeRDD != null) {
-  mergeRDD.collect
+  val result = mergeRDD.collect
+  if (!updatePartitionSpecs.isEmpty) {
+val tableIdentifier = new TableIdentifier(carbonTable.getTableName,
+  Some(carbonTable.getDatabaseName))
+// To update partitionSpec in hive metastore, drop and add with 
latest path.
+val oldPartitions: util.List[TablePartitionSpec] =
+  new util.ArrayList[TablePartitionSpec]()
+val newPartitions: util.List[TablePartitionSpec] =
+  new util.ArrayList[TablePartitionSpec]()
+updatePartitionSpecs.asScala.foreach {
+  partitionSpec =>
+var spec = PartitioningUtils.parsePathFragment(
+  String.join(CarbonCommonConstants.FILE_SEPARATOR, 
partitionSpec.getPartitions))
+oldPartitions.add(spec)
+val addPartition = 
mergeRDD.checkAndUpdatePartitionLocation(partitionSpec)
+spec = PartitioningUtils.parsePathFragment(

Review comment:
   Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4107: [CARBONDATA-4149] Query with SI after add partition based on location on partition table gives incorrect results

2021-03-18 Thread GitBox


ShreelekhyaG commented on a change in pull request #4107:
URL: https://github.com/apache/carbondata/pull/4107#discussion_r597398064



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonTableCompactor.scala
##
@@ -276,7 +290,25 @@ class CarbonTableCompactor(
   segmentMetaDataAccumulator)
   } else {
 if (mergeRDD != null) {
-  mergeRDD.collect
+  val result = mergeRDD.collect

Review comment:
   Made changes with old and new partitions list in order to replace 
partitionSpec in `compactedPartitions`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4107: [CARBONDATA-4149] Query with SI after add partition based on location on partition table gives incorrect results

2021-03-18 Thread GitBox


ShreelekhyaG commented on a change in pull request #4107:
URL: https://github.com/apache/carbondata/pull/4107#discussion_r597396979



##
File path: 
index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/secondaryindex/TestSIWithPartition.scala
##
@@ -380,6 +384,37 @@ class TestSIWithPartition extends QueryTest with 
BeforeAndAfterAll {
 sql("drop table if exists partition_table")
   }
 
+  test("test si with add partition based on location on partition table") {
+sql("drop table if exists partition_table")
+sql("create table partition_table (id int,name String) " +
+"partitioned by(email string) stored as carbondata")
+sql("insert into partition_table select 1,'blue','abc'")
+sql("CREATE INDEX partitionTable_si  on table partition_table (name) as 
'carbondata'")

Review comment:
   Ok done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4107: [CARBONDATA-4149] Query with SI after add partition based on location on partition table gives incorrect results

2021-03-18 Thread GitBox


ShreelekhyaG commented on a change in pull request #4107:
URL: https://github.com/apache/carbondata/pull/4107#discussion_r597396875



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala
##
@@ -98,6 +98,24 @@ class CarbonMergerRDD[K, V](
 broadCastSplits = sparkContext.broadcast(new 
CarbonInputSplitWrapper(splits))
   }
 
+  // checks for added partition specs with external path.
+  // after compaction, location path to be updated with table path.
+  def checkAndUpdatePartitionLocation(partitionSpec: PartitionSpec) : 
PartitionSpec = {
+if (partitionSpec != null) {
+  carbonLoadModel.getLoadMetadataDetails.asScala.foreach(loadMetaDetail => 
{
+if (loadMetaDetail.getPath != null &&
+
loadMetaDetail.getPath.split(",").contains(partitionSpec.getLocation.toString)) 
{
+  val updatedPartitionLocation = CarbonDataProcessorUtil

Review comment:
   Done

##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonTableCompactor.scala
##
@@ -263,6 +267,16 @@ class CarbonTableCompactor(
   if (partitionSpecs != null && partitionSpecs.nonEmpty) {
 compactionCallableModel.compactedPartitions = Some(partitionSpecs)
   }
+  partitionSpecs.foreach(partitionSpec => {
+carbonLoadModel.getLoadMetadataDetails.asScala.foreach(loadMetaDetail 
=> {
+  if (loadMetaDetail.getPath != null &&
+  
loadMetaDetail.getPath.split(",").contains(partitionSpec.getLocation.toString)) 
{

Review comment:
   Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4107: [CARBONDATA-4149] Query with SI after add partition based on location on partition table gives incorrect results

2021-03-18 Thread GitBox


ShreelekhyaG commented on a change in pull request #4107:
URL: https://github.com/apache/carbondata/pull/4107#discussion_r596995330



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/partition/CarbonAlterTableAddHivePartitionCommand.scala
##
@@ -137,6 +153,19 @@ case class CarbonAlterTableAddHivePartitionCommand(
 CarbonUtil.checkAndCreateFolderWithPermission(segmentsLoc)
 val segmentPath = segmentsLoc + CarbonCommonConstants.FILE_SEPARATOR + 
segmentFileName
 SegmentFileStore.writeSegmentFile(segmentFile, segmentPath)
+CarbonLoaderUtil
+  .recordNewLoadMetadata(newMetaEntry, loadModel, false, false)
+operationContext.setProperty(table.getTableUniqueName + "_Segment", 
loadModel.getSegmentId)

Review comment:
   ok, removed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4107: [CARBONDATA-4149] Query with SI after add partition based on location on partition table gives incorrect results

2021-03-18 Thread GitBox


ShreelekhyaG commented on a change in pull request #4107:
URL: https://github.com/apache/carbondata/pull/4107#discussion_r596995121



##
File path: 
core/src/main/java/org/apache/carbondata/core/readcommitter/TableStatusReadCommittedScope.java
##
@@ -86,6 +87,13 @@ public TableStatusReadCommittedScope(AbsoluteTableIdentifier 
identifier,
 } else {
   SegmentFileStore fileStore =
   new SegmentFileStore(identifier.getTablePath(), 
segment.getSegmentFileName());
+  Optional

Review comment:
   removed this part, as we are setting the path in loadMetaDataDetails, 
it's no longer required.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4107: [CARBONDATA-4149] Query with SI after add partition based on location on partition table gives incorrect results

2021-03-18 Thread GitBox


ShreelekhyaG commented on a change in pull request #4107:
URL: https://github.com/apache/carbondata/pull/4107#discussion_r596994614



##
File path: 
core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlockletWrapper.java
##
@@ -121,8 +121,9 @@ public ExtendedBlockletWrapper(List 
extendedBlockletList, Stri
 DataOutputStream stream = new DataOutputStream(bos);
 try {
   for (ExtendedBlocklet extendedBlocklet : extendedBlockletList) {
+boolean isExternalPath = 
!extendedBlocklet.getFilePath().contains(tablePath);

Review comment:
   done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #4107: [CARBONDATA-4149] Query with SI after add partition based on location on partition table gives incorrect results

2021-03-18 Thread GitBox


ShreelekhyaG commented on a change in pull request #4107:
URL: https://github.com/apache/carbondata/pull/4107#discussion_r596992325



##
File path: core/src/main/java/org/apache/carbondata/core/index/Segment.java
##
@@ -417,4 +417,8 @@ public void setSegmentMetaDataInfo(SegmentMetaDataInfo 
segmentMetaDataInfo) {
   public boolean isExternalSegment() {
 return isExternalSegment;
   }
+
+  public void setIsExternalSegment(boolean isExternalSegment) {

Review comment:
   added path in loadMetadetail. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org