[GitHub] carbondata pull request #2462: [CARBONDATA-2704] Index file size in describe...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2462 ---
[GitHub] carbondata pull request #2462: [CARBONDATA-2704] Index file size in describe...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2462#discussion_r202229400 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CarbonIndexFileMergeTestCase.scala --- @@ -193,6 +196,30 @@ class CarbonIndexFileMergeTestCase sql("select * from mitable").show() } + // CARBONDATA-2704, test the index file size after merge + test("Verify the size of the index file after merge") { +sql("DROP TABLE IF EXISTS fileSize") +sql( + """ +| CREATE TABLE fileSize(id INT, name STRING, city STRING, age INT) +| STORED BY 'org.apache.carbondata.format' +| TBLPROPERTIES('SORT_COLUMNS'='city,name') + """.stripMargin) +sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE fileSize OPTIONS('header'='false')") +val table = CarbonMetadata.getInstance().getCarbonTable("default", "fileSize") +var loadMetadataDetails = SegmentStatusManager + .readTableStatusFile(CarbonTablePath.getTableStatusFilePath(table.getTablePath)) +var segment0 = loadMetadataDetails.filter(x=> x.getLoadName.equalsIgnoreCase("0")) +Assert.assertEquals(692, segment0.head.getIndexSize.toLong) +new CarbonIndexFileMergeWriter(table) + .mergeCarbonIndexFilesOfSegment("0", table.getTablePath, false) +loadMetadataDetails = SegmentStatusManager + .readTableStatusFile(CarbonTablePath.getTableStatusFilePath(table.getTablePath)) +segment0 = loadMetadataDetails.filter(x=> x.getLoadName.equalsIgnoreCase("0")) +Assert.assertEquals(741, segment0.head.getIndexSize.toLong) --- End diff -- same comment as above ---
[GitHub] carbondata pull request #2462: [CARBONDATA-2704] Index file size in describe...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2462#discussion_r202229386 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CarbonIndexFileMergeTestCase.scala --- @@ -193,6 +196,30 @@ class CarbonIndexFileMergeTestCase sql("select * from mitable").show() } + // CARBONDATA-2704, test the index file size after merge + test("Verify the size of the index file after merge") { +sql("DROP TABLE IF EXISTS fileSize") +sql( + """ +| CREATE TABLE fileSize(id INT, name STRING, city STRING, age INT) +| STORED BY 'org.apache.carbondata.format' +| TBLPROPERTIES('SORT_COLUMNS'='city,name') + """.stripMargin) +sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE fileSize OPTIONS('header'='false')") +val table = CarbonMetadata.getInstance().getCarbonTable("default", "fileSize") +var loadMetadataDetails = SegmentStatusManager + .readTableStatusFile(CarbonTablePath.getTableStatusFilePath(table.getTablePath)) +var segment0 = loadMetadataDetails.filter(x=> x.getLoadName.equalsIgnoreCase("0")) +Assert.assertEquals(692, segment0.head.getIndexSize.toLong) --- End diff -- Do not assert for hard coded values. Instead create the file path and take its size using CarbonFile getSize interface method ---
[GitHub] carbondata pull request #2462: [CARBONDATA-2704] Index file size in describe...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2462#discussion_r202229032 --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java --- @@ -304,10 +304,15 @@ public static boolean updateSegmentFile(String tablePath, String segmentId, Stri LoadMetadataDetails[] listOfLoadFolderDetailsArray = SegmentStatusManager.readLoadMetadata(metadataPath); +if (null == segmentFileStore) { + segmentFileStore = new SegmentFileStore(tablePath, segmentFile); +} --- End diff -- Do we need to calculate the size even if this object is null? ---
[GitHub] carbondata pull request #2462: [CARBONDATA-2704] Index file size in describe...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2462#discussion_r202229089 --- Diff: core/src/main/java/org/apache/carbondata/core/writer/CarbonIndexFileMergeWriter.java --- @@ -155,7 +155,7 @@ private String writeMergeIndexFileBasedOnSegmentFile( + CarbonCommonConstants.FILE_SEPARATOR + newSegmentFileName; SegmentFileStore.writeSegmentFile(sfs.getSegmentFile(), path); SegmentFileStore.updateSegmentFile(table.getTablePath(), segmentId, newSegmentFileName, -table.getCarbonTableIdentifier().getTableId()); +table.getCarbonTableIdentifier().getTableId(), sfs); --- End diff -- change the variable name to segmentFileStore ---
[GitHub] carbondata pull request #2462: [CARBONDATA-2704] Index file size in describe...
GitHub user dhatchayani opened a pull request: https://github.com/apache/carbondata/pull/2462 [CARBONDATA-2704] Index file size in describe formatted command is not updated correctly with the segment file **Problem:** Describe formatted command is not showing correct index files size after index files merge. **Solution:** Segment file should be updated with the actual index files size of that segment after index files merge. - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [x] Testing done UT added - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhatchayani/carbondata CARBONDATA-2704 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2462.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2462 commit 11f52bd21d6496764d0841c6f732cb66da583c16 Author: dhatchayani Date: 2018-07-09T05:49:51Z [CARBONDATA-2704] Index file size in describe formatted command is not updated correctly with the segment file ---