[GitHub] carbondata issue #1689: [CARBONDATA-1674] Describe formatted shows partition...
Github user jatin9896 commented on the issue: https://github.com/apache/carbondata/pull/1689 retest this please ---
[GitHub] carbondata issue #1678: [CARBONDATA-1903] Fix code issues in carbondata
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1678 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1078/ ---
[GitHub] carbondata issue #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carbon part...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1716 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2537/ ---
[GitHub] carbondata issue #1689: [CARBONDATA-1674] Describe formatted shows partition...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1689 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2295/ ---
[GitHub] carbondata issue #1713: [WIP] [CARBONDATA-1899] Optimize CarbonData concurre...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1713 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1077/ ---
[GitHub] carbondata issue #1707: [CARBONDATA-1839] [DataLoad] Fix bugs and optimize i...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1707 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2294/ ---
[GitHub] carbondata issue #1713: [WIP] [CARBONDATA-1899] Optimize CarbonData concurre...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1713 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2536/ ---
[GitHub] carbondata issue #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carbon part...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1716 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1076/ ---
[GitHub] carbondata issue #1707: [CARBONDATA-1839] [DataLoad] Fix bugs and optimize i...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/1707 retest this please ---
[GitHub] carbondata issue #1713: [WIP] [CARBONDATA-1899] Optimize CarbonData concurre...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1713 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2535/ ---
[GitHub] carbondata issue #1678: [CARBONDATA-1903] Fix code issues in carbondata
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1678 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1075/ ---
[GitHub] carbondata issue #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carbon part...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1716 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2293/ ---
[GitHub] carbondata issue #1690: [CI] CI random failure
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1690 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1074/ ---
[GitHub] carbondata issue #1713: [WIP] [CARBONDATA-1899] Optimize CarbonData concurre...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1713 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2534/ ---
[GitHub] carbondata issue #1713: [WIP] [CARBONDATA-1899] Optimize CarbonData concurre...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1713 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2292/ ---
[GitHub] carbondata issue #1713: [WIP] [CARBONDATA-1899] Optimize CarbonData concurre...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1713 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1073/ ---
[GitHub] carbondata issue #1690: [CI] CI random failure
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1690 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2533/ ---
[GitHub] carbondata issue #1678: [CARBONDATA-1903] Fix code issues in carbondata
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1678 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2291/ ---
[GitHub] carbondata issue #1714: [CARBONDATA-1932] Add version for CarbonData
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1714 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1072/ ---
[GitHub] carbondata issue #1690: [CI] CI random failure
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1690 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2290/ ---
[GitHub] carbondata issue #1713: [WIP] [CARBONDATA-1899] Optimize CarbonData concurre...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1713 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2532/ ---
[GitHub] carbondata issue #1690: [CI] CI random failure
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1690 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1071/ ---
[GitHub] carbondata pull request #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carb...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1716#discussion_r158575738

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---

@@ -564,55 +572,49 @@ case class CarbonLoadDataCommand(
         val data = new Array[Any](len)
         var i = 0
         val input = value.get()
-        while (i < input.length) {
-          // TODO find a way to avoid double conversion of date and time.
-          data(i) = CarbonScalaUtil.convertToUTF8String(
-            input(i),
-            rowDataTypes(i),
-            timeStampFormat,
-            dateFormat,
-            serializationNullFormat)
-          i = i + 1
+        val inputLen = Math.min(input.length, len)
+        try {
+          while (i < inputLen) {
+            // TODO find a way to avoid double conversion of date and time.
+            data(i) = CarbonScalaUtil.convertToUTF8String(
+              input(i),
+              rowDataTypes(i),
+              timeStampFormat,
+              dateFormat,
+              serializationNullFormat,
+              failAction,
+              ignoreAction)
+            i = i + 1
+          }
+          InternalRow.fromSeq(data)
+        } catch {
+          case e: BadRecordFoundException => throw e
+          case e: Exception => InternalRow.empty // It is bad record ignore case
         }
-        InternalRow.fromSeq(data)
-      }
+
+      }.filter(f => f.numFields != 0) // In bad record ignore case filter the empty values

--- End diff --

I will move rdd creation to another private function.

---
[GitHub] carbondata pull request #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carb...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1716#discussion_r158575732

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---

@@ -564,55 +572,49 @@ case class CarbonLoadDataCommand(
         val data = new Array[Any](len)
         var i = 0
         val input = value.get()
-        while (i < input.length) {
-          // TODO find a way to avoid double conversion of date and time.
-          data(i) = CarbonScalaUtil.convertToUTF8String(
-            input(i),
-            rowDataTypes(i),
-            timeStampFormat,
-            dateFormat,
-            serializationNullFormat)
-          i = i + 1
+        val inputLen = Math.min(input.length, len)
+        try {
+          while (i < inputLen) {
+            // TODO find a way to avoid double conversion of date and time.
+            data(i) = CarbonScalaUtil.convertToUTF8String(
+              input(i),
+              rowDataTypes(i),
+              timeStampFormat,
+              dateFormat,
+              serializationNullFormat,
+              failAction,
+              ignoreAction)
+            i = i + 1
+          }
+          InternalRow.fromSeq(data)
+        } catch {
+          case e: BadRecordFoundException => throw e
+          case e: Exception => InternalRow.empty // It is bad record ignore case
        }
-        InternalRow.fromSeq(data)
-      }
+
+      }.filter(f => f.numFields != 0) // In bad record ignore case filter the empty values

--- End diff --

I am not getting how to filter inside the map function. I don't think we can drop some rows inside a map function unless we apply a filter function.

---
[GitHub] carbondata pull request #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carb...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1716#discussion_r158575701

--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala ---

@@ -158,7 +159,9 @@ object CarbonScalaUtil {
       dataType: DataType,
       timeStampFormat: SimpleDateFormat,
       dateFormat: SimpleDateFormat,
-      serializationNullFormat: String): UTF8String = {
+      serializationNullFormat: String,
+      failAction: Boolean,
+      ignoreAction: Boolean): UTF8String = {

--- End diff --

We pass these parameters to decide whether to throw an exception or ignore the value in case of bad records. It is not good to handle this in the caller, because the decision is per column, not per row, so the caller code would become messy if we tried to handle it there. I will add comments to the parameters for clarity.

---
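The per-column bad-record handling debated in the review above can be sketched roughly as follows. This is a hedged illustration in plain Java, not CarbonData's actual API; the names `BadRecordSketch`, `BadRecordAction`, and `convertField` are invented for the example. On a bad value the converter either throws (fail action) or returns null so the caller can later drop the whole row (ignore action).

```java
// A rough sketch of per-column bad-record handling: the converter either
// throws (fail action) or signals "ignore" by returning null, so the caller
// can drop the whole row afterwards. All names are illustrative, not
// CarbonData's actual API.
class BadRecordSketch {
    enum BadRecordAction { FAIL, IGNORE }

    static class BadRecordFoundException extends RuntimeException {
        BadRecordFoundException(String msg) { super(msg); }
    }

    // Parse one field; on a bad value either throw (FAIL) or return null (IGNORE).
    static Integer convertField(String raw, BadRecordAction action) {
        try {
            return Integer.valueOf(raw);
        } catch (NumberFormatException e) {
            if (action == BadRecordAction.FAIL) {
                throw new BadRecordFoundException("bad value: " + raw);
            }
            return null; // ignore case: caller filters the row out later
        }
    }
}
```

The design point ravipesala makes is that this decision happens once per column value, so keeping it inside the converter leaves the per-row caller loop simple.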
[GitHub] carbondata issue #1714: [CARBONDATA-1932] Add version for CarbonData
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1714 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2289/ ---
[GitHub] carbondata issue #1678: [CARBONDATA-1903] Fix code issues in carbondata
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/1678 retest this please ---
[GitHub] carbondata issue #1714: [CARBONDATA-1932] Add version for CarbonData
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1714 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1070/ ---
[GitHub] carbondata issue #1713: [WIP] [CARBONDATA-1899] Optimize CarbonData concurre...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1713 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2288/ ---
[GitHub] carbondata issue #1690: [CI] CI random failure
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1690 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2531/ ---
[GitHub] carbondata issue #1678: [CARBONDATA-1903] Fix code issues in carbondata
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1678 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1069/ ---
[GitHub] carbondata issue #1690: [CI] CI random failure
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1690 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2287/ ---
[GitHub] carbondata issue #1690: [CI] CI random failure
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1690 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1068/ ---
[GitHub] carbondata issue #1714: [CARBONDATA-1932] Add version for CarbonData
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1714 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2530/ ---
[GitHub] carbondata issue #1714: [CARBONDATA-1932] Add version for CarbonData
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/1714 retest this please ---
[GitHub] carbondata issue #1714: [CARBONDATA-1932] Add version for CarbonData
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1714 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2286/ ---
[GitHub] carbondata issue #1709: [CARBONDATA-1774] [PrestoIntegration] Not able to fe...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1709 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1067/ ---
[GitHub] carbondata pull request #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carb...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1716#discussion_r158574266

--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala ---

@@ -158,7 +159,9 @@ object CarbonScalaUtil {
       dataType: DataType,
       timeStampFormat: SimpleDateFormat,
       dateFormat: SimpleDateFormat,
-      serializationNullFormat: String): UTF8String = {
+      serializationNullFormat: String,
+      failAction: Boolean,
+      ignoreAction: Boolean): UTF8String = {

--- End diff --

It is not easy to understand why there is a bad-record-related parameter in this conversion function; can you catch the exception in the caller and handle it there? There is only one place that calls this function, and it is better to restrict its scope by moving it there.

---
[GitHub] carbondata issue #1690: [CI] CI random failure
Github user QiangCai commented on the issue: https://github.com/apache/carbondata/pull/1690 retest this please ---
[GitHub] carbondata pull request #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carb...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1716#discussion_r158574044

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---

@@ -564,55 +572,49 @@ case class CarbonLoadDataCommand(
         val data = new Array[Any](len)
         var i = 0
         val input = value.get()
-        while (i < input.length) {
-          // TODO find a way to avoid double conversion of date and time.
-          data(i) = CarbonScalaUtil.convertToUTF8String(
-            input(i),
-            rowDataTypes(i),
-            timeStampFormat,
-            dateFormat,
-            serializationNullFormat)
-          i = i + 1
+        val inputLen = Math.min(input.length, len)
+        try {
+          while (i < inputLen) {
+            // TODO find a way to avoid double conversion of date and time.
+            data(i) = CarbonScalaUtil.convertToUTF8String(
+              input(i),
+              rowDataTypes(i),
+              timeStampFormat,
+              dateFormat,
+              serializationNullFormat,
+              failAction,
+              ignoreAction)
+            i = i + 1
+          }
+          InternalRow.fromSeq(data)
+        } catch {
+          case e: BadRecordFoundException => throw e
+          case e: Exception => InternalRow.empty // It is bad record ignore case
        }
-        InternalRow.fromSeq(data)
-      }
+
+      }.filter(f => f.numFields != 0) // In bad record ignore case filter the empty values

--- End diff --

I think it is better to filter it inside the map function at line 571; that can avoid creating a big InternalRow with all columns.

---
[GitHub] carbondata issue #1690: [CI] CI random failure
Github user QiangCai commented on the issue: https://github.com/apache/carbondata/pull/1690 retest sdv please ---
[GitHub] carbondata pull request #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carb...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1716#discussion_r158574052

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---

@@ -564,55 +572,49 @@ case class CarbonLoadDataCommand(
         val data = new Array[Any](len)
         var i = 0
         val input = value.get()
-        while (i < input.length) {
-          // TODO find a way to avoid double conversion of date and time.
-          data(i) = CarbonScalaUtil.convertToUTF8String(
-            input(i),
-            rowDataTypes(i),
-            timeStampFormat,
-            dateFormat,
-            serializationNullFormat)
-          i = i + 1
+        val inputLen = Math.min(input.length, len)
+        try {
+          while (i < inputLen) {
+            // TODO find a way to avoid double conversion of date and time.
+            data(i) = CarbonScalaUtil.convertToUTF8String(
+              input(i),
+              rowDataTypes(i),
+              timeStampFormat,
+              dateFormat,
+              serializationNullFormat,
+              failAction,
+              ignoreAction)
+            i = i + 1
+          }
+          InternalRow.fromSeq(data)
+        } catch {
+          case e: BadRecordFoundException => throw e
+          case e: Exception => InternalRow.empty // It is bad record ignore case
        }
-        InternalRow.fromSeq(data)
-      }
+
+      }.filter(f => f.numFields != 0) // In bad record ignore case filter the empty values

--- End diff --

It is better to create a private func for that map function.

---
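The map-then-filter shape the reviewers converge on (map each row, emit an empty sentinel row for an ignorable bad record, then filter the empties out, since a map step alone cannot drop elements) can be sketched in plain Java streams. This is a hedged illustration: integer rows stand in for Spark's InternalRow, and `EMPTY_ROW` stands in for `InternalRow.empty`.

```java
import java.util.List;
import java.util.stream.Collectors;

// The map step returns an empty row for an ignorable bad record; the filter
// step then drops those empties, mirroring .filter(f => f.numFields != 0)
// in the diff above. Names are illustrative, not Spark's API.
class MapFilterSketch {
    static final List<Integer> EMPTY_ROW = List.of();

    // Convert one raw row; a non-numeric field marks the whole row as bad.
    static List<Integer> convertRow(List<String> raw) {
        try {
            return raw.stream().map(Integer::parseInt).collect(Collectors.toList());
        } catch (NumberFormatException e) {
            return EMPTY_ROW; // bad record, ignore case
        }
    }

    // map + filter: drop the rows that were flagged as bad during conversion.
    static List<List<Integer>> load(List<List<String>> rows) {
        return rows.stream()
            .map(MapFilterSketch::convertRow)
            .filter(r -> !r.isEmpty())
            .collect(Collectors.toList());
    }
}
```

This is also why the empty-row sentinel exists at all: a map transformation must produce exactly one output per input, so "ignored" rows need a marker that a later filter can remove.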
[GitHub] carbondata pull request #1689: [CARBONDATA-1674] Describe formatted shows pa...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1689#discussion_r158573828

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestShowPartitions.scala ---

@@ -150,6 +150,14 @@ class TestShowPartition extends QueryTest with BeforeAndAfterAll {
   }

+  test("show partition table: desc formatted should show partition type") {
+    // check for partition type exist in desc formatted
+    checkExistence(sql("describe formatted hashTable"), true, "Partition Type")
+    val result: Array[Row] = sql("describe formatted hashTable").collect()

--- End diff --

you can run this sql once and get the result in line 155

---
[GitHub] carbondata pull request #1689: [CARBONDATA-1674] Describe formatted shows pa...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1689#discussion_r158573797

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestShowPartitions.scala ---

@@ -150,6 +150,14 @@ class TestShowPartition extends QueryTest with BeforeAndAfterAll {
   }

+  test("show partition table: desc formatted should show partition type") {
+    // check for partition type exist in desc formatted
+    checkExistence(sql("describe formatted hashTable"), true, "Partition Type")
+    val result: Array[Row] = sql("describe formatted hashTable").collect()

--- End diff --

add space after `=`

---
[GitHub] carbondata issue #1715: [CARBONDATA-1934] Incorrect results are returned by ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1715 Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1066/ ---
[GitHub] carbondata pull request #1718: [CARBONDATA-1929][Validation]carbon property ...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1718#discussion_r158573768

--- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java ---

@@ -107,36 +108,103 @@ private void validateAndLoadDefaultProperties() {
     validateCarbonCSVReadBufferSizeByte();
     validateHandoffSize();
     validateCombineSmallInputFiles();
+    // The method validate the validity of configured carbon.timestamp.format value
+    // and reset to default value if validation fail
+    validateCarbonKey(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
+        CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT);
+    // The method validate the validity of configured carbon.date.format value
+    // and reset to default value if validation fail
+    validateCarbonKey(CarbonCommonConstants.CARBON_DATE_FORMAT,

--- End diff --

This validation should also be done when addProperty is called.

---
[GitHub] carbondata issue #1678: [CARBONDATA-1903] Fix code issues in carbondata
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1678 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2285/ ---
[GitHub] carbondata pull request #1718: [CARBONDATA-1929][Validation]carbon property ...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1718#discussion_r158573603

--- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java ---

@@ -107,36 +108,103 @@ private void validateAndLoadDefaultProperties() {
     validateCarbonCSVReadBufferSizeByte();
     validateHandoffSize();
     validateCombineSmallInputFiles();
+    // The method validate the validity of configured carbon.timestamp.format value
+    // and reset to default value if validation fail
+    validateCarbonKey(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
+        CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT);
+    // The method validate the validity of configured carbon.date.format value
+    // and reset to default value if validation fail
+    validateCarbonKey(CarbonCommonConstants.CARBON_DATE_FORMAT,
+        CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT);
+    validateSortFileWriteBufferSize();
+    validateSortIntermediateFilesLimit();
   }

-  private void validateCarbonCSVReadBufferSizeByte() {
-    String csvReadBufferSizeStr =
-        carbonProperties.getProperty(CarbonCommonConstants.CSV_READ_BUFFER_SIZE);
-    if (null != csvReadBufferSizeStr) {
+  /**
+   * Sort intermediate file size validation and if not valid then reset to the default value
+   */
+  private void validateSortIntermediateFilesLimit() {
+    validateRange(CarbonCommonConstants.SORT_INTERMEDIATE_FILES_LIMIT,
+        CarbonCommonConstants.SORT_INTERMEDIATE_FILES_LIMIT_DEFAULT_VALUE,
+        CarbonCommonConstants.SORT_INTERMEDIATE_FILES_LIMIT_MIN,
+        CarbonCommonConstants.SORT_INTERMEDIATE_FILES_LIMIT_MAX);
+  }
+
+  /**
+   * @param key
+   * @param defaultValue default value for the given key
+   * @param minValue Minimum value for the given key
+   * @param maxValue Max value for the given key
+   */
+  private void validateRange(String key, String defaultValue, int minValue, int maxValue) {
+    String fileBufferSize = carbonProperties
+        .getProperty(key, defaultValue);
+    if (null != fileBufferSize) {
       try {
-        int bufferSize = Integer.parseInt(csvReadBufferSizeStr);
-        if (bufferSize < CarbonCommonConstants.CSV_READ_BUFFER_SIZE_MIN
-            || bufferSize > CarbonCommonConstants.CSV_READ_BUFFER_SIZE_MAX) {
-          LOGGER.warn("The value \"" + csvReadBufferSizeStr + "\" configured for key "
-              + CarbonCommonConstants.CSV_READ_BUFFER_SIZE
+        int bufferSize = Integer.parseInt(fileBufferSize);
+        if (bufferSize < minValue
+            || bufferSize > maxValue) {
+          LOGGER.warn("The value \"" + fileBufferSize + "\" configured for key "
+              + key
               + "\" is not in range. Valid range is (byte) \""
-              + CarbonCommonConstants.CSV_READ_BUFFER_SIZE_MIN + " to \""
-              + CarbonCommonConstants.CSV_READ_BUFFER_SIZE_MAX + ". Using the default value \""
-              + CarbonCommonConstants.CSV_READ_BUFFER_SIZE_DEFAULT);
-          carbonProperties.setProperty(CarbonCommonConstants.CSV_READ_BUFFER_SIZE,
-              CarbonCommonConstants.CSV_READ_BUFFER_SIZE_DEFAULT);
+              + minValue + " to \""
+              + maxValue
+              + ". Using the default value \""
+              + defaultValue);
+          carbonProperties.setProperty(key,
+              defaultValue);
         }
       } catch (NumberFormatException nfe) {
-        LOGGER.warn("The value \"" + csvReadBufferSizeStr + "\" configured for key "
-            + CarbonCommonConstants.CSV_READ_BUFFER_SIZE
+        LOGGER.warn("The value \"" + fileBufferSize + "\" configured for key "
+            + key
             + "\" is invalid. Using the default value \""
-            + CarbonCommonConstants.CSV_READ_BUFFER_SIZE_DEFAULT);
-        carbonProperties.setProperty(CarbonCommonConstants.CSV_READ_BUFFER_SIZE,
-            CarbonCommonConstants.CSV_READ_BUFFER_SIZE_DEFAULT);
+            + defaultValue);
+        carbonProperties.setProperty(key,
+            defaultValue);
       }
     }
   }

+  /**
+   * validate carbon.sort.file.write.buffer.size and if not valid then reset to the default value
+   */
+  private void validateSortFileWriteBufferSize() {
+    validateRange(CarbonCommonConstants.CARBON_SORT_FILE_WRITE_BUFFER_SIZE,
+        CarbonCommonConstants.CARBON_SORT_FILE_WRITE_BUFFER_SIZE_DEFAULT_VALUE,
+        CarbonCommonConstants.CARBON_SORT_FILE_WRITE_BUFFER_SIZE_MIN,
+        CarbonCommonConstants.CARBON_SORT_FILE_WRITE_BUFFER_SIZE_MAX);
+  }
+
+  /**
+   * The
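The generalized validateRange helper in the diff above boils down to: read the configured value, and reset the key to its default when the value is non-numeric or outside [min, max]. A minimal self-contained sketch of that pattern, using plain java.util.Properties (the property key and bounds below are illustrative, and logging is omitted):

```java
import java.util.Properties;

// Generic range validation: keep the configured value if it parses and lies
// within [minValue, maxValue]; otherwise reset the key to its default.
class RangeValidationSketch {
    static void validateRange(Properties props, String key, String defaultValue,
                              int minValue, int maxValue) {
        String configured = props.getProperty(key, defaultValue);
        try {
            int value = Integer.parseInt(configured);
            if (value < minValue || value > maxValue) {
                props.setProperty(key, defaultValue); // out of range: fall back
            }
        } catch (NumberFormatException e) {
            props.setProperty(key, defaultValue); // not a number: fall back
        }
    }
}
```

Factoring the parse/compare/reset sequence into one parameterized helper is what lets the diff replace per-property validators (CSV read buffer, sort file write buffer, intermediate files limit) with one-line calls.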
[GitHub] carbondata issue #1714: [CARBONDATA-1932] Add version for CarbonData
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/1714 Jacky asked to print the Spark version and the CarbonData version in the concurrency test case. Spark has version information in its code, but CarbonData does not. ---
[GitHub] carbondata pull request #1714: [CARBONDATA-1932] Add version for CarbonData
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1714#discussion_r158573382

--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---

@@ -1108,6 +1114,11 @@
    */
   public static final String CARBON_DATA_FILE_DEFAULT_VERSION = "V3";

+  /**
+   * current CarbonData version
+   */
+  public static final String CARBONDATA_DEFAULT_VERSION = "1.3.0-SNAPSHOT";

--- End diff --

I also think so, but I don't know how to implement it. I will try.

---
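One common way to avoid hardcoding a version string like the constant under review, and possibly what the reviewers have in mind here, is to let the build write the version into a resource (or the jar manifest) and read it at runtime. This is a hedged sketch, not CarbonData's actual mechanism; the resource name `carbondata-version.properties` and its `version` property are hypothetical.

```java
import java.io.InputStream;
import java.util.Properties;

// Read the version from a build-generated resource, falling back to the jar
// manifest's Implementation-Version, instead of hardcoding a constant.
class VersionSketch {
    static String getVersion() {
        try (InputStream in =
                 VersionSketch.class.getResourceAsStream("/carbondata-version.properties")) {
            if (in != null) {
                Properties props = new Properties();
                props.load(in);
                return props.getProperty("version", "unknown");
            }
        } catch (Exception e) {
            // resource missing or unreadable: fall through to the manifest lookup
        }
        Package pkg = VersionSketch.class.getPackage();
        String v = (pkg != null) ? pkg.getImplementationVersion() : null;
        return (v != null) ? v : "unknown";
    }
}
```

With a Maven build, a filtered resource containing `version=${project.version}` would keep the reported version in sync with the pom automatically.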
[GitHub] carbondata pull request #1714: [CARBONDATA-1932] Add version for CarbonData
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1714#discussion_r158573385

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/version/VersionTest.scala ---

@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.version
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.spark.sql.{CarbonSession, SparkSession}
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+/**
+ * Created by root on 12/22/17.

--- End diff --

ok

---
[GitHub] carbondata issue #1720: [CARBONDATA-1935]fix the backword compatibility issu...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1720 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1065/ ---
[GitHub] carbondata issue #1678: [CARBONDATA-1903] Fix code issues in carbondata
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/1678 retest this please ---
[GitHub] carbondata issue #1649: [CARBONDATA-1846] Incorrect output on presto CLI whi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1649 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1064/ ---
[GitHub] carbondata issue #1690: [CI] CI random failure
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1690 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2284/ ---
[GitHub] carbondata issue #1690: [CI] CI random failure
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1690 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1063/ ---
[GitHub] carbondata issue #1690: [CI] CI random failure
Github user QiangCai commented on the issue: https://github.com/apache/carbondata/pull/1690 retest this please ---
[GitHub] carbondata issue #1692: [CARBONDATA-1777] Added check to refresh table if ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1692 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1062/ ---
[GitHub] carbondata issue #1702: [CARBONDATA-1896] Clean files operation improvement
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1702 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1061/ ---
[GitHub] carbondata issue #1708: [CARBONDATA-1928] Seperate the properties for timeou...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1708 Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1060/ ---
[GitHub] carbondata issue #1709: [CARBONDATA-1774] [PrestoIntegration] Not able to fe...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1709 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1059/ ---
[GitHub] carbondata issue #1715: [CARBONDATA-1934] Incorrect results are returned by ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1715 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2283/ ---
[GitHub] carbondata issue #1720: [CARBONDATA-1935]fix the backword compatibility issu...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1720 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2282/ ---
[GitHub] carbondata issue #1715: [CARBONDATA-1934] Incorrect results are returned by ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1715 Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1057/ ---
[GitHub] carbondata issue #1720: [CARBONDATA-1935]fix the backword compatibility issu...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1720 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2529/ ---
[GitHub] carbondata issue #1715: [CARBONDATA-1934] Incorrect results are returned by ...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1715 retest this please ---
[GitHub] carbondata issue #1720: [CARBONDATA-1935]fix the backword compatibility issu...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1720 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2528/ ---
[GitHub] carbondata issue #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carbon part...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1716 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1056/ ---
[GitHub] carbondata issue #1702: [CARBONDATA-1896] Clean files operation improvement
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1702 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2527/ ---
[GitHub] carbondata issue #1720: [CARBONDATA-1935]fix the backword compatibility issu...
Github user akashrn5 commented on the issue: https://github.com/apache/carbondata/pull/1720 retest this please ---
[GitHub] carbondata issue #1709: [CARBONDATA-1774] [PrestoIntegration] Not able to fe...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1709 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2281/ ---
[GitHub] carbondata issue #1718: [CARBONDATA-1929][Validation]carbon property configu...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1718 Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1055/ ---
[GitHub] carbondata issue #1720: [CARBONDATA-1935]fix the backword compatibility issu...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1720 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2526/ ---
[GitHub] carbondata issue #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carbon part...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1716 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2525/ ---
[GitHub] carbondata issue #1720: [CARBONDATA-1935]fix the backword compatibility issu...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1720 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2280/ ---
[GitHub] carbondata issue #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carbon part...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1716 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2279/ ---
[GitHub] carbondata issue #1719: [WIP] [CARBONDATA-1731,CARBONDATA-1728] Update fails...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1719 Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1054/ ---
[jira] [Created] (CARBONDATA-1937) NULL values on Non string partition columns throws exception
Ravindra Pesala created CARBONDATA-1937: --- Summary: NULL values on Non string partition columns throws exception Key: CARBONDATA-1937 URL: https://issues.apache.org/jira/browse/CARBONDATA-1937 Project: CarbonData Issue Type: Bug Reporter: Ravindra Pesala A filter query throws an error when any non-string partition column contains NULL values. It seems to be a restriction from the Hive metastore that Spark uses: {code} Caused by: MetaException(message:Filtering is supported only on partition keys of type string) at org.apache.hadoop.hive.metastore.parser.ExpressionTree$FilterBuilder.setError(ExpressionTree.java:185) at org.apache.hadoop.hive.metastore.parser.ExpressionTree$LeafNode.getJdoFilterPushdownParam(ExpressionTree.java:440) at org.apache.hadoop.hive.metastore.parser.ExpressionTree$LeafNode.generateJDOFilterOverPartitions(ExpressionTree.java:357) at org.apache.hadoop.hive.metastore.parser.ExpressionTree$LeafNode.generateJDOFilter(ExpressionTree.java:279) at org.apache.hadoop.hive.metastore.parser.ExpressionTree.generateJDOFilterFragment(ExpressionTree.java:578) at org.apache.hadoop.hive.metastore.ObjectStore.makeQueryFilterString(ObjectStore.java:2615) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsViaOrmFilter(ObjectStore.java:2199) at org.apache.hadoop.hive.metastore.ObjectStore.access$500(ObjectStore.java:160) at org.apache.hadoop.hive.metastore.ObjectStore$5.getJdoResult(ObjectStore.java:2530) at org.apache.hadoop.hive.metastore.ObjectStore$5.getJdoResult(ObjectStore.java:2515) at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2391) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2515) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2335) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) at com.sun.proxy.$Proxy13.getPartitionsByFilter(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4442) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) at com.sun.proxy.$Proxy15.get_partitions_by_filter(Unknown Source) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
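Since the metastore only supports filter pushdown on string partition keys, one common way around this class of error is to normalize every partition value to its string form (mapping NULL to a sentinel such as Hive's default-partition marker) before building the metastore filter. The sketch below illustrates that idea only; `PartitionFilterUtil`, `toFilterValue`, and `eqClause` are invented names for illustration and are not CarbonData's actual fix or API.

```java
// Hypothetical sketch (not CarbonData's actual fix): normalize partition
// values to strings before building a Hive metastore filter, mapping NULL
// to a sentinel so non-string columns do not trigger the
// "Filtering is supported only on partition keys of type string" error.
public class PartitionFilterUtil {

  // Hive's conventional marker for the default (NULL) partition.
  static final String NULL_SENTINEL = "__HIVE_DEFAULT_PARTITION__";

  // Convert a partition value of any type to the string used in the filter.
  public static String toFilterValue(Object value) {
    return value == null ? NULL_SENTINEL : String.valueOf(value);
  }

  // Build a simple equality clause, always quoting the value as a string.
  public static String eqClause(String column, Object value) {
    return column + " = \"" + toFilterValue(value) + "\"";
  }

  public static void main(String[] args) {
    System.out.println(eqClause("dept", 3));     // dept = "3"
    System.out.println(eqClause("dept", null));  // dept = "__HIVE_DEFAULT_PARTITION__"
  }
}
```

With this normalization, a filter on an int partition column with NULLs is expressed entirely over strings, which is the only form the ObjectStore filter parser accepts.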
[jira] [Created] (CARBONDATA-1936) Bad Record logger is not working properly in Carbon Partition
Ravindra Pesala created CARBONDATA-1936: --- Summary: Bad Record logger is not working properly in Carbon Partition Key: CARBONDATA-1936 URL: https://issues.apache.org/jira/browse/CARBONDATA-1936 Project: CarbonData Issue Type: Bug Reporter: Ravindra Pesala Priority: Minor Bad records are not being logged, and the load always reports success irrespective of whether bad records are present. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
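The expected behaviour the issue describes is that a load containing bad records should not unconditionally report success: the segment status should reflect the bad records (or fail the load, depending on the configured action). The following is a simplified illustration of that rule; the class, the "any null or empty field is bad" check, and the enum are invented for this sketch and do not mirror CarbonData's actual bad-record handling code.

```java
import java.util.List;

// Hypothetical illustration of the expected behaviour from CARBONDATA-1936:
// when bad records are present, the load status should not be plain SUCCESS.
public class BadRecordSketch {

  enum SegmentStatus { SUCCESS, LOAD_PARTIAL_SUCCESS, LOAD_FAILURE }

  // Simplified rule for this sketch: a row is "bad" if any field is null/empty.
  static boolean isBadRecord(String[] row) {
    for (String field : row) {
      if (field == null || field.isEmpty()) {
        return true;
      }
    }
    return false;
  }

  // Derive the segment status from the rows and the configured action.
  static SegmentStatus loadStatus(List<String[]> rows, boolean failOnBadRecord) {
    long badCount = rows.stream().filter(BadRecordSketch::isBadRecord).count();
    if (badCount == 0) {
      return SegmentStatus.SUCCESS;
    }
    return failOnBadRecord ? SegmentStatus.LOAD_FAILURE
                           : SegmentStatus.LOAD_PARTIAL_SUCCESS;
  }
}
```

The bug, in these terms, is a path where `loadStatus` effectively always returns `SUCCESS` and the bad rows are never handed to the logger.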
[GitHub] carbondata issue #1719: [WIP] [CARBONDATA-1731,CARBONDATA-1728] Update fails...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1719 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2524/ ---
[GitHub] carbondata issue #1709: [CARBONDATA-1774] [PrestoIntegration] Not able to fe...
Github user anubhav100 commented on the issue: https://github.com/apache/carbondata/pull/1709 retest this please ---
[GitHub] carbondata issue #1720: [CARBONDATA-1935]fix the backword compatibility issu...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1720 Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1053/ ---
[GitHub] carbondata issue #1692: [CARBONDATA-1777] Added check to refresh table if ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1692 Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1052/ ---
[GitHub] carbondata issue #1720: [CARBONDATA-1935]fix the backword compatibility issu...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1720 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2278/ ---
[GitHub] carbondata issue #1721: [CARBONDATA-1822] Documentation - Added REFRESH TABL...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1721 Can one of the admins verify this patch? ---
[GitHub] carbondata issue #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carbon part...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1716 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2277/ ---
[GitHub] carbondata issue #1693: [CARBONDATA-1909] Load is failing during insert into...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1693 Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1051/ ---
[GitHub] carbondata pull request #1702: [CARBONDATA-1896] Clean files operation impro...
Github user dhatchayani commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1702#discussion_r158524394 --- Diff: processing/src/main/java/org/apache/carbondata/processing/util/DeleteLoadFolders.java --- @@ -124,27 +148,50 @@ private static boolean checkIfLoadCanBeDeleted(LoadMetadataDetails oneLoad, return false; } + private static LoadMetadataDetails getCurrentLoadStatusOfSegment(String segmentId, + String metadataPath) { +LoadMetadataDetails[] currentDetails = SegmentStatusManager.readLoadMetadata(metadataPath); +for (LoadMetadataDetails oneLoad : currentDetails) { + if (oneLoad.getLoadName().equalsIgnoreCase(segmentId)) { +return oneLoad; + } +} +return null; + } + public static boolean deleteLoadFoldersFromFileSystem( AbsoluteTableIdentifier absoluteTableIdentifier, boolean isForceDelete, - LoadMetadataDetails[] details) { + LoadMetadataDetails[] details, String metadataPath) { boolean isDeleted = false; if (details != null && details.length != 0) { for (LoadMetadataDetails oneLoad : details) { if (checkIfLoadCanBeDeleted(oneLoad, isForceDelete)) { - String path = getSegmentPath(absoluteTableIdentifier, 0, oneLoad); - boolean deletionStatus = physicalFactAndMeasureMetadataDeletion(path); - if (deletionStatus) { -isDeleted = true; -oneLoad.setVisibility("false"); -LOGGER.info("Info: Deleted the load " + oneLoad.getLoadName()); + ICarbonLock segmentLock = CarbonLockFactory.getCarbonLockObj(absoluteTableIdentifier, + CarbonTablePath.addSegmentPrefix(oneLoad.getLoadName()) + LockUsage.LOCK); + try { +if (segmentLock.lockWithRetries()) { --- End diff -- this can be solved by PR 1708, there we have added one more lockWithRetries with retry and timeout arguments. ---
[GitHub] carbondata issue #1692: [CARBONDATA-1777] Added check to refresh table if ca...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1692 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2523/ ---
[GitHub] carbondata pull request #1702: [CARBONDATA-1896] Clean files operation impro...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1702#discussion_r158520831 --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala --- @@ -315,88 +314,100 @@ object CarbonDataRDDFactory { val isSortTable = carbonTable.getNumberOfSortColumns > 0 val sortScope = CarbonDataProcessorUtil.getSortScope(carbonLoadModel.getSortScope) +val segmentLock = CarbonLockFactory.getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier, + CarbonTablePath.addSegmentPrefix(carbonLoadModel.getSegmentId) + LockUsage.LOCK) + try { - if (updateModel.isDefined) { -res = loadDataFrameForUpdate( - sqlContext, - dataFrame, - carbonLoadModel, - updateModel, - carbonTable) -res.foreach { resultOfSeg => - resultOfSeg.foreach { resultOfBlock => -if (resultOfBlock._2._1.getSegmentStatus == SegmentStatus.LOAD_FAILURE) { - loadStatus = SegmentStatus.LOAD_FAILURE - if (resultOfBlock._2._2.failureCauses == FailureCauses.NONE) { -updateModel.get.executorErrors.failureCauses = FailureCauses.EXECUTOR_FAILURE -updateModel.get.executorErrors.errorMsg = "Failure in the Executor." - } else { -updateModel.get.executorErrors = resultOfBlock._2._2 + if (segmentLock.lockWithRetries()) { +if (updateModel.isDefined) { + res = loadDataFrameForUpdate( +sqlContext, +dataFrame, +carbonLoadModel, +updateModel, +carbonTable) + res.foreach { resultOfSeg => +resultOfSeg.foreach { resultOfBlock => + if (resultOfBlock._2._1.getSegmentStatus == SegmentStatus.LOAD_FAILURE) { +loadStatus = SegmentStatus.LOAD_FAILURE +if (resultOfBlock._2._2.failureCauses == FailureCauses.NONE) { + updateModel.get.executorErrors.failureCauses = FailureCauses.EXECUTOR_FAILURE + updateModel.get.executorErrors.errorMsg = "Failure in the Executor." 
+} else { + updateModel.get.executorErrors = resultOfBlock._2._2 +} + } else if (resultOfBlock._2._1.getSegmentStatus == + SegmentStatus.LOAD_PARTIAL_SUCCESS) { +loadStatus = SegmentStatus.LOAD_PARTIAL_SUCCESS +updateModel.get.executorErrors.failureCauses = resultOfBlock._2._2.failureCauses +updateModel.get.executorErrors.errorMsg = resultOfBlock._2._2.errorMsg } -} else if (resultOfBlock._2._1.getSegmentStatus == - SegmentStatus.LOAD_PARTIAL_SUCCESS) { - loadStatus = SegmentStatus.LOAD_PARTIAL_SUCCESS - updateModel.get.executorErrors.failureCauses = resultOfBlock._2._2.failureCauses - updateModel.get.executorErrors.errorMsg = resultOfBlock._2._2.errorMsg } } -} - } else { -status = if (carbonTable.getPartitionInfo(carbonTable.getTableName) != null) { - loadDataForPartitionTable(sqlContext, dataFrame, carbonLoadModel, hadoopConf) -} else if (isSortTable && sortScope.equals(SortScopeOptions.SortScope.GLOBAL_SORT)) { - DataLoadProcessBuilderOnSpark.loadDataUsingGlobalSort(sqlContext.sparkSession, -dataFrame, carbonLoadModel, hadoopConf) -} else if (dataFrame.isDefined) { - loadDataFrame(sqlContext, dataFrame, carbonLoadModel) } else { - loadDataFile(sqlContext, carbonLoadModel, hadoopConf) -} -CommonUtil.mergeIndexFiles(sqlContext.sparkContext, - Seq(carbonLoadModel.getSegmentId), storePath, carbonTable, false) -val newStatusMap = scala.collection.mutable.Map.empty[String, SegmentStatus] -if (status.nonEmpty) { - status.foreach { eachLoadStatus => -val state = newStatusMap.get(eachLoadStatus._1) -state match { - case Some(SegmentStatus.LOAD_FAILURE) => -newStatusMap.put(eachLoadStatus._1, eachLoadStatus._2._1.getSegmentStatus) - case Some(SegmentStatus.LOAD_PARTIAL_SUCCESS) -if eachLoadStatus._2._1.getSegmentStatus == - SegmentStatus.SUCCESS => -newStatusMap.put(eachLoadStatus._1, eachLoadStatus._2._1.getSegmentStatus) - case _ => -newStatusMap.put(eachLoadStatus._1,
[GitHub] carbondata issue #1718: [CARBONDATA-1929][Validation]carbon property configu...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1718 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2276/ ---
[GitHub] carbondata issue #1702: [CARBONDATA-1896] Clean files operation improvement
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1702 Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1050/ ---
[GitHub] carbondata pull request #1702: [CARBONDATA-1896] Clean files operation impro...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1702#discussion_r158520072 --- Diff: processing/src/main/java/org/apache/carbondata/processing/util/DeleteLoadFolders.java --- @@ -124,27 +148,50 @@ private static boolean checkIfLoadCanBeDeleted(LoadMetadataDetails oneLoad, return false; } + private static LoadMetadataDetails getCurrentLoadStatusOfSegment(String segmentId, + String metadataPath) { +LoadMetadataDetails[] currentDetails = SegmentStatusManager.readLoadMetadata(metadataPath); +for (LoadMetadataDetails oneLoad : currentDetails) { + if (oneLoad.getLoadName().equalsIgnoreCase(segmentId)) { +return oneLoad; + } +} +return null; + } + public static boolean deleteLoadFoldersFromFileSystem( AbsoluteTableIdentifier absoluteTableIdentifier, boolean isForceDelete, - LoadMetadataDetails[] details) { + LoadMetadataDetails[] details, String metadataPath) { boolean isDeleted = false; if (details != null && details.length != 0) { for (LoadMetadataDetails oneLoad : details) { if (checkIfLoadCanBeDeleted(oneLoad, isForceDelete)) { - String path = getSegmentPath(absoluteTableIdentifier, 0, oneLoad); - boolean deletionStatus = physicalFactAndMeasureMetadataDeletion(path); - if (deletionStatus) { -isDeleted = true; -oneLoad.setVisibility("false"); -LOGGER.info("Info: Deleted the load " + oneLoad.getLoadName()); + ICarbonLock segmentLock = CarbonLockFactory.getCarbonLockObj(absoluteTableIdentifier, + CarbonTablePath.addSegmentPrefix(oneLoad.getLoadName()) + LockUsage.LOCK); + try { +if (segmentLock.lockWithRetries()) { --- End diff -- Better acquire locks for in-progress status segments not for others ---
[jira] [Closed] (CARBONDATA-1783) (Carbon1.3.0 - Streaming) Error "Failed to filter row in vector reader" when filter query executed on streaming data
[ https://issues.apache.org/jira/browse/CARBONDATA-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-1783. --- Resolution: Fixed Fix Version/s: 1.3.0 The defect is fixed in the latest 1.3.0 build and closed. > (Carbon1.3.0 - Streaming) Error "Failed to filter row in vector reader" when > filter query executed on streaming data > > > Key: CARBONDATA-1783 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1783 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.3.0 > Environment: 3 node ant cluster >Reporter: Chetan Bhat > Labels: DFX > Fix For: 1.3.0 > > > Steps :- > Spark submit thrift server is started using the command - bin/spark-submit > --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory > 5G --num-executors 3 --class > org.apache.carbondata.spark.thriftserver.CarbonThriftServer > /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar > "hdfs://hacluster/user/hive/warehouse/carbon.store" > Spark shell is launched using the command - bin/spark-shell --master > yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G > --num-executors 3 --jars > /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar > From Spark shell user creates table and loads data in the table as shown > below. 
> import java.io.{File, PrintWriter} > import java.net.ServerSocket > import org.apache.spark.sql.{CarbonEnv, SparkSession} > import org.apache.spark.sql.hive.CarbonRelation > import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery} > import org.apache.carbondata.core.constants.CarbonCommonConstants > import org.apache.carbondata.core.util.CarbonProperties > import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath} > CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, > "/MM/dd") > import org.apache.spark.sql.CarbonSession._ > val carbonSession = SparkSession. > builder(). > appName("StreamExample"). > > getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/carbon.store") > > carbonSession.sparkContext.setLogLevel("INFO") > def sql(sql: String) = carbonSession.sql(sql) > def writeSocket(serverSocket: ServerSocket): Thread = { > val thread = new Thread() { > override def run(): Unit = { > // wait for client to connection request and accept > val clientSocket = serverSocket.accept() > val socketWriter = new PrintWriter(clientSocket.getOutputStream()) > var index = 0 > for (_ <- 1 to 1000) { > // write 5 records per iteration > for (_ <- 0 to 100) { > index = index + 1 > socketWriter.println(index.toString + ",name_" + index >+ ",city_" + index + "," + (index * > 1.00).toString + >",school_" + index + ":school_" + index + > index + "$" + index) > } > socketWriter.flush() > Thread.sleep(2000) > } > socketWriter.close() > System.out.println("Socket closed") > } > } > thread.start() > thread > } > > def startStreaming(spark: SparkSession, tablePath: CarbonTablePath, > tableName: String, port: Int): Thread = { > val thread = new Thread() { > override def run(): Unit = { > var qry: StreamingQuery = null > try { > val readSocketDF = spark.readStream > .format("socket") > .option("host", "10.18.98.34") > .option("port", port) > .load() > qry = readSocketDF.writeStream > .format("carbondata") > 
.trigger(ProcessingTime("5 seconds")) > .option("checkpointLocation", tablePath.getStreamingCheckpointDir) > .option("tablePath", tablePath.getPath).option("tableName", > tableName) > .start() > qry.awaitTermination() > } catch { > case ex: Throwable => > ex.printStackTrace() > println("Done reading and writing streaming data") > } finally { > qry.stop() > } > } > } > thread.start() > thread > } > val streamTableName = "all_datatypes_2048" > sql(s"create table all_datatypes_2048 (imei string,deviceInformationId > int,MAC string,deviceColor string,device_backColor string,modelId > string,marketName string,AMSize string,ROMSize string,CUPAudit > string,CPIClocked string,series string,productionDate timestamp,bomCode > string,internalModels string, deliveryTime string, channelsId string,
[GitHub] carbondata pull request #1702: [CARBONDATA-1896] Clean files operation impro...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1702#discussion_r158519731 --- Diff: processing/src/main/java/org/apache/carbondata/processing/util/DeleteLoadFolders.java --- @@ -124,27 +148,50 @@ private static boolean checkIfLoadCanBeDeleted(LoadMetadataDetails oneLoad, return false; } + private static LoadMetadataDetails getCurrentLoadStatusOfSegment(String segmentId, + String metadataPath) { +LoadMetadataDetails[] currentDetails = SegmentStatusManager.readLoadMetadata(metadataPath); +for (LoadMetadataDetails oneLoad : currentDetails) { + if (oneLoad.getLoadName().equalsIgnoreCase(segmentId)) { +return oneLoad; + } +} +return null; + } + public static boolean deleteLoadFoldersFromFileSystem( AbsoluteTableIdentifier absoluteTableIdentifier, boolean isForceDelete, - LoadMetadataDetails[] details) { + LoadMetadataDetails[] details, String metadataPath) { boolean isDeleted = false; if (details != null && details.length != 0) { for (LoadMetadataDetails oneLoad : details) { if (checkIfLoadCanBeDeleted(oneLoad, isForceDelete)) { - String path = getSegmentPath(absoluteTableIdentifier, 0, oneLoad); - boolean deletionStatus = physicalFactAndMeasureMetadataDeletion(path); - if (deletionStatus) { -isDeleted = true; -oneLoad.setVisibility("false"); -LOGGER.info("Info: Deleted the load " + oneLoad.getLoadName()); + ICarbonLock segmentLock = CarbonLockFactory.getCarbonLockObj(absoluteTableIdentifier, + CarbonTablePath.addSegmentPrefix(oneLoad.getLoadName()) + LockUsage.LOCK); + try { +if (segmentLock.lockWithRetries()) { --- End diff -- It will add up time for each time dataload happens while trying to take lock if parallel load happens. Better add another method `lockWithRetries` which should take very less time while acquiring lock ---
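The review thread above converges on an overload of `lockWithRetries` that takes explicit retry and timeout arguments, so that clean-files and concurrent loads spend a bounded, short time attempting a segment lock instead of the default (longer) retry policy. A minimal sketch of that shape, using a plain `java.util.concurrent` lock rather than CarbonData's `ICarbonLock`/`CarbonLockFactory` machinery, with illustrative names:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of the overload discussed in the review comments:
// bounded retries with a per-attempt timeout, so callers that cannot get the
// segment lock quickly skip the segment instead of stalling the operation.
// This does not reproduce CarbonData's actual ICarbonLock implementation.
public class SegmentLockSketch {

  private final ReentrantLock lock = new ReentrantLock();

  // Try up to `retries` times, waiting at most `timeoutMs` per attempt.
  public boolean lockWithRetries(int retries, long timeoutMs) {
    for (int attempt = 0; attempt < retries; attempt++) {
      try {
        if (lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
          return true;
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    // Caller treats failure as "segment busy" and moves on.
    return false;
  }

  public void unlock() {
    lock.unlock();
  }
}
```

This keeps the per-segment cost of a failed acquisition at roughly `retries * timeoutMs`, which addresses the concern that taking locks during parallel loads would otherwise add unbounded time to each data load.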
[jira] [Resolved] (CARBONDATA-1930) Dictionary not found exception is thrown when filter expression is given in aggergate table query
[ https://issues.apache.org/jira/browse/CARBONDATA-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kumar vishal resolved CARBONDATA-1930. -- Resolution: Fixed Fix Version/s: 1.3.0 > Dictionary not found exception is thrown when filter expression is given in > aggergate table query > - > > Key: CARBONDATA-1930 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1930 > Project: CarbonData > Issue Type: Bug >Reporter: Kunal Kapoor >Assignee: Kunal Kapoor >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Steps to reproduce; > 1. CREATE TABLE filtertable(id int, name string, city string, age string) > STORED BY 'org.apache.carbondata.format' > TBLPROPERTIES('dictionary_include'='name,age') > 2. LOAD DATA LOCAL INPATH > 3. create datamap agg9 on table filtertable using 'preaggregate' as select > name, age, sum(age) from filtertable group by name, age > 4. select name, sum(age) from filtertable where age = '29' group by name, age -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata pull request #1710: [CARBONDATA-1930] Added condition to refer to...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1710 ---
[GitHub] carbondata issue #1716: [CARBONDATA-1933] Support Spark 2.2.1 in carbon part...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1716 retest sdv please ---
[GitHub] carbondata issue #1719: [WIP] [CARBONDATA-1731,CARBONDATA-1728] Update fails...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1719 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2522/ ---
[GitHub] carbondata issue #1708: [CARBONDATA-1928] Seperate the properties for timeou...
Github user KanakaKumar commented on the issue: https://github.com/apache/carbondata/pull/1708 retest sdv please ---