[GitHub] carbondata issue #1032: [CARBONDATA-1149] Fixed range info overlapping value...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1032 Not required as partition feature is re-implemented. ---
[GitHub] carbondata pull request #1830: [CARBONDATA-2051] Added like query ends with ...
GitHub user manishgupta88 opened a pull request: https://github.com/apache/carbondata/pull/1830 [CARBONDATA-2051] Added like query ends with and contains with filter push down support to carbondata **Problem** Currently only a like filter with a starts-with expression is pushed down to carbondata. In the case of ends-with and contains like filters, all the data is given back to spark and spark then applies the filter on it. This behavior is fine for queries with a smaller number of queried columns, but as the number of columns and the data volume increase there is a performance impact, because more data is sent to spark, thereby increasing the IO. If the like filter is pushed down, the filter column is read first and blocks are pruned. In this case the data returned to spark has already been filtered, and only the blocklets matching the data are fully read. This reduces IO and improves query performance. **Solution** Modify the code to push down like queries with ends-with and contains filters - [ ] Any interfaces changed? No - [ ] Any backward compatibility impacted? No - [ ] Document update required? No - [ ] Testing done Added test case to verify push down is happening - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishgupta88/carbondata like_query_pushdown Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1830.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1830 commit bff9fbf316941d0c732c04d3c9b7e775285ec893 Author: manishgupta88 <tomanishgupta18@...> Date: 2018-01-18T08:53:17Z Added like query ends with and contains with filter push down suport to carbondata ---
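The pushdown decision described in this PR hinges on classifying a SQL LIKE pattern into a starts-with, ends-with, or contains expression. A minimal illustrative sketch of that classification (the class, enum, and method names are invented for this example, not CarbonData's actual API):

```java
// Sketch: classify a SQL LIKE pattern so the matching filter can be pushed
// down to the storage layer instead of returning all rows to Spark.
// All names here are hypothetical.
public class LikeClassifier {
    enum Kind { STARTS_WITH, ENDS_WITH, CONTAINS, OTHER }

    static Kind classify(String pattern) {
        boolean leading = pattern.startsWith("%");
        boolean trailing = pattern.endsWith("%");
        // 'abc%' -> starts-with, '%abc' -> ends-with, '%abc%' -> contains
        if (!leading && trailing) return Kind.STARTS_WITH;
        if (leading && !trailing) return Kind.ENDS_WITH;
        if (leading && trailing && pattern.length() > 1) return Kind.CONTAINS;
        // e.g. 'a%b' or a pattern without wildcards: leave evaluation to Spark
        return Kind.OTHER;
    }
}
```

Before this change only the STARTS_WITH case would be pushed down; the PR extends pushdown to the ENDS_WITH and CONTAINS cases as well.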
[GitHub] carbondata pull request #1810: [CARBONDATA-2037]Store carbondata locations i...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1810#discussion_r162253805 --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java --- @@ -598,16 +627,16 @@ public boolean isScanRequired(FilterResolverIntf filterExp) { if (filterExp == null) { int rowCount = unsafeMemoryDMStore.getRowCount(); for (int i = 0; i < rowCount; i++) { -DataMapRow unsafeRow = unsafeMemoryDMStore.getUnsafeRow(i); -blocklets.add(createBlocklet(unsafeRow, i)); +DataMapRow safeRow = unsafeMemoryDMStore.getUnsafeRow(i).convertToSafeRow(); +blocklets.add(createBlocklet(safeRow, i)); } } else { int startIndex = findStartIndex(convertToRow(searchStartKey), comparator); int endIndex = findEndIndex(convertToRow(searchEndKey), comparator); FilterExecuter filterExecuter = FilterUtil.getFilterExecuterTree(filterExp, segmentProperties, null); while (startIndex <= endIndex) { -DataMapRow unsafeRow = unsafeMemoryDMStore.getUnsafeRow(startIndex); +DataMapRow unsafeRow = unsafeMemoryDMStore.getUnsafeRow(startIndex).convertToSafeRow(); --- End diff -- Please rename this variable also to safeRow ---
[GitHub] carbondata pull request #1813: [CARBONDATA-2015] Restricted maximum length o...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1813#discussion_r161963254 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/NonDictionaryFieldConverterImpl.java --- @@ -70,13 +70,14 @@ public NonDictionaryFieldConverterImpl(DataField dataField, String nullformat, i try { byte[] value = DataTypeUtil .getBytesBasedOnDataTypeForNoDictionaryColumn(dimensionValue, dataType, dateFormat); -if (dataType == DataTypes.STRING) { - assert value.length <= CarbonCommonConstants.MAX_CHARS_PER_COLUMN_DEFAULT; +if (dataType == DataTypes.STRING +&& value.length > CarbonCommonConstants.MAX_CHARS_PER_COLUMN_DEFAULT) { + throw new CarbonDataLoadingException("Dataload failed, String size cannot exceed " + + CarbonCommonConstants.MAX_CHARS_PER_COLUMN_DEFAULT + " bytes"); } row.update(value, index); - } catch (AssertionError ae) { -throw new CarbonDataLoadingException("Dataload failed, String size cannot exceed " -+ CarbonCommonConstants.MAX_CHARS_PER_COLUMN_DEFAULT + " bytes"); + } catch (CarbonDataLoadingException e) { --- End diff -- Kindly check and confirm if CarbonDataLoadingException can come from any other place also in this flow ---
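The check under review enforces the 32000-byte limit on the encoded byte length rather than the character count, and replaces the assertion with an explicit exception. A hedged sketch of that validation (class and method names invented for illustration; the constant mirrors CarbonCommonConstants.MAX_CHARS_PER_COLUMN_DEFAULT):

```java
import java.nio.charset.StandardCharsets;

public class ColumnValueValidator {
    // Mirrors CarbonCommonConstants.MAX_CHARS_PER_COLUMN_DEFAULT (32000).
    static final int MAX_BYTES_PER_COLUMN = 32000;

    static byte[] toValidatedBytes(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        // Multi-byte characters mean the byte length can exceed the char
        // count, so the limit must be enforced on the encoded bytes.
        if (bytes.length > MAX_BYTES_PER_COLUMN) {
            throw new RuntimeException("Dataload failed, String size cannot exceed "
                + MAX_BYTES_PER_COLUMN + " bytes");
        }
        return bytes;
    }
}
```

Throwing the exception directly, as the diff does, avoids relying on JVM assertions (which are disabled by default) to enforce a data-load invariant.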
[GitHub] carbondata pull request #1818: [CARBONDATA-2020][Old Store Support] Add filt...
GitHub user manishgupta88 opened a pull request: https://github.com/apache/carbondata/pull/1818 [CARBONDATA-2020][Old Store Support] Add filter support for old store reading to improve query performance **Problem** For old stores, the blocklet-level min/max comparison was not happening on the executor side, due to which all the blocklets were getting scanned. This increased the IO and scanning time in the executor. **Solution** Modified the code to retrieve the min/max values from the blocklet node and use them for comparison while scanning for valid blocklets. **Note:** This PR needs to be merged after #1796 - [ ] Any interfaces changed? No - [ ] Any backward compatibility impacted? No - [ ] Document update required? No - [ ] Testing done Yes. Manually verified - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishgupta88/carbondata filter_support_for_old_store Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1818.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1818 commit b0c7f1211c74f478fe5f561872b655a1869d6756 Author: manishgupta88 <tomanishgupta18@...> Date: 2018-01-17T05:57:09Z Add filter support for old store reading ---
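Min/max pruning as described here means a blocklet is scanned only if the filter value can fall inside its [min, max] range; everything else is skipped without any IO. A simplified sketch for a single numeric column (illustrative only, not CarbonData's actual filter executor):

```java
import java.util.ArrayList;
import java.util.List;

public class MinMaxPruner {
    static class Blocklet {
        final int id;
        final long min;
        final long max;
        Blocklet(int id, long min, long max) { this.id = id; this.min = min; this.max = max; }
    }

    // Keep only blocklets whose [min, max] range can contain the filter
    // value for an equality filter; the rest are never read from disk.
    static List<Blocklet> prune(List<Blocklet> blocklets, long filterValue) {
        List<Blocklet> selected = new ArrayList<>();
        for (Blocklet b : blocklets) {
            if (filterValue >= b.min && filterValue <= b.max) {
                selected.add(b);
            }
        }
        return selected;
    }
}
```

For old stores the min/max values were not reaching this comparison on the executor, so every blocklet survived pruning and was scanned; the PR restores the comparison by reading min/max from the blocklet node.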
[GitHub] carbondata issue #1789: [CARBONDATA-2020] Fix avoid reading of all block inf...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1789 LGTM ---
[GitHub] carbondata pull request #1810: [WIP]Store carbondata locations in datamap to...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1810#discussion_r161739311 --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java --- @@ -604,18 +731,26 @@ private ExtendedBlocklet createBlocklet(DataMapRow row, int blockletId) { detailInfo.setBlockletId((short) blockletId); detailInfo.setDimLens(columnCardinality); detailInfo.setSchemaUpdatedTimeStamp(row.getLong(SCHEMA_UPADATED_TIME_INDEX)); -BlockletInfo blockletInfo = new BlockletInfo(); -try { - byte[] byteArray = row.getByteArray(BLOCK_INFO_INDEX); - ByteArrayInputStream stream = new ByteArrayInputStream(byteArray); - DataInputStream inputStream = new DataInputStream(stream); - blockletInfo.readFields(inputStream); - inputStream.close(); -} catch (IOException e) { - throw new RuntimeException(e); +byte[] byteArray = row.getByteArray(BLOCK_INFO_INDEX); +BlockletInfo blockletInfo = null; +if (byteArray.length > 0) { + try { +blockletInfo = new BlockletInfo(); +ByteArrayInputStream stream = new ByteArrayInputStream(byteArray); +DataInputStream inputStream = new DataInputStream(stream); +blockletInfo.readFields(inputStream); +inputStream.close(); +blocklet.setLocation( --- End diff -- I think this should be outside if block because even if blocklet info does not exist in case of old store still we need to set the file location ---
[GitHub] carbondata issue #1785: [CARBONDATA-2015] Restricted maximum length of bytes...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1785 LGTM ---
[GitHub] carbondata pull request #1796: [WIP] Modified code to add relative blocklet ...
GitHub user manishgupta88 opened a pull request: https://github.com/apache/carbondata/pull/1796 [WIP] Modified code to add relative blocklet id during initialization in the blocklet data map Modified code to add relative blocklet id during initialization in the blocklet data map Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishgupta88/carbondata add_blocklet_id_to_data_map Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1796.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1796 commit 0ba193ef4a53f6bd480c259126622f18c7e80ad8 Author: manishgupta88 <tomanishgupta18@...> Date: 2018-01-12T12:58:56Z Modified code to add relative blocklet id during initialization in the blocklet data map ---
[GitHub] carbondata issue #1769: [CARBONDATA-2014]update table status for failure onl...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1769 LGTM...will merge once all builds run ---
[GitHub] carbondata pull request #1785: [CARBONDATA-2015] Restricted maximum length o...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1785#discussion_r161144886 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala --- @@ -113,6 +113,9 @@ object CarbonScalaUtil { case s: String => if (s.length > CSVInputFormat.MAX_CHARS_PER_COLUMN_DEFAULT) { throw new Exception("Dataload failed, String length cannot exceed " + CSVInputFormat.MAX_CHARS_PER_COLUMN_DEFAULT + " characters") +} else if (ByteUtil.toBytes(s).length > CSVInputFormat.MAX_CHARS_PER_COLUMN_DEFAULT) { --- End diff -- this can impact the data load performance. Move this check to the CarbonDictionaryWriterImpl class where we are already converting the string values to byte arrays. This will not add any extra cost for the conversion of string to byte array ---
[GitHub] carbondata pull request #1785: [CARBONDATA-2015] Restricted maximum length o...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1785#discussion_r161144455 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala --- @@ -187,6 +187,75 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterAll { } catch { case _:Exception => assert(true) } + } + + test("test load / insert / update with data more than 32000 bytes - dictionary_exclude") { +val testdata = s"$resourcesPath/unicodechar.csv" +sql("drop table if exists load32000bytes") +sql("create table load32000bytes(name string) stored by 'carbondata'") +sql("insert into table load32000bytes select 'aaa'") + +assert(intercept[Exception] { + sql(s"load data local inpath '$testdata' into table load32000bytes OPTIONS ('FILEHEADER'='name')") +}.getMessage.equals("DataLoad failure: Dataload failed, String size cannot exceed 32000 bytes")) + +val source = scala.io.Source.fromFile(testdata) +val data = source.mkString + +try { + sql(s"insert into load32000bytes values('$data')") + assert(false) +} catch { + case _: Exception => +assert(true) +} + +try { --- End diff -- same comment as above ---
[GitHub] carbondata pull request #1785: [CARBONDATA-2015] Restricted maximum length o...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1785#discussion_r161144939 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala --- @@ -187,6 +187,75 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterAll { } catch { case _:Exception => assert(true) } + } + + test("test load / insert / update with data more than 32000 bytes - dictionary_exclude") { +val testdata = s"$resourcesPath/unicodechar.csv" +sql("drop table if exists load32000bytes") +sql("create table load32000bytes(name string) stored by 'carbondata'") +sql("insert into table load32000bytes select 'aaa'") + +assert(intercept[Exception] { + sql(s"load data local inpath '$testdata' into table load32000bytes OPTIONS ('FILEHEADER'='name')") +}.getMessage.equals("DataLoad failure: Dataload failed, String size cannot exceed 32000 bytes")) + +val source = scala.io.Source.fromFile(testdata) +val data = source.mkString + +try { + sql(s"insert into load32000bytes values('$data')") + assert(false) +} catch { + case _: Exception => +assert(true) +} + +try { + sql(s"update load32000bytes set(name)= ('$data')").show() + assert(false) +} catch { + case _: Exception => +assert(true) +} + +sql("drop table if exists load32000bytes") + + } + + test("test load / insert / update with data more than 32000 bytes - dictionary_include") { +val testdata = s"$resourcesPath/unicodechar.csv" +sql("drop table if exists load32000bytes") +sql("create table load32000bytes(name string) stored by 'carbondata' TBLPROPERTIES('DICTIONARY_INCLUDE'='name')") +sql("insert into table load32000bytes select 'aaa'") + +val error = intercept[Exception] { + sql(s"load data local inpath '$testdata' into table load32000bytes OPTIONS ('FILEHEADER'='name')") +}.getMessage + +assert(intercept[Exception] { + sql(s"load data local inpath '$testdata' into table load32000bytes OPTIONS ('FILEHEADER'='name')") +}.getMessage.contains("generate global dictionary failed, Dataload failed, String size cannot exceed 32000 bytes")) + +val source = scala.io.Source.fromFile(testdata) +val data = source.mkString + +try { + sql(s"insert into load32000bytes values('$data')") + assert(false) +} catch { + case _: Exception => +assert(true) +} + +try { --- End diff -- avoid using try catch at all the places in the modified code ---
[GitHub] carbondata pull request #1785: [CARBONDATA-2015] Restricted maximum length o...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1785#discussion_r161144435 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala --- @@ -187,6 +187,75 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterAll { } catch { case _:Exception => assert(true) } + } + + test("test load / insert / update with data more than 32000 bytes - dictionary_exclude") { +val testdata = s"$resourcesPath/unicodechar.csv" +sql("drop table if exists load32000bytes") +sql("create table load32000bytes(name string) stored by 'carbondata'") +sql("insert into table load32000bytes select 'aaa'") + +assert(intercept[Exception] { + sql(s"load data local inpath '$testdata' into table load32000bytes OPTIONS ('FILEHEADER'='name')") +}.getMessage.equals("DataLoad failure: Dataload failed, String size cannot exceed 32000 bytes")) + +val source = scala.io.Source.fromFile(testdata) +val data = source.mkString + +try { + sql(s"insert into load32000bytes values('$data')") --- End diff -- avoid using try catch and use intercept[Exception] ---
[GitHub] carbondata pull request #1769: [CARBONDATA-2014]update table status for fail...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1769#discussion_r161143694 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala --- @@ -251,7 +255,9 @@ case class CarbonLoadDataCommand( case ex: Exception => LOGGER.error(ex) // update the load entry in table status file for changing the status to marked for delete - CarbonLoaderUtil.updateTableStatusForFailure(carbonLoadModel) + if (isUpdateTableStatusRequired && !table.isHivePartitionTable) { --- End diff -- !table.isHivePartitionTable...this condition is not required while calling updateTableStatusForFailure, as the boolean flag isUpdateTableStatusRequired will be set to true only if the table is not a hive partition table ---
[GitHub] carbondata issue #1782: [CARBONDATA-2019] Enhancement of merge index compact...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1782 @ravipesala ..handled review comments...kindly review and merge ---
[GitHub] carbondata issue #1782: [WIP] Changes for creating carbon index merge file f...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1782 retest sdv please ---
[GitHub] carbondata pull request #1782: [WIP] Changes for creating carbon index merge...
GitHub user manishgupta88 opened a pull request: https://github.com/apache/carbondata/pull/1782 [WIP] Changes for creating carbon index merge file from old store which did not contain the blocklet info in index information Changes for creating carbon index merge file from old store which did not contain the blocklet info in index information - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishgupta88/carbondata enhace_segment_index_compaction Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1782.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1782 commit 5f4a75f7395c8949a5ccf53030ebdcba5411022f Author: manishgupta88 <tomanishgupta18@...> Date: 2018-01-09T15:02:36Z changes for creating carbon index merge file from old store which did not contain the blocklet info in index information ---
[GitHub] carbondata issue #1702: [CARBONDATA-1896] Clean files operation improvement
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1702 LGTM ---
[GitHub] carbondata issue #1718: [CARBONDATA-1929][Validation]carbon property configu...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1718 LGTM ---
[GitHub] carbondata issue #1734: [CARBONDATA-1912] Handling lock issues for alter ren...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1734 LGTM ---
[GitHub] carbondata issue #1734: [CARBONDATA-1912] Handling lock issues for alter ren...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1734 LGTM...will merge after all builds run ---
[GitHub] carbondata issue #1760: [CARBONDATA-1979][IMPLICIT COLUMN] Modified implic...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1760 LGTM ---
[GitHub] carbondata issue #1740: [CARBONDATA-1949, CARBONDATA-1950] Fixed bug related...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1740 LGTM...I will merge once all builds complete ---
[GitHub] carbondata pull request #1747: [Compatibility] Added changes for backward co...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1747#discussion_r159589331 --- Diff: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java --- @@ -231,7 +231,22 @@ public String getSchemaFilePath() { * @return schema file path */ public static String getSchemaFilePath(String tablePath) { -return tablePath + File.separator + METADATA_DIR + File.separator + SCHEMA_FILE; +return getActualSchemaFilePath(tablePath); + } + + private static String getActualSchemaFilePath(String tablePath) { +String metaPath = tablePath + File.separator + METADATA_DIR; --- End diff -- use CarbonCommonConstants.FILE_SEPARATOR instead of File.separator ---
[GitHub] carbondata pull request #1747: [Compatibility] Added changes for backward co...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1747#discussion_r159589266 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala --- @@ -291,4 +291,52 @@ object CarbonScalaUtil { }) otherFields } + + /** + * If the table is from an old store then the table parameters are in lowercase. In the current + * code we are reading the parameters as camel case. + * This method will convert all the schema parts to camel case + * + * @param parameters + * @return + */ + def getDeserializedParameters(parameters: Map[String, String]): Map[String, String] = { --- End diff -- there is no calling point for this method ---
[GitHub] carbondata pull request #1747: [Compatibility] Added changes for backward co...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1747#discussion_r159589250 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala --- @@ -291,4 +291,52 @@ object CarbonScalaUtil { }) otherFields } + + /** + * If the table is from an old store then the table parameters are in lowercase. In the current + * code we are reading the parameters as camel case. + * This method will convert all the schema parts to camel case + * + * @param parameters + * @return + */ + def getDeserializedParameters(parameters: Map[String, String]): Map[String, String] = { +val keyParts = parameters.getOrElse("spark.sql.sources.options.keys.numparts", "0").toInt +if (keyParts == 0) { + parameters +} else { + var keyStr = "" + for(i <- 0 until keyParts) { +keyStr += parameters(s"spark.sql.sources.options.keys.part.$i") + } + val finalProperties = scala.collection.mutable.Map.empty[String, String] + keyStr.split(",") foreach { --- End diff -- Please check the logic here. Keystr is not appended explicitly with , but splitting is done based on comma ---
[GitHub] carbondata pull request #1082: [CARBONDATA-1218] [GLOBAL SORT] In case of da...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1082#discussion_r159586543 --- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java --- @@ -145,6 +147,15 @@ public static void renameBadRecordsFromInProgressToNormal( } } + public static void renameBadRecord(CarbonDataLoadConfiguration configuration) { +// rename the bad record in progress to normal +CarbonTableIdentifier identifier = +configuration.getTableIdentifier().getCarbonTableIdentifier(); + CarbonDataProcessorUtil.renameBadRecordsFromInProgressToNormal(configuration, --- End diff -- 1. Avoid calling the method using classname. Method is present in the same class. 2. I think we can introduce a new class called BadRecordsUtil which will have all the functions related to bad records like renameBadRecord, hasBadRecord, renameBadRecordsFromInProgressToNormal (make it private in the new util class) ---
[GitHub] carbondata pull request #1082: [CARBONDATA-1218] [GLOBAL SORT] In case of da...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1082#discussion_r159584326 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/badrecordloger/BadRecordLoggerTest.scala --- @@ -247,6 +248,33 @@ class BadRecordLoggerTest extends QueryTest with BeforeAndAfterAll { } } + test( +"test if first load failed due to bad record then second load should not failed if there is " + +"no bad record") { +sql("drop table IF EXISTS loadIssue") +sql("""CREATE TABLE IF NOT EXISTS loadIssue(ID BigInt, item String) STORED BY 'carbondata'""") + CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, + "FAIL").addProperty(CarbonCommonConstants.LOAD_SORT_SCOPE, "GLOBAL_SORT") +try { + sql("insert into loadIssue select 'x','Book'"); +} catch { + case ex: Exception => +assert(true) +} +try { + sql("insert into loadIssue select 1,'Book'"); +} catch { + case ex: Exception => +assert(false) +} finally { + CarbonProperties.getInstance() +.addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "FORCE") +.addProperty(CarbonCommonConstants.LOAD_SORT_SCOPE, "LOCAL_SORT") +} +assert(true) +sql("drop table IF EXISTS loadIssue") + } + --- End diff -- 1. Use only one try and finally block inside test code..try should start from first line and end at last and finally should only reset the carbon property 2. Instead of try catch use intercept[Exception] only for the 1st case where exception is expected. For 2nd case try catch is not required...if exception comes anyways test case will fail ---
[GitHub] carbondata pull request #1732: [CARBONDATA-1946] Exception thrown after alte...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1732#discussion_r159585172 --- Diff: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java --- @@ -775,4 +775,25 @@ public static DataType valueOf(String name) { } } -} \ No newline at end of file + /** + * Method to type case the data based on modified data type. This method will used for + * retrieving the data after change in data type restructure operation + * + * @param data + * @param restructuredDataType + * @param currentDataOffset + * @param length + * @return + */ + public static long getDataBasedOnRestructuredDataType(byte[] data, DataType restructuredDataType, --- End diff -- yes because alter datatype change is only for Int to BigInt and decimal from lower precision to higher precision. So float to double case is not required to be handled ---
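The method under review reads old data through a column whose type was altered, and per the comment above only the Int-to-BigInt (and decimal precision) paths matter. A simplified sketch of the widening read (hypothetical names; only the Int-to-BigInt case is shown):

```java
import java.nio.ByteBuffer;

public class RestructureReader {
    // After ALTER TABLE changes a column from INT to BIGINT, old pages still
    // hold 4-byte values; the stored length tells us which width was written.
    static long readAsLong(byte[] data, int offset, int length) {
        ByteBuffer buffer = ByteBuffer.wrap(data, offset, length);
        if (length == 4) {
            return buffer.getInt();  // old store: widen the int to a long
        }
        return buffer.getLong();     // new store: value already 8 bytes wide
    }
}
```

A float-to-double branch is deliberately absent, matching the reviewer's point that the alter-datatype feature only permits Int to BigInt and lower-to-higher decimal precision.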
[GitHub] carbondata pull request #1732: [CARBONDATA-1946] Exception thrown after alte...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1732#discussion_r159585515 --- Diff: integration/spark2/src/test/scala/org/apache/spark/carbondata/restructure/vectorreader/ChangeDataTypeTestCases.scala --- @@ -164,6 +164,21 @@ class ChangeDataTypeTestCases extends Spark2QueryTest with BeforeAndAfterAll { sql("drop table if exists PreAggMain_preagg1") } + test("test data type change for dictionary exclude INT type column") { --- End diff -- Alter table datatype change does not support short to int, short to long and float to double. It only supports Int to BigInt and decimal from lower precision to higher precision. If any other data type change request comes validation is done and exception is thrown ---
[GitHub] carbondata issue #1754: [CARBONDATA-1975] Wrong input metrics displayed for ...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1754 LGTM...please attach the test results snapshots when the jira and github issue is resolved ---
[GitHub] carbondata issue #1734: [CARBONDATA-1912] Handling lock issues for alter ren...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1734 @ManoharVanam ...In that case can we change the logger error to a warning for the non-existence of the file, and check whether the warning log still gets printed. If the warning logs are not getting printed in the console then we can make a change in the logic ---
[GitHub] carbondata issue #1740: [CARBONDATA-1949, CARBONDATA-1950] Fixed bug related...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1740 @geetikagupta16 Can you please add a test case for the describe formatted command, wherein you can take all the rows from the 1st column, assert on the total rows and match each row value with the expected output...Also resolve the conflicts ---
[GitHub] carbondata issue #1734: [CARBONDATA-1912] Handling lock issues for alter ren...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1734 @ManoharVanam...what is the need of releasing the locks in the catch block when the finally block ensures that the lock is released? ---
[GitHub] carbondata issue #1732: [CARBONDATA-1946] Exception thrown after alter data ...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1732 retest this please ---
[GitHub] carbondata issue #1718: [CARBONDATA-1929][Validation]carbon property configu...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1718 @mohammadshahidkhan ..IDG update is required for the min and max values of the properties in this PR ---
[GitHub] carbondata issue #1703: [CARBONDATA-1917] While loading, check for stale dic...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1703 @dhatchayani ...From my perspective I think this PR is not required. This is a case where the user intentionally deletes the dictionary and sort index files from the file system and keeps only the dictionary meta file. But in no case should the user be allowed to delete any of the carbon store files. Therefore I think we should not handle this scenario. ---
[GitHub] carbondata issue #1723: [CARBONDATA-1939] Added show segments validation tes...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1723 LGTM ---
[GitHub] carbondata issue #1732: [CARBONDATA-1946] Exception thrown after alter data ...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1732 retest this please ---
[GitHub] carbondata pull request #1732: [WIP] Fixed select query failure after alter ...
GitHub user manishgupta88 opened a pull request: https://github.com/apache/carbondata/pull/1732 [WIP] Fixed select query failure after alter change data type operation for dictionary exclude columns Fixed select query failure after alter change data type operation for dictionary exclude columns Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishgupta88/carbondata select_ater_change_datatype_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1732.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1732 commit 31baf92318a033e0ea2fba06f4cc94dc38146770 Author: manishgupta88 <tomanishgupta18@...> Date: 2017-12-27T17:39:58Z Handle select query failure after alter change data tye operation for dictionary exclude columns ---
[GitHub] carbondata issue #1721: [CARBONDATA-1822] Documentation - Added REFRESH TABL...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1721 LGTM ---
[GitHub] carbondata issue #1708: [CARBONDATA-1928] Seperate the properties for timeou...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1708 LGTM ---
[GitHub] carbondata issue #1715: [CARBONDATA-1934] Incorrect results are returned by ...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1715 retest this please ---
[GitHub] carbondata issue #1696: [CARBONDATA-1884] SDV test cases for CTAS support to...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1696 LGTM ---
[GitHub] carbondata pull request #1715: [CARBONDATA-1934] Incorrect results are retur...
GitHub user manishgupta88 opened a pull request: https://github.com/apache/carbondata/pull/1715 [CARBONDATA-1934] Incorrect results are returned by select query in case when the number of blocklets for one part file are > 1 in the same task Problem: When a select query is triggered, the driver prunes the segments and gives a list of blocklets that need to be scanned. The number of tasks from Spark will be equal to the number of blocklets identified. In the case where one task has more than one blocklet for the same part file, the BlockExecutionInfo getting formed is incorrect, and due to this the query results are incorrect. Fix: Use the abstract index to fill all the details in BlockExecutionInfo - [ ] Any interfaces changed? No - [ ] Any backward compatibility impacted? No - [ ] Document update required? No - [ ] Testing done Manual testing - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishgupta88/carbondata data_loss_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1715.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1715 commit b0c518d4aa7d4b2387899deefc0f9ed39b5c463c Author: manishgupta88 <tomanishgupta18@...> Date: 2017-12-22T10:35:31Z Incorrect results are returned by select query in case when the number of blocklets for one part file are > 1 in the same task ---
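The failure mode described above can be sketched outside CarbonData: when a task holds several blocklets of the same part file, the per-file execution details must be derived from the file-level (abstract) index covering all of them, not from the first blocklet alone. The `Blocklet` type and `groupByFile` helper below are illustrative assumptions, not CarbonData's API.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BlockletGrouping {
  // A blocklet is identified here only by its part file and an id (illustrative).
  static final class Blocklet {
    final String filePath;
    final int blockletId;
    Blocklet(String filePath, int blockletId) {
      this.filePath = filePath;
      this.blockletId = blockletId;
    }
  }

  // Group a task's blocklets by part file so per-file execution info can be
  // built once from the index that covers all of the file's blocklets.
  static Map<String, List<Blocklet>> groupByFile(List<Blocklet> blocklets) {
    return blocklets.stream().collect(Collectors.groupingBy(b -> b.filePath));
  }
}
```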
[GitHub] carbondata issue #1311: [CARBONDATA-1439] Wrong Error message shown for Bad ...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1311 LGTM ---
[GitHub] carbondata issue #1126: [CARBONDATA-1258] CarbonData should not allow loadin...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1126 LGTM ---
[GitHub] carbondata issue #1311: [CARBONDATA-1439] Wrong Error message shown for Bad ...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1311 retest this please ---
[GitHub] carbondata issue #1670: [CARBONDATA-1899] Add CarbonData concurrency test ca...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1670 LGTM ---
[GitHub] carbondata issue #1670: [CARBONDATA-1899] Add CarbonData concurrency test ca...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1670 LGTM...once build runs I will merge ---
[GitHub] carbondata issue #1676: [CARBONDATA-1906] Update registerTempTable method be...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1676 @xubo245 ...the actual code modification is in 2-3 files; the remaining files have been modified only for import changes. Kindly revert the unwanted file changes ---
[GitHub] carbondata pull request #1670: [CARBONDATA-1899] Add CarbonData concurrency ...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1670#discussion_r157947962

--- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/ConcurrencyTest.scala ---
@@ -0,0 +1,355 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.examples
+
+import java.io.File
+import java.util
+import java.util.concurrent.{Callable, Executors, Future, TimeUnit}
+
+import scala.util.Random
+
+import org.apache.spark.sql.{DataFrame, Row, SaveMode, SparkSession}
+import org.apache.spark.sql.types._
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+// scalastyle:off println
+object ConcurrencyTest {
+
+  var totalNum = 100 * 1000 * 1000
+  var ThreadNum = 16
+  var TaskNum = 100
+  var ResultIsEmpty = true
+  val cardinalityId = 1 * 1
+  val cardinalityCity = 6
+
+  def parquetTableName: String = "comparetest_parquet"
+
+  def orcTableName: String = "comparetest_orc"
+
+  def carbonTableName(version: String): String = s"comparetest_carbonV$version"
+
+  // Table schema:
+  // +-------------+-----------+-------------+-------------+------------+
+  // | Column name | Data type | Cardinality | Column type | Dictionary |
+  // +-------------+-----------+-------------+-------------+------------+
+  // | id          | string    | 100,000,000 | dimension   | no         |
+  // | city        | string    | 6           | dimension   | yes        |
+  // | country     | string    | 6           | dimension   | yes        |
+  // | planet      | string    | 100,007     | dimension   | yes        |
+  // | m1          | short     | NA          | measure     | no         |
+  // | m2          | int       | NA          | measure     | no         |
+  // | m3          | big int   | NA          | measure     | no         |
+  // | m4          | double    | NA          | measure     | no         |
+  // | m5          | decimal   | NA          | measure     | no         |
+  // +-------------+-----------+-------------+-------------+------------+
+
+  private def generateDataFrame(spark: SparkSession): DataFrame = {
+    val rdd = spark.sparkContext
+      .parallelize(1 to totalNum, 4)
+      .map { x =>
+        ((x % 1).toString, "city" + x % 6, "country" + x % 6, "planet" + x % 10007,
+          (x % 16).toShort, x / 2, (x << 1).toLong, x.toDouble / 13,
+          BigDecimal.valueOf(x.toDouble / 11))
+      }.map { x =>
+        Row(x._1, x._2, x._3, x._4, x._5, x._6, x._7, x._8, x._9)
+      }
+
+    val schema = StructType(
+      Seq(
+        StructField("id", StringType, nullable = false),
+        StructField("city", StringType, nullable = false),
+        StructField("country", StringType, nullable = false),
+        StructField("planet", StringType, nullable = false),
+        StructField("m1", ShortType, nullable = false),
+        StructField("m2", IntegerType, nullable = false),
+        StructField("m3", LongType, nullable = false),
+        StructField("
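A standalone sketch of the cardinality-by-modulo generation in the ConcurrencyTest diff above: each dimension's distinct-value count is controlled by the modulus applied to the row id, and no Spark is needed to see the effect. `cityOf` and `planetOf` are hypothetical helpers mirroring the expressions in the diff.

```java
public class CardinalitySketch {
  // Mirrors "city" + x % 6 from the diff: at most 6 distinct values.
  static String cityOf(int x) {
    return "city" + (x % 6);
  }

  // Mirrors "planet" + x % 10007 from the diff: at most 10007 distinct values.
  static String planetOf(int x) {
    return "planet" + (x % 10007);
  }
}
```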
[GitHub] carbondata pull request #1670: [CARBONDATA-1899] Add CarbonData concurrency ...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1670#discussion_r157947459 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/ConcurrencyTest.scala --- ---
[GitHub] carbondata pull request #1670: [CARBONDATA-1899] Add CarbonData concurrency ...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1670#discussion_r157946201 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/ConcurrencyTest.scala --- ---
[GitHub] carbondata issue #1670: [CARBONDATA-1899] Add CarbonData concurrency test ca...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1670 retest sdv please ---
[GitHub] carbondata issue #1126: [CARBONDATA-1258] CarbonData should not allow loadin...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1126 retest this please ---
[GitHub] carbondata issue #1679: [CARBONDATA-1907] Avoid unnecessary logging to impro...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1679 @kumarvishal09 ..please review ---
[GitHub] carbondata pull request #1679: [CARBONDATA-1907] Avoid unnecessary logging t...
GitHub user manishgupta88 opened a pull request: https://github.com/apache/carbondata/pull/1679 [CARBONDATA-1907] Avoid unnecessary logging to improve query performance for no dictionary non string columns Changes done to return null for no-dictionary columns of non-string data types when the data is empty. This is done to avoid excessive logging, which is impacting the query performance. - [ ] Any interfaces changed? No - [ ] Any backward compatibility impacted? No - [ ] Document update required? No - [ ] Testing done Added UT - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishgupta88/carbondata empty_byte_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1679.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1679 commit f5690380ec6d0abaee89acd0bb8fda23eb7f3a33 Author: manishgupta88 <tomanishgupt...@gmail.com> Date: 2017-12-19T11:57:30Z Changes done to return null in case of no dictionary column for non string data types when data is empty. This is done to avoid excessive logging which is impacting the query performance. ---
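The change described above can be sketched roughly as follows (illustrative names and a single data type; the actual CarbonData code differs): an empty byte array is mapped to null up front instead of attempting a parse that fails and logs on every row.

```java
import java.nio.charset.StandardCharsets;

public class NoDictValueSketch {
  // Empty bytes represent a null value for a no-dictionary, non-string
  // column: return null directly, with no parse attempt and no per-row log.
  static Integer toIntOrNull(byte[] bytes) {
    if (bytes == null || bytes.length == 0) {
      return null;
    }
    return Integer.valueOf(new String(bytes, StandardCharsets.UTF_8).trim());
  }
}
```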
[GitHub] carbondata issue #1666: [CARBONDATA-1900][Core,processing] Modify loadmetada...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1666 LGTM ---
[GitHub] carbondata issue #1126: [CARBONDATA-1258] CarbonData should not allow loadin...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1126 Correct the PR header also...remove dots from the end and put a proper message ---
[GitHub] carbondata pull request #1126: [CARBONDATA-1258] CarbonData should not allow...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1126#discussion_r157700973

--- Diff: core/src/main/java/org/apache/carbondata/core/keygenerator/directdictionary/timestamp/DateDirectDictionaryGenerator.java ---
@@ -42,12 +42,37 @@
   private String dateFormat;

+  /**
+   * min value supported for date type column
+   */
+  private static final long MIN_VALUE;
+  /**
+   * max value supported for date type column
+   */
+  private static final long MAX_VALUE;
   /**
    * Logger instance
    */
   private static final LogService LOGGER =
       LogServiceFactory.getLogService(DateDirectDictionaryGenerator.class.getName());
+  static {
+    SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
+    df.setTimeZone(TimeZone.getTimeZone("GMT"));
+    long minValue = 0;
+    long maxValue = 0;
+    try {
+      minValue = df.parse("0001-01-01").getTime();
+      maxValue = df.parse("9999-12-31").getTime();
+    } catch (ParseException e) {

--- End diff --

As you are defining the date format yourself and parsing it, ParseException will not be thrown. But as we need to handle this exception, you can just put a warning logger in the catch block and take out the remaining logic. If we don't get any findbugs warning for an empty catch block then it is better to just put a comment in the catch block and not do anything ---
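The static-initializer pattern under discussion can be reproduced standalone: with a fixed pattern and hard-coded literals the ParseException branch is unreachable, which is why an empty catch block with only a comment suffices. The `yyyy-MM-dd` pattern and `9999-12-31` upper bound here are assumptions consistent with the `0001-01-01` lower bound quoted above.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class DateBoundsSketch {
  static final long MIN_VALUE;
  static final long MAX_VALUE;

  static {
    // Fixed, self-defined format; GMT so the bounds are timezone-independent.
    SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
    df.setTimeZone(TimeZone.getTimeZone("GMT"));
    long minValue = 0;
    long maxValue = 0;
    try {
      minValue = df.parse("0001-01-01").getTime();
      maxValue = df.parse("9999-12-31").getTime();
    } catch (ParseException e) {
      // Unreachable: both literals match the fixed pattern above.
    }
    MIN_VALUE = minValue;
    MAX_VALUE = maxValue;
  }
}
```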
[GitHub] carbondata issue #1665: [CARBONDATA-1884] Add CTAS support to carbondata
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1665 retest this please ---
[GitHub] carbondata pull request #1665: [CARBONDATA-1884] Add CTAS support to carbond...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1665#discussion_r157405351

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonMetaStore.scala ---
@@ -144,6 +145,15 @@ trait CarbonMetaStore {

   def getThriftTableInfo(tablePath: CarbonTablePath)(sparkSession: SparkSession): TableInfo

   def getTableFromMetadataCache(database: String, tableName: String): Option[CarbonTable]
+
+  def getCarbonDataSourceHadoopRelation(sparkSession: SparkSession,
+      tableIdentifier: TableIdentifier): CarbonDatasourceHadoopRelation
+
+  def getSchemaFromUnresolvedRelation(sparkSession: SparkSession,

--- End diff --

This method is called from the parser, and in my opinion this logic for retrieving the schema from an unresolved relation should not reside in the parser. Please let me know if you have a different perspective ---
[GitHub] carbondata issue #1311: [CARBONDATA-1439] Wrong Error message shown for Bad ...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1311 retest this please ---
[GitHub] carbondata issue #1641: [CARBONDATA-1882] select with group by and insertove...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1641 retest sdv please ---
[GitHub] carbondata issue #1167: [CARBONDATA-1304] [IUD Bug] Iud with single pass
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1167 LGTM ---
[GitHub] carbondata issue #1653: [CARBONDATA-1893] Data load with multiple QUOTECHAR ...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1653 LGTM ---
[GitHub] carbondata issue #1665: [CARBONDATA-1884] Add CTAS support to carbondata
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1665 retest this please ---
[GitHub] carbondata pull request #1665: [CARBONDATA-1884] Add CTAS support to carbond...
GitHub user manishgupta88 opened a pull request: https://github.com/apache/carbondata/pull/1665 [CARBONDATA-1884] Add CTAS support to carbondata Implemented CTAS feature in carbondata. This will help to create a carbon table from other parquet/orc tables. - [ ] Any interfaces changed? New DDL has been introduced. Syntax: **CREATE TABLE [IF NOT EXISTS] [db_name.]table_name stored by 'carbondata' AS select_statement** - [ ] Any backward compatibility impacted? No - [ ] Document update required? Yes - [ ] Testing done Yes, added 12 functional test cases covering various CTAS scenarios - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishgupta88/carbondata CTAS_implementation Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1665.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1665 commit 1dc3df29d92d2cddcd73f6ea0c26140bd4a70ad6 Author: manishgupta88 <tomanishgupt...@gmail.com> Date: 2017-12-13T16:10:19Z Added code for implementing CTAS ---
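A minimal sketch of assembling the new DDL from the syntax stated above; the database and table names are placeholders, not tables from the PR.

```java
public class CtasSketch {
  // Builds a CTAS statement following the syntax in the PR description:
  // CREATE TABLE [IF NOT EXISTS] [db_name.]table_name stored by 'carbondata' AS select_statement
  static String ctas(String db, String table, String select) {
    return "CREATE TABLE IF NOT EXISTS " + db + "." + table
        + " STORED BY 'carbondata' AS " + select;
  }
}
```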
[GitHub] carbondata pull request #1167: [CARBONDATA-1304] [IUD BuggFix] Iud with sing...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1167#discussion_r157133954

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/TestUpdateCarbonTableWithPersistFalse.scala ---
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.spark.testsuite.iud
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.common.constants.LoggerAction
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+class TestUpdateCarbonTableWithPersistFalse extends QueryTest with BeforeAndAfterAll {

--- End diff --

Don't add a new test case. Add this test case in the existing update table test case class ---
[GitHub] carbondata pull request #1167: [CARBONDATA-1304] [IUD BuggFix] Iud with sing...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1167#discussion_r157133997

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/TestUpdateCarbonTableWithSinglePass.scala ---
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.spark.testsuite.iud
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.common.constants.LoggerAction
+import org.apache.carbondata.core.constants.{CarbonCommonConstants, CarbonLoadOptionConstants}
+import org.apache.carbondata.core.util.CarbonProperties
+
+class TestUpdateCarbonTableWithSinglePass extends QueryTest with BeforeAndAfterAll {

--- End diff --

Don't add a new test case. Add this test case in the existing update table test case class ---
[GitHub] carbondata pull request #1167: [CARBONDATA-1304] [IUD BuggFix] Iud with sing...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1167#discussion_r157134388

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala ---
@@ -314,14 +315,19 @@ case class CarbonLoadDataCommand(
     } else {
       None
     }
+    var loadDataFrame = dataFrame

--- End diff --

Modify the code to make the variable a val:

val loadDataFrame = if (updateModel.isDefined) {
  Some(getDataFrameWithTupleID)
} else {
  dataFrame
}
---
[GitHub] carbondata issue #1663: [CARBONDATA-1897] remove column group in describe co...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1663 LGTM ---
[GitHub] carbondata pull request #1653: [CARBONDATA-1893] Data load with multiple QUO...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1653#discussion_r157009066

--- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala ---
@@ -871,6 +871,32 @@ abstract class CarbonDDLSqlParser extends AbstractCarbonSparkSQLParser {
       throw new MalformedCarbonCommandException(errorMessage)
     }

+    // Validate QUOTECHAR length
+    if (options.exists(_._1.equalsIgnoreCase("QUOTECHAR"))) {
+      val quoteChar: String = options.get("quotechar").get.head._2
+      if (quoteChar.length > 1 && !isValidEscapeSequence(quoteChar)) {

--- End diff --

No need to validate quote character for a valid escape sequence ---
[GitHub] carbondata pull request #1653: [CARBONDATA-1893] Data load with multiple QUO...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1653#discussion_r157009247

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadOptions.scala ---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.dataload
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
+
+class TestLoadOptions extends QueryTest with BeforeAndAfterAll {
+
+  override def beforeAll {
+    sql("drop table if exists TestLoadTableOptions")
+    sql("CREATE table TestLoadTableOptions (ID int, date String, country String, name String," +
+        "phonetype String, serialname String, salary int) stored by 'org.apache.carbondata.format'")
+  }
+
+  override def afterAll {
+    sql("drop table if exists TestLoadTableOptions")
+  }
+
+  test("test load data with more than one char in quotechar option") {
+    val errorMessage = intercept[MalformedCarbonCommandException] {
+      sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/dataretention1.csv' INTO TABLE " +
+          s"TestLoadTableOptions OPTIONS('QUOTECHAR'='')")
+    }.getMessage
+    assert(errorMessage.equals("Quotation cannot be more than one character."))
+  }
+
+  test("test load data with more than one char in commentchar option") {
+    val errorMessage = intercept[MalformedCarbonCommandException] {
+      sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/dataretention1.csv' INTO TABLE " +
+          s"TestLoadTableOptions OPTIONS('COMMENTCHAR'='##')")
+      assert(false)
+    }.getMessage
+    assert(errorMessage.equals("Comment marker cannot be more than one character."))
+  }
+
+  test("test load data with more than one char in escapechar option") {
+    val errorMessage = intercept[MalformedCarbonCommandException] {
+      sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/dataretention1.csv' INTO TABLE " +
+          s"TestLoadTableOptions OPTIONS('ESCAPECHAR'='')")
+      assert(false)
+    }.getMessage
+    assert(errorMessage.equals("Escape character cannot be more than one character."))
+  }
+
+  test("test load data with invalid escape sequence in quotechar option") {
+    val errorMessage = intercept[MalformedCarbonCommandException] {
+      sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/dataretention1.csv' INTO TABLE " +
+          s"TestLoadTableOptions OPTIONS('QUOTECHAR'='\\y')")
+    }.getMessage
+    assert(errorMessage.equals("Quotation cannot be more than one character."))
+  }
+
+  test("test load data with with valid escape sequence in quotechar option") {
+    sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/dataretention1.csv' INTO TABLE " +
+        s"TestLoadTableOptions OPTIONS('QUOTECHAR'='\\n')")

--- End diff --

Replace quotechar with escapechar in above 2 test cases ---
[GitHub] carbondata pull request #1653: [CARBONDATA-1893] Data load with multiple QUO...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1653#discussion_r156958685 --- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala --- @@ -932,6 +958,11 @@ abstract class CarbonDDLSqlParser extends AbstractCarbonSparkSQLParser { } } + def isEscapeSequence(loadOption: String): Boolean = { --- End diff -- Rename it to isValidEscapeSequence and also add one valid test case for validating a correct escape sequence ---
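The validation the reviewer asks for might look like this sketch; the allow-list of escape sequences is an assumption for illustration, not the set CarbonData actually accepts.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class LoadOptionValidation {
  // Assumed allow-list of escape-sequence tokens (backslash + letter).
  private static final Set<String> VALID_ESCAPES =
      new HashSet<>(Arrays.asList("\\n", "\\t", "\\r", "\\b"));

  static boolean isValidEscapeSequence(String opt) {
    return VALID_ESCAPES.contains(opt);
  }

  // A load option such as QUOTECHAR/ESCAPECHAR may be a single character,
  // or longer only when it is a recognized escape sequence like \n or \t.
  static boolean isValidOneCharOption(String opt) {
    return opt.length() <= 1 || isValidEscapeSequence(opt);
  }
}
```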
[GitHub] carbondata pull request #1632: [CARBONDATA-1839] [DataLoad]Fix bugs in compr...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1632#discussion_r156954293

--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/DataLoadProcessBuilderOnSpark.scala ---
@@ -121,17 +121,18 @@ object DataLoadProcessBuilderOnSpark {
         CarbonProperties.getInstance().getGlobalSortRddStorageLevel()))
     }

+    val sortStepRowConverter: SortStepRowHandler = new SortStepRowHandler(sortParameters)
     import scala.reflect.classTag
+
+    // 3. sort
     val sortRDD = convertRDD
-      .sortBy(_.getData, numPartitions = numPartitions)(RowOrdering, classTag[Array[AnyRef]])
-      .mapPartitionsWithIndex { case (index, rows) =>
-        DataLoadProcessorStepOnSpark.convertTo3Parts(rows, index, modelBroadcast,
-          sortStepRowCounter)
-      }
+      .map(r => DataLoadProcessorStepOnSpark.convertTo3Parts(r, TaskContext.getPartitionId(),
+        modelBroadcast, sortStepRowConverter, sortStepRowCounter))
+      .sortBy(r => r.getData, numPartitions = numPartitions)(RowOrdering, classTag[Array[AnyRef]])

--- End diff --

@xuchuanyin ... This PR is for compressing sort temp files but this code modification is for data load using global sort flow which does not involve creation of sort temp files. Can you please clarify? ---
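The ordering question raised in this diff can be illustrated with plain collections: converting rows before the sort means the sort key is taken from the converted row, whereas the previous flow sorted the raw rows and converted afterwards. `convert` below is a trivial stand-in for `convertTo3Parts`, chosen only to make the difference visible.

```java
import java.util.List;
import java.util.stream.Collectors;

public class SortOrderSketch {
  // Stand-in for convertTo3Parts: any conversion that changes sort order.
  static int convert(int x) {
    return -x;
  }

  // New flow in the diff: convert each row first, then sort by converted rows.
  static List<Integer> newFlow(List<Integer> xs) {
    return xs.stream().map(SortOrderSketch::convert)
        .sorted().collect(Collectors.toList());
  }

  // Old flow: sort raw rows first, then convert per partition.
  static List<Integer> oldFlow(List<Integer> xs) {
    return xs.stream().sorted()
        .map(SortOrderSketch::convert).collect(Collectors.toList());
  }
}
```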
[GitHub] carbondata pull request #1632: [CARBONDATA-1839] [DataLoad]Fix bugs in compr...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1632#discussion_r156919460
--- Diff: core/src/main/java/org/apache/carbondata/core/util/NonDictionaryUtil.java ---
@@ -108,60 +105,21 @@ public static Object getMeasure(int index, Object[] row) {
     return measures[index];
   }
-  public static byte[] getByteArrayForNoDictionaryCols(Object[] row) {
-
-    return (byte[]) row[WriteStepRowUtil.NO_DICTIONARY_AND_COMPLEX];
+  /**
+   * Method to get the required non-dictionary & complex columns from a 3-parted row
+   * @param index
+   * @param row
+   * @return
+   */
+  public static byte[] getNonDictOrComplex(int index, Object[] row) {
--- End diff --

Rename the method to getNoDictOrComplex
---
[GitHub] carbondata issue #1311: [CARBONDATA-1439] Wrong Error message shown for Bad ...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1311 @shivangi1015 ...kindly rebase this PR..it has merge conflicts ---
[GitHub] carbondata issue #1655: [CARBONDATA-1894] Add compactionType Parameter to co...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1655 LGTM ---
[GitHub] carbondata pull request #1653: [CARBONDATA-1893] Data load with multiple QUO...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1653#discussion_r156888264
--- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala ---
@@ -871,6 +871,32 @@ abstract class CarbonDDLSqlParser extends AbstractCarbonSparkSQLParser {
       throw new MalformedCarbonCommandException(errorMessage)
     }
+    // Validate QUOTECHAR length
+    if (options.exists(_._1.equalsIgnoreCase("QUOTECHAR"))) {
+      val quoteChar: String = options.get("quotechar").get.head._2
+      if (quoteChar.length > 1) {
+        throw new MalformedCarbonCommandException("Quotation cannot be more than one character.")
+      }
+    }
+
+    // Validate COMMENTCHAR length
+    if (options.exists(_._1.equalsIgnoreCase("COMMENTCHAR"))) {
+      val commentChar: String = options.get("commentchar").get.head._2
+      if (commentChar.length > 1) {
+        throw new MalformedCarbonCommandException(
+          "Comment marker cannot be more than one character.")
+      }
+    }
+
+    // Validate ESCAPECHAR length
+    if (options.exists(_._1.equalsIgnoreCase("ESCAPECHAR"))) {
+      val escapechar: String = options.get("escapechar").get.head._2
+      if (escapechar.length > 1) {
--- End diff --

\n, \t are valid escape sequences... add a method to validate the valid and allowed escape sequences
---
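The review asks for a validation helper that rejects multi-character ESCAPECHAR values unless they form a legal escape sequence such as \n or \t. A minimal sketch of that idea — the class name and the exact whitelist are illustrative assumptions, not CarbonData's actual implementation:

```java
import java.util.Set;

public class EscapeCharValidator {
    // Assumed whitelist for illustration; \n and \t are the sequences the
    // reviewer explicitly calls out as valid.
    private static final Set<String> VALID_ESCAPE_SEQUENCES =
        Set.of("\\n", "\\t", "\\r", "\\b", "\\\\");

    // A single-character option value is always allowed; anything longer
    // must be one of the whitelisted two-character escape sequences.
    public static boolean isValidEscapeSequence(String loadOption) {
        return loadOption.length() == 1
            || VALID_ESCAPE_SEQUENCES.contains(loadOption);
    }
}
```

With a check like this, the length-based rejection above would only fire for values that are neither a single character nor a recognized escape sequence.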
[GitHub] carbondata issue #1637: [CARBONDATA-1876]clean all the InProgress segments f...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1637 retest sdv please ---
[GitHub] carbondata issue #1643: [CARBONDATA-1883] Improvement in merge index code
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1643 LGTM ---
[GitHub] carbondata issue #1622: [CARBONDATA-1865] Remove unnecessary condition to ch...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1622 LGTM ---
[GitHub] carbondata issue #1646: [CARBONDATA-1886] Delete stale segment folders on ne...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1646 LGTM ---
[GitHub] carbondata issue #1627: [CARBONDATA-1759]make visibility of segments as fals...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1627 LGTM ---
[GitHub] carbondata issue #1630: [CARBONDATA-1826] Carbon 1.3.0 - Spark 2.2: Describe...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1630 LGTM ---
[GitHub] carbondata pull request #1643: [CARBONDATA-1883] Improvement in merge index ...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1643#discussion_r156416528
--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAlterTableCompactionCommand.scala ---
@@ -199,22 +199,34 @@ case class CarbonAlterTableCompactionCommand(
             carbonTable.getAbsoluteTableIdentifier).asScala,
           carbonLoadModel.getTablePath, carbonTable, true)
-        lock.unlock()
-        return
+
+        // trigger event for merge index
+        val operationContext = new OperationContext
+        val alterTableCompactionPreEvent: AlterTableCompactionPreEvent =
+          AlterTableCompactionPreEvent(sqlContext.sparkSession,
+            carbonTable,
+            null,
+            "")
+        OperationListenerBus.getInstance
+          .fireEvent(alterTableCompactionPreEvent, operationContext)
+
+      } else {
+        CarbonDataRDDFactory.startCompactionThreads(
+          sqlContext,
+          carbonLoadModel,
+          storeLocation,
+          compactionModel,
+          lock,
+          operationContext
+        )
       }
-      CarbonDataRDDFactory.startCompactionThreads(
-        sqlContext,
-        carbonLoadModel,
-        storeLocation,
-        compactionModel,
-        lock,
-        operationContext
-      )
     } catch {
       case e: Exception =>
         LOGGER.error(s"Exception in start compaction thread. ${ e.getMessage }")
         lock.unlock()
--- End diff --

Remove lock.unlock() from the catch block
---
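The point behind this review comment is the standard locking discipline: release the lock in exactly one place, a finally block, rather than also in the catch clause, where it can be released twice (or skipped if the catch rethrows first). A generic sketch of the pattern, not the actual CarbonData code:

```java
import java.util.concurrent.locks.ReentrantLock;

public class CompactionLockDemo {
    public static final ReentrantLock LOCK = new ReentrantLock();

    public static void runWithLock(Runnable body) {
        LOCK.lock();
        try {
            body.run();
        } catch (RuntimeException e) {
            // Log and rethrow only; no LOCK.unlock() here -- the finally
            // block below is the single release point for both the success
            // and the failure path.
            System.err.println("Exception in compaction thread: " + e.getMessage());
            throw e;
        } finally {
            LOCK.unlock();
        }
    }
}
```

Calling `LOCK.unlock()` in both the catch block and the finally block would throw IllegalMonitorStateException on the second release, which is exactly the bug the comment is guarding against.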
[GitHub] carbondata pull request #1630: [CARBONDATA-1826] Carbon 1.3.0 - Spark 2.2: D...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1630#discussion_r156305867
--- Diff: integration/spark-common/src/main/scala/org/apache/spark/util/CarbonReflectionUtils.scala ---
@@ -175,6 +175,15 @@ object CarbonReflectionUtils {
     }
   }
+
+  def getDescribeTableFormattedField[T: TypeTag : reflect.ClassTag](obj: T): Boolean = {
+    var isFormatted: Boolean = false
--- End diff --

make it val
---
[GitHub] carbondata pull request #1627: [CARBONDATA-1759]make visibility of segments ...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1627#discussion_r156275385
--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/DeleteLoadFolders.java ---
@@ -122,26 +122,21 @@ private static boolean checkIfLoadCanBeDeleted(LoadMetadataDetails oneLoad,
     return false;
   }
-  public static boolean deleteLoadFoldersFromFileSystem(
+  public static void deleteLoadFoldersFromFileSystem(
       AbsoluteTableIdentifier absoluteTableIdentifier, boolean isForceDelete,
       LoadMetadataDetails[] details) {
-    boolean isDeleted = false;
     if (details != null && details.length != 0) {
       for (LoadMetadataDetails oneLoad : details) {
         if (checkIfLoadCanBeDeleted(oneLoad, isForceDelete)) {
           String path = getSegmentPath(absoluteTableIdentifier, 0, oneLoad);
-          boolean deletionStatus = physicalFactAndMeasureMetadataDeletion(path);
-          if (deletionStatus) {
-            isDeleted = true;
-            oneLoad.setVisibility("false");
-            LOGGER.info("Info: Deleted the load " + oneLoad.getLoadName());
-          }
+          physicalFactAndMeasureMetadataDeletion(path);
--- End diff --

While handling deletion in this method, return true in case the file does not exist and add a warning logger. As we don't have any other mechanism for clean up, it will be good if we keep all the remaining code the same and only set the status as true in case of non-existence of the file
---
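What the reviewer proposes is that the physical-deletion step keep returning a status, but treat an already-missing segment path as success (with a warning) so the caller can still flip the load's visibility to "false". A hypothetical standalone sketch of that behavior — class and method names are illustrative, not the DeleteLoadFolders code:

```java
import java.io.File;

public class StaleSegmentCleaner {
    public static boolean physicalDeletion(String path) {
        File dir = new File(path);
        if (!dir.exists()) {
            // Reviewer's suggestion: a missing path is not a failure --
            // warn and report success so visibility can still be updated.
            System.err.println("WARN: segment path " + path
                + " does not exist; treating as already cleaned");
            return true;
        }
        // Delete the segment's files, then the folder itself.
        File[] children = dir.listFiles();
        if (children != null) {
            for (File child : children) {
                child.delete();
            }
        }
        return dir.delete();
    }
}
```

Keeping the boolean return preserves the existing `setVisibility("false")` / logging flow in the caller, which is the "keep all the remaining code the same" part of the comment.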
[GitHub] carbondata issue #1625: [CARBONDATA-1869] Null pointer exception thrown when...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1625 LGTM ---
[GitHub] carbondata pull request #1630: [CARBONDATA-1826] Carbon 1.3.0 - Spark 2.2: D...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1630#discussion_r156134688
--- Diff: integration/spark-common/src/main/scala/org/apache/spark/util/CarbonReflectionUtils.scala ---
@@ -175,6 +175,17 @@ object CarbonReflectionUtils {
     }
   }
+
+  def getDescribeTableFormattedField[T: TypeTag : reflect.ClassTag](obj: T): Boolean = {
+    var isFormatted: Boolean = false
+    val im = rm.reflect(obj)
+    for (m <- typeOf[T].members.filter(!_.isMethod)) {
+      if (m.toString.contains("isFormatted")) {
+        isFormatted = im.reflectField(m.asTerm).get.asInstanceOf[Boolean]
--- End diff --

Once the condition is satisfied, exit the loop. Also, can you try to use the find method here instead of a for loop?
---
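The reviewer's suggestion — stop at the first match and drop the mutable flag — maps onto a short-circuiting search over the reflected fields. A Java analogue of Scala's `find` using streams; the names here are illustrative stand-ins, not CarbonData or Spark code:

```java
import java.lang.reflect.Field;
import java.util.Arrays;

public class DescribeFieldReader {
    // findFirst() short-circuits after the first "isFormatted" field, so no
    // mutable flag and no iteration past the match.
    public static boolean getFormattedFlag(Object obj) {
        return Arrays.stream(obj.getClass().getDeclaredFields())
            .filter(f -> f.getName().contains("isFormatted"))
            .findFirst()
            .map(f -> {
                try {
                    f.setAccessible(true);
                    return (Boolean) f.get(obj);
                } catch (IllegalAccessException e) {
                    return false;
                }
            })
            .orElse(false);
    }

    // Hypothetical stand-in for a describe command carrying an isFormatted flag.
    public static class DescribeCmd {
        public final boolean isFormatted;
        public DescribeCmd(boolean isFormatted) { this.isFormatted = isFormatted; }
    }
}
```

In the Scala original the same shape would be `typeOf[T].members.filter(!_.isMethod).find(...).exists(...)`, which also lets `isFormatted` become a `val`.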
[GitHub] carbondata issue #1624: [CARBONDATA-1867] Add support for task/segment level...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1624 @ravipesala ..handled review comments..kindly review and merge ---
[GitHub] carbondata pull request #1624: [WIP][CARBONDATA-1867] Add support for task/s...
GitHub user manishgupta88 opened a pull request: https://github.com/apache/carbondata/pull/1624 [WIP][CARBONDATA-1867] Add support for task/segment level pruning Added support for task/segment level pruning. Added code to compute task level min/max which can be helpful for task/segment level pruning Note: Dependent on PR #1619 ..Need to be rebased once PR 1619 is merged. - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishgupta88/carbondata add_task_level_min_max Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1624.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1624 commit 5ada233b0e96baf9864407d6c9364495315ca6bc Author: manishgupta88 <tomanishgupt...@gmail.com> Date: 2017-12-06T11:41:51Z Added support for task/segment level pruning. Added code to compute task level min/max which can be helpful for task/segment level pruning ---
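The core idea behind task/segment-level min/max pruning can be sketched independently of CarbonData's internals: if a min/max pair is maintained per task, an equality filter can skip every task whose [min, max] interval cannot contain the filter value. The following is an illustrative model only (class names and the unsigned byte comparison are assumptions, not the PR's actual code):

```java
public class MinMaxPruner {
    public static class TaskMinMax {
        final byte[] min;
        final byte[] max;
        public TaskMinMax(byte[] min, byte[] max) { this.min = min; this.max = max; }
    }

    // Lexicographic unsigned byte comparison, the usual ordering for
    // dictionary/no-dictionary key bytes.
    private static int compare(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int c = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (c != 0) return c;
        }
        return Integer.compare(a.length, b.length);
    }

    // True means the task may contain matching rows and must be scanned;
    // false means the whole task (all its blocklets) can be pruned.
    public static boolean mightContain(TaskMinMax task, byte[] filterValue) {
        return compare(filterValue, task.min) >= 0
            && compare(filterValue, task.max) <= 0;
    }
}
```

Pruning at task level before descending into blocklets is what saves the IO: a pruned task's data files are never opened.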
[GitHub] carbondata pull request #1619: [CARBONDATA-1854] Add support for implicit co...
GitHub user manishgupta88 opened a pull request: https://github.com/apache/carbondata/pull/1619 [CARBONDATA-1854] Add support for implicit column filter Added code to support implicit column filtering - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishgupta88/carbondata implicit_filter_support Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1619.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1619 ---
[GitHub] carbondata pull request #1617: [CARBONDATA-1592] Added TaskPreSubmitExecutio...
GitHub user manishgupta88 opened a pull request: https://github.com/apache/carbondata/pull/1617 [CARBONDATA-1592] Added TaskPreSubmitExecutionEvent for any operations to be done before submitting a task Added TaskPreSubmitExecutionEvent for any operations to be done before submitting a task, and added a properties map in carbon table to hold external properties if any - [ ] Any interfaces changed? No - [ ] Any backward compatibility impacted? No - [ ] Document update required? No - [ ] Testing done - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/manishgupta88/carbondata preExecutorSubmitEvent Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1617.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1617 ---
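The pattern the PR extends is a listener bus that fires an event to registered listeners before a task is submitted. A minimal, self-contained sketch of that shape — all names here are hypothetical (the real CarbonData classes are OperationListenerBus and its Event hierarchy, whose signatures are not reproduced here):

```java
import java.util.ArrayList;
import java.util.List;

public class EventBusDemo {
    public interface Listener { void onEvent(String event); }

    private final List<Listener> listeners = new ArrayList<>();
    public final List<String> log = new ArrayList<>();

    public void register(Listener l) { listeners.add(l); }

    // Fire a "task pre-submit" style event to every registered listener
    // before the task body actually runs.
    public void submitTask(String taskName, Runnable task) {
        for (Listener l : listeners) {
            l.onEvent("TaskPreSubmitExecutionEvent:" + taskName);
        }
        task.run();
    }
}
```

The value of the bus is decoupling: the submitting code never needs to know which pre-submit operations exist, only that listeners get a chance to run first.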
[GitHub] carbondata issue #1574: [CARBONDATA-1816] Changing BAD_RECORDS_ACTION defaul...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1574 LGTM ---
[GitHub] carbondata issue #1561: [CARBONDATA-1802] Alter query fails if a column is d...
Github user manishgupta88 commented on the issue: https://github.com/apache/carbondata/pull/1561 LGTM ---