[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3938: [CARBONDATA-3996]Fixed show table extended like command exception
CarbonDataQA1 commented on pull request #3938: URL: https://github.com/apache/carbondata/pull/3938#issuecomment-695088906 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4129/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3938: [CARBONDATA-3996]Fixed show table extended like command exception
CarbonDataQA1 commented on pull request #3938: URL: https://github.com/apache/carbondata/pull/3938#issuecomment-695087814 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2388/
[GitHub] [carbondata] VenuReddy2103 opened a new pull request #3938: [CARBONDATA-3996]Fixed show table extended like command exception
VenuReddy2103 opened a new pull request #3938: URL: https://github.com/apache/carbondata/pull/3938

### Why is this PR needed?
Show table extended like command throws java.lang.ArrayIndexOutOfBoundsException

### What changes were proposed in this PR?
Instead of CarbonData forming the output rows itself, we can call `showTablesCommand.run(sparkSession)` to get the output rows for the tables and then filter out the rows corresponding to MV tables.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- Yes
[jira] [Created] (CARBONDATA-3996) Show table extended like command throws java.lang.ArrayIndexOutOfBoundsException
Venugopal Reddy K created CARBONDATA-3996:
Summary: Show table extended like command throws java.lang.ArrayIndexOutOfBoundsException
Key: CARBONDATA-3996
URL: https://issues.apache.org/jira/browse/CARBONDATA-3996
Project: CarbonData
Issue Type: Bug
Components: spark-integration
Affects Versions: 2.0.0
Reporter: Venugopal Reddy K
Fix For: 2.1.0

*Issue:* Show table extended like command throws java.lang.ArrayIndexOutOfBoundsException

*Steps to reproduce:*
{code:java}
spark.sql("create table employee(id string, name string) stored as carbondata")
spark.sql("show table extended like 'emp*'").show(100, false)
{code}

*Exception stack:*
{code:java}
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 3
 at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.genericGet(rows.scala:201)
 at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getAs(rows.scala:35)
 at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getUTF8String(rows.scala:46)
 at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getUTF8String(rows.scala:195)
 at org.apache.spark.sql.catalyst.InternalRow$$anonfun$getAccessor$8.apply(InternalRow.scala:136)
 at org.apache.spark.sql.catalyst.InternalRow$$anonfun$getAccessor$8.apply(InternalRow.scala:136)
 at org.apache.spark.sql.catalyst.expressions.BoundReference.eval(BoundAttribute.scala:44)
 at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:389)
 at org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:152)
 at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:92)
 at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24$$anonfun$applyOrElse$23.apply(Optimizer.scala:1364)
 at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24$$anonfun$applyOrElse$23.apply(Optimizer.scala:1364)
 at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
 at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
 at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
 at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
 at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
 at scala.collection.AbstractTraversable.map(Traversable.scala:104)
 at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24.applyOrElse(Optimizer.scala:1364)
 at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24.applyOrElse(Optimizer.scala:1359)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:258)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:258)
 at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
 at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:257)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.transformDown(AnalysisHelper.scala:149)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:263)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:263)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:328)
 at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:186)
 at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:326)
 at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:263)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.transformDown(AnalysisHelper.scala:149)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:263)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:263)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:328)
{code}
[jira] [Resolved] (CARBONDATA-3995) Support presto querying older complex type stores
[ https://issues.apache.org/jira/browse/CARBONDATA-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kunal Kapoor resolved CARBONDATA-3995.
Fix Version/s: 2.1.0
Resolution: Fixed

> Support presto querying older complex type stores
>
> Key: CARBONDATA-3995
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3995
> Project: CarbonData
> Issue Type: Bug
> Reporter: Ajantha Bhat
> Assignee: Ajantha Bhat
> Priority: Major
> Fix For: 2.1.0
>
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> Before carbon 2.0, the complex child length was stored as SHORT for the string,
> varchar, binary, date and decimal types. Since 2.0 it is stored as INT, the presto
> complex query code always assumes INT and hits an out-of-bounds exception
> when an old store is queried.
>
> Fix: if the INT_LENGTH_COMPLEX_CHILD_BYTE_ARRAY encoding is present, parse the
> length as INT, else parse it as SHORT, so that both stores can be queried.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
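The backward-compatible length parsing described in the resolution above can be sketched as follows. This is a minimal illustration, not CarbonData's actual page-decoding code: `readChildLength` and the boolean flag standing in for the INT_LENGTH_COMPLEX_CHILD_BYTE_ARRAY encoding check are hypothetical.

```java
import java.nio.ByteBuffer;

public class ComplexChildLength {
  // Hypothetical helper: reads one complex-child length from the page buffer.
  // Pre-2.0 stores wrote child lengths as SHORT for string, varchar, binary,
  // date and decimal children; 2.0+ stores (flagged by the
  // INT_LENGTH_COMPLEX_CHILD_BYTE_ARRAY encoding) write INT.
  static int readChildLength(ByteBuffer buffer, boolean intLengthEncoded) {
    // Reading 4 bytes from an old store would run past the short value and
    // misalign the buffer - hence the encoding-based switch.
    return intLengthEncoded ? buffer.getInt() : buffer.getShort();
  }

  public static void main(String[] args) {
    ByteBuffer oldStore = ByteBuffer.allocate(2).putShort((short) 5);
    oldStore.flip();
    ByteBuffer newStore = ByteBuffer.allocate(4).putInt(5);
    newStore.flip();
    System.out.println(readChildLength(oldStore, false)); // 5
    System.out.println(readChildLength(newStore, true));  // 5
  }
}
```

Both store formats then decode to the same logical length, which is what lets one reader path serve old and new stores.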
[GitHub] [carbondata] asfgit closed pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores
asfgit closed pull request #3937: URL: https://github.com/apache/carbondata/pull/3937
[GitHub] [carbondata] kunal642 commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores
kunal642 commented on pull request #3937: URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694935778 LGTM
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…
CarbonDataQA1 commented on pull request #3930: URL: https://github.com/apache/carbondata/pull/3930#issuecomment-694926222 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2387/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…
CarbonDataQA1 commented on pull request #3930: URL: https://github.com/apache/carbondata/pull/3930#issuecomment-694925178 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4128/
[GitHub] [carbondata] ajantha-bhat commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
ajantha-bhat commented on pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694913880 retest this please
[GitHub] [carbondata] ajantha-bhat commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
ajantha-bhat commented on pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694913664 The one test failure on Spark 2.3.4 is, I presume, a random issue, as these changes are unrelated and the build passed before the comments were added. We need to track this and fix it.
[GitHub] [carbondata] asfgit closed pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
asfgit closed pull request #3936: URL: https://github.com/apache/carbondata/pull/3936
[GitHub] [carbondata] ajantha-bhat commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
ajantha-bhat commented on pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694895620 LGTM
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
CarbonDataQA1 commented on pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694890983 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2386/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
CarbonDataQA1 commented on pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694890026 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4127/
[GitHub] [carbondata] akashrn5 commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores
akashrn5 commented on pull request #3937: URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694878769 LGTM
[GitHub] [carbondata] ajantha-bhat commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores
ajantha-bhat commented on pull request #3937: URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694856042 Verified in the cluster also. Tested OK.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores
CarbonDataQA1 commented on pull request #3937: URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694825616 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2384/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores
CarbonDataQA1 commented on pull request #3937: URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694818288 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4125/
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
nihal0107 commented on a change in pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#discussion_r490872439

## File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
## @@ -303,8 +321,15 @@ public void commitDropTable(Table table, boolean b) throws MetaException {
 List tokens = new ArrayList();
 StringBuilder stack = new StringBuilder();
 int openingCount = 0;
-for (int i = 0; i < schema.length(); i++) {
- if (schema.charAt(i) == '<') {
+int length = schema.length();

Review comment: Here I have handled the schema parsing to consider comma-separated values.
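The depth-tracking parse discussed in this review can be sketched as follows. This is a minimal illustration under assumptions: `splitTopLevel` is a hypothetical stand-in for `splitSchemaStringToArray`, not the actual CarbonData implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class SchemaSplitter {
  // Split a Hive type schema string on the given separator only at nesting
  // depth 0, so separators inside array<...> / struct<...> / map<...> stay
  // attached to their parent type instead of producing bogus tokens.
  public static List<String> splitTopLevel(String schema, char separator) {
    List<String> tokens = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    int openingCount = 0; // current '<'/'>' nesting depth
    int length = schema.length();
    for (int i = 0; i < length; i++) {
      char c = schema.charAt(i);
      if (c == '<') {
        openingCount++;
      } else if (c == '>') {
        openingCount--;
      }
      if (c == separator && openingCount == 0) {
        tokens.add(current.toString());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    tokens.add(current.toString());
    return tokens;
  }

  public static void main(String[] args) {
    // The comma inside struct<a:int,b:string> is not treated as a separator.
    System.out.println(splitTopLevel("int:array<struct<a:int,b:string>>:string", ':'));
    // prints [int, array<struct<a:int,b:string>>, string]
  }
}
```

The same depth counter handles comma-separated values inside nested types, which is the case the review comment says the change covers.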
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
nihal0107 commented on a change in pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#discussion_r490843284

## File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
## @@ -171,15 +171,33 @@ public static CarbonTable getCarbonTable(Configuration tableProperties) throws S
 columns = columns + "," + partitionColumns;
 columnTypes = columnTypes + ":" + partitionColumnTypes;
 }
-String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);
-
+String[][] validatedColumnsAndTypes = validateColumnsAndTypes(columns, columnTypes);
 CarbonTable carbonTable = CarbonTable.buildFromTableInfo(
 HiveCarbonUtil.getTableInfo(tableName, databaseName, tablePath,
-sortColumns, columns.split(","), columnTypeArray, new ArrayList<>()));
+sortColumns, validatedColumnsAndTypes[0],
+validatedColumnsAndTypes[1], new ArrayList<>()));
 carbonTable.setTransactionalTable(false);
 return carbonTable;
 }
+ private static String[][] validateColumnsAndTypes(String columns, String columnTypes) {
+String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);

Review comment: In the case of an empty table, some additional columns get added in the configuration. Here I have validated whether there are any additional columns and removed them. Also added the comments.
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
nihal0107 commented on a change in pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#discussion_r490870410

## File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
## @@ -154,8 +154,16 @@ private static CarbonTable getCarbonTable(Configuration configuration, String pa
 } else {
 carbonInputFormat = new CarbonFileInputFormat<>();
 }
-List splitList =
-carbonInputFormat.getSplits(jobContext);
+List splitList;
+try {
+  splitList = carbonInputFormat.getSplits(jobContext);
+} catch (IOException ex) {
+  if (ex.getMessage().contains("No Index files are present in the table location :")) {

Review comment: External tables are not queried through this file's getSplits().
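The exception handling debated in this thread can be sketched as follows. This is a hedged illustration under assumptions: `SplitSource` and `getSplitsOrEmpty` are hypothetical stand-ins for `carbonInputFormat.getSplits(jobContext)` and the PR's actual change, not CarbonData APIs.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

public class EmptyTableSplits {
  // Hypothetical abstraction over carbonInputFormat.getSplits(jobContext).
  public interface SplitSource {
    List<String> getSplits() throws IOException;
  }

  // An empty (non-external) table has no index files yet, so the specific
  // "No Index files" IOException is translated into zero splits; any other
  // IOException (e.g. an external table on a wrong path) still fails loudly,
  // which is the reviewer's concern above.
  public static List<String> getSplitsOrEmpty(SplitSource source) throws IOException {
    try {
      return source.getSplits();
    } catch (IOException ex) {
      if (ex.getMessage() != null
          && ex.getMessage().contains("No Index files are present in the table location :")) {
        return Collections.emptyList();
      }
      throw ex;
    }
  }
}
```

Matching on the exception message text is brittle (a dedicated exception type would be safer), which is part of why the review asks to restrict the skip to non-external tables.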
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
CarbonDataQA1 commented on pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694803147 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2383/
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
nihal0107 commented on a change in pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#discussion_r490869778

## File path: integration/hive/src/test/java/org/apache/carbondata/hive/HiveTestUtils.java
## @@ -88,7 +86,6 @@ public boolean checkAnswer(ResultSet actual, ResultSet expected) throws SQLExcep
 }
 Collections.sort(expectedValuesList);
 Collections.sort(actualValuesList);
 Assert.assertArrayEquals(expectedValuesList.toArray(), actualValuesList.toArray());
-Assert.assertTrue(rowCountExpected > 0);

Review comment: Since we now also check empty tables, which may contain zero rows, I have removed this.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
CarbonDataQA1 commented on pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694802437 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4124/
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
ajantha-bhat commented on a change in pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#discussion_r490853673

## File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
## @@ -154,8 +154,16 @@ private static CarbonTable getCarbonTable(Configuration configuration, String pa
 } else {
 carbonInputFormat = new CarbonFileInputFormat<>();
 }
-List splitList =
-carbonInputFormat.getSplits(jobContext);
+List splitList;
+try {
+  splitList = carbonInputFormat.getSplits(jobContext);
+} catch (IOException ex) {
+  if (ex.getMessage().contains("No Index files are present in the table location :")) {

Review comment: So add a test for the external table case, and don't skip the exception for external tables.
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
nihal0107 commented on a change in pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#discussion_r490843284

## File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
## @@ -171,15 +171,33 @@ public static CarbonTable getCarbonTable(Configuration tableProperties) throws S
 columns = columns + "," + partitionColumns;
 columnTypes = columnTypes + ":" + partitionColumnTypes;
 }
-String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);
-
+String[][] validatedColumnsAndTypes = validateColumnsAndTypes(columns, columnTypes);
 CarbonTable carbonTable = CarbonTable.buildFromTableInfo(
 HiveCarbonUtil.getTableInfo(tableName, databaseName, tablePath,
-sortColumns, columns.split(","), columnTypeArray, new ArrayList<>()));
+sortColumns, validatedColumnsAndTypes[0],
+validatedColumnsAndTypes[1], new ArrayList<>()));
 carbonTable.setTransactionalTable(false);
 return carbonTable;
 }
+ private static String[][] validateColumnsAndTypes(String columns, String columnTypes) {
+String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);

Review comment: In the case of an empty table, some additional columns get added in the configuration. Here I have validated whether there are any additional columns and removed them.
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
ajantha-bhat commented on a change in pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#discussion_r490834840

## File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
## @@ -154,8 +154,16 @@ private static CarbonTable getCarbonTable(Configuration configuration, String pa
 } else {
 carbonInputFormat = new CarbonFileInputFormat<>();
 }
-List splitList =
-carbonInputFormat.getSplits(jobContext);
+List splitList;
+try {
+  splitList = carbonInputFormat.getSplits(jobContext);
+} catch (IOException ex) {
+  if (ex.getMessage().contains("No Index files are present in the table location :")) {

Review comment: I see that Hive supports only non-transactional tables. Non-transactional tables are often created on a path provided by the user, so when the path doesn't contain carbon files at query time, it is a must to throw an exception. If an external table is created on a wrong path, the query now gives 0 rows and the user will not know what went wrong.

## File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
## @@ -171,15 +171,33 @@ public static CarbonTable getCarbonTable(Configuration tableProperties) throws S
 columns = columns + "," + partitionColumns;
 columnTypes = columnTypes + ":" + partitionColumnTypes;
 }
-String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);
-
+String[][] validatedColumnsAndTypes = validateColumnsAndTypes(columns, columnTypes);
 CarbonTable carbonTable = CarbonTable.buildFromTableInfo(
 HiveCarbonUtil.getTableInfo(tableName, databaseName, tablePath,
-sortColumns, columns.split(","), columnTypeArray, new ArrayList<>()));
+sortColumns, validatedColumnsAndTypes[0],
+validatedColumnsAndTypes[1], new ArrayList<>()));
 carbonTable.setTransactionalTable(false);
 return carbonTable;
 }
+ private static String[][] validateColumnsAndTypes(String columns, String columnTypes) {
+String[] columnTypeArray = HiveCarbonUtil.splitSchemaStringToArray(columnTypes);

Review comment: What are these changes for? I didn't see anything about this in the description. Please add comments.

## File path: integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
## @@ -303,8 +321,15 @@ public void commitDropTable(Table table, boolean b) throws MetaException {
 List tokens = new ArrayList();
 StringBuilder stack = new StringBuilder();
 int openingCount = 0;
-for (int i = 0; i < schema.length(); i++) {
- if (schema.charAt(i) == '<') {
+int length = schema.length();

Review comment: Same comment as above.

## File path: integration/hive/src/test/java/org/apache/carbondata/hive/HiveTestUtils.java
## @@ -88,7 +86,6 @@ public boolean checkAnswer(ResultSet actual, ResultSet expected) throws SQLExcep
 }
 Collections.sort(expectedValuesList);
 Collections.sort(actualValuesList);
 Assert.assertArrayEquals(expectedValuesList.toArray(), actualValuesList.toArray());
-Assert.assertTrue(rowCountExpected > 0);

Review comment: Why was this validation removed?
[GitHub] [carbondata] ajantha-bhat commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores
ajantha-bhat commented on pull request #3937: URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694767622 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores
CarbonDataQA1 commented on pull request #3937: URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694763475 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4122/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores
CarbonDataQA1 commented on pull request #3937: URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694757377 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2381/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
CarbonDataQA1 commented on pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694744886 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4123/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.
CarbonDataQA1 commented on pull request #3936: URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694744293 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2382/
[GitHub] [carbondata] ajantha-bhat commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores
ajantha-bhat commented on pull request #3937: URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694711366 @kumarvishal09 , @kunal642 , @akashrn5 : please check and merge
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3935: [CARBONDATA-3993] Remove deletePartialLoadData in data loading process
ajantha-bhat commented on a change in pull request #3935: URL: https://github.com/apache/carbondata/pull/3935#discussion_r490760218

File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala

@@ -267,9 +266,8 @@ object CarbonDataRDDFactory {
          throw new Exception("Exception in compaction " + exception.getMessage)
        }
      } finally {
-       executor.shutdownNow()
        try {
-         compactor.deletePartialLoadsInCompaction()

Review comment: a) When compaction retries, it reuses the same segment ID, so if stale files are not cleaned it produces duplicate data. Before this change, #3934 needs to be merged so that a unique segment ID is used for the compaction retry. b) Please move the logic of `deletePartialLoadsInCompaction` into the clean files command instead of removing it permanently. If clean files does not have this logic, it may not be able to clean stale files. c) Also, if the purpose of this PR is to avoid accidental data loss, you need to handle `cleanStaleDeltaFiles` in `CarbonUpdateUtil.java` and identify other such places as well. Handling only a few places will not guarantee that data loss cannot occur.
[GitHub] [carbondata] ajantha-bhat edited a comment on pull request #3935: [CARBONDATA-3993] Remove deletePartialLoadData in data loading process
ajantha-bhat edited a comment on pull request #3935: URL: https://github.com/apache/carbondata/pull/3935#issuecomment-694708624 Agree with @Zhangshunyu and @akashrn5. a) When compaction retries, it reuses the same segment ID, so if stale files are not cleaned it produces duplicate data. Before this change, we need #3934 to be merged so that a unique segment ID is used for the compaction retry. b) Please move the logic of `deletePartialLoadsInCompaction` into the clean files command instead of permanently removing it. If clean files does not have this logic, it may not be able to clean stale files. c) Also, if the purpose of this PR is to avoid accidental data loss, you need to handle `cleanStaleDeltaFiles` in `CarbonUpdateUtil.java` and identify other such places as well. Handling in only one place will not guarantee that data loss cannot occur.
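Point b) above suggests moving the partial-load cleanup out of the load path and into the clean files command. A minimal sketch of that idea is below. This is hypothetical illustration only, not CarbonData's actual API: the class name, method names, and the `Fact/Part0/Segment_<id>` directory layout are assumptions; the real implementation works off the table status file and segment metadata.

```java
import java.io.File;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of moving partial-compaction-load cleanup into the
// clean-files path. A segment directory is treated as stale when it has no
// entry in the table status; such directories are left behind by a failed
// compaction and must be removed so a retry cannot pick up duplicate data.
public class CleanFilesSketch {

    /** Deletes segment directories not listed in the valid-segment set. */
    public static int deletePartialLoads(File tablePath, Set<String> validSegments) {
        int removed = 0;
        // Assumed legacy store layout: <tablePath>/Fact/Part0/Segment_<id>
        File[] segmentDirs = new File(tablePath, "Fact/Part0").listFiles();
        if (segmentDirs == null) {
            return 0;
        }
        for (File dir : segmentDirs) {
            // Compacted loads get ids like "0.1", so a stale retry dir is
            // named e.g. "Segment_0.1" with no table status entry.
            String segmentId = dir.getName().replace("Segment_", "");
            if (!validSegments.contains(segmentId)) {
                deleteRecursively(dir);
                removed++;
            }
        }
        return removed;
    }

    private static void deleteRecursively(File f) {
        File[] children = f.listFiles();
        if (children != null) {
            for (File c : children) {
                deleteRecursively(c);
            }
        }
        f.delete();
    }

    public static void main(String[] args) throws Exception {
        File tablePath = java.nio.file.Files.createTempDirectory("tbl").toFile();
        File part = new File(tablePath, "Fact/Part0");
        part.mkdirs();
        new File(part, "Segment_0").mkdir();
        new File(part, "Segment_0.1").mkdir(); // stale partial compaction load
        Set<String> valid = new HashSet<>(Arrays.asList("0"));
        System.out.println(deletePartialLoads(tablePath, valid)); // prints 1
    }
}
```

Running this from the clean files command, rather than on every load, matches the review's intent: the load path stays fast, while stale compaction directories still get reclaimed on demand.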
[GitHub] [carbondata] akashrn5 commented on pull request #3935: [CARBONDATA-3993] Remove deletePartialLoadData in data loading process
akashrn5 commented on pull request #3935: URL: https://github.com/apache/carbondata/pull/3935#issuecomment-694704817 During insert or load it is still fine if you don't clean up, but if you remove `deletePartialLoadsInCompaction` directly it can cause load or query failures, wrong query results, or extra data. This is because we don't write any entry into the table status before compaction, in order to maintain compaction's segment ID logic. So it is dangerous to remove this blindly. I remember @ajantha-bhat faced similar issues recently during compaction.
[jira] [Created] (CARBONDATA-3995) Support presto querying older complex type stores
Ajantha Bhat created CARBONDATA-3995:
Summary: Support presto querying older complex type stores
Key: CARBONDATA-3995
URL: https://issues.apache.org/jira/browse/CARBONDATA-3995
Project: CarbonData
Issue Type: Bug
Reporter: Ajantha Bhat
Assignee: Ajantha Bhat

Before Carbon 2.0, the complex child length was stored as SHORT for string, varchar, binary, date, and decimal types. In 2.0 it is stored as INT, so the Presto complex query code always assumes INT and hits an out-of-bounds exception when an old store is queried. Fix: if the INT_LENGTH_COMPLEX_CHILD_BYTE_ARRAY encoding is present, parse as INT; otherwise parse as SHORT, so that both stores can be queried. -- This message was sent by Atlassian Jira (v8.3.4#803005)
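The fix the issue describes, branching on the encoding to decide the length width, can be sketched as follows. This is a minimal illustration under assumptions, not the actual CarbonData/Presto reader: the encoding-flag name comes from the issue text, while the class and method names here are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: reads complex-child value lengths from a page buffer.
// Stores written before Carbon 2.0 encode each length as a 2-byte SHORT;
// 2.0+ stores carry the INT_LENGTH_COMPLEX_CHILD_BYTE_ARRAY encoding and use
// a 4-byte INT. Branching on the encoding keeps both stores queryable.
public class ComplexChildLengthReader {

    private final boolean intLengthEncoding;

    public ComplexChildLengthReader(boolean hasIntLengthComplexChildEncoding) {
        this.intLengthEncoding = hasIntLengthComplexChildEncoding;
    }

    /** Reads the next child length, advancing the buffer position. */
    public int readLength(ByteBuffer data) {
        // New (2.0+) store: 4-byte length; old store: 2-byte length.
        return intLengthEncoding ? data.getInt() : data.getShort();
    }

    public static void main(String[] args) {
        ByteBuffer oldStore = ByteBuffer.allocate(2).putShort((short) 7);
        oldStore.flip();
        ByteBuffer newStore = ByteBuffer.allocate(4).putInt(7);
        newStore.flip();

        System.out.println(new ComplexChildLengthReader(false).readLength(oldStore)); // 7
        System.out.println(new ComplexChildLengthReader(true).readLength(newStore));  // 7
    }
}
```

Reading the wrong width is exactly the reported failure mode: treating a 2-byte length as 4 bytes consumes data past the field and eventually runs the buffer out of bounds.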
[GitHub] [carbondata] ajantha-bhat opened a new pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores
ajantha-bhat opened a new pull request #3937: URL: https://github.com/apache/carbondata/pull/3937

### Why is this PR needed?
Before Carbon 2.0, the complex child length was stored as SHORT for string, varchar, binary, date, and decimal types. In 2.0 it is stored as INT, so the Presto complex query code always assumes INT and hits an out-of-bounds exception when an old store is queried.

### What changes were proposed in this PR?
If the INT_LENGTH_COMPLEX_CHILD_BYTE_ARRAY encoding is present, parse as INT; otherwise parse as SHORT, so that both stores can be queried.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- No [upgrade scenario]