[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3938: [CARBONDATA-3996]Fixed show table extended like command exception

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3938:
URL: https://github.com/apache/carbondata/pull/3938#issuecomment-695088906


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4129/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3938: [CARBONDATA-3996]Fixed show table extended like command exception

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3938:
URL: https://github.com/apache/carbondata/pull/3938#issuecomment-695087814


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2388/
   







[GitHub] [carbondata] VenuReddy2103 opened a new pull request #3938: [CARBONDATA-3996]Fixed show table extended like command exception

2020-09-18 Thread GitBox


VenuReddy2103 opened a new pull request #3938:
URL: https://github.com/apache/carbondata/pull/3938


### Why is this PR needed?
Show table extended like command throws 
java.lang.ArrayIndexOutOfBoundsException

### What changes were proposed in this PR?
   Instead of CarbonData forming the output rows itself, we can call 
`showTablesCommand.run(sparkSession)` to get the output rows for the tables, 
then filter out the rows corresponding to MV tables.
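   A rough sketch of that filtering step (illustrative only: rows are modeled 
here as plain String arrays in the shape of SHOW TABLES output; the actual fix 
operates on the Spark Row objects returned by `showTablesCommand.run(sparkSession)`, 
and the helper names below are hypothetical):

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: each row is {database, tableName, isTemporary}.
// The real implementation filters Spark Row objects, not String arrays.
public class MvRowFilter {
  public static List<String[]> dropMvTables(List<String[]> rows, Set<String> mvTableNames) {
    return rows.stream()
        .filter(row -> !mvTableNames.contains(row[1]))  // row[1] = table name
        .collect(Collectors.toList());
  }
}
```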
   
### Does this PR introduce any user interface change?
- No
   
### Is any new testcase added?
- Yes
   
   
   







[jira] [Created] (CARBONDATA-3996) Show table extended like command throws java.lang.ArrayIndexOutOfBoundsException

2020-09-18 Thread Venugopal Reddy K (Jira)
Venugopal Reddy K created CARBONDATA-3996:
-

 Summary: Show table extended like command throws 
java.lang.ArrayIndexOutOfBoundsException
 Key: CARBONDATA-3996
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3996
 Project: CarbonData
  Issue Type: Bug
  Components: spark-integration
Affects Versions: 2.0.0
Reporter: Venugopal Reddy K
 Fix For: 2.1.0


*Issue:*

Show table extended like command throws java.lang.ArrayIndexOutOfBoundsException

*Steps to reproduce:*

spark.sql("create table employee(id string, name string) stored as carbondata")
spark.sql("show table extended like 'emp*'").show(100, false)

*Exception stack:*

 
{code:java}
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 3
 at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.genericGet(rows.scala:201)
 at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getAs(rows.scala:35)
 at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getUTF8String(rows.scala:46)
 at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getUTF8String(rows.scala:195)
 at org.apache.spark.sql.catalyst.InternalRow$$anonfun$getAccessor$8.apply(InternalRow.scala:136)
 at org.apache.spark.sql.catalyst.InternalRow$$anonfun$getAccessor$8.apply(InternalRow.scala:136)
 at org.apache.spark.sql.catalyst.expressions.BoundReference.eval(BoundAttribute.scala:44)
 at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:389)
 at org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:152)
 at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:92)
 at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24$$anonfun$applyOrElse$23.apply(Optimizer.scala:1364)
 at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24$$anonfun$applyOrElse$23.apply(Optimizer.scala:1364)
 at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
 at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
 at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
 at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
 at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
 at scala.collection.AbstractTraversable.map(Traversable.scala:104)
 at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24.applyOrElse(Optimizer.scala:1364)
 at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24.applyOrElse(Optimizer.scala:1359)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:258)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:258)
 at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
 at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:257)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.transformDown(AnalysisHelper.scala:149)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:263)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:263)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:328)
 at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:186)
 at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:326)
 at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:263)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.transformDown(AnalysisHelper.scala:149)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:263)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:263)
 at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:328)
{code}

[jira] [Resolved] (CARBONDATA-3995) Support presto querying older complex type stores

2020-09-18 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-3995.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

> Support presto querying older complex type stores
> -
>
> Key: CARBONDATA-3995
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3995
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Before Carbon 2.0, the complex child length was stored as SHORT for string, 
> varchar, binary, date, and decimal types.
> In 2.0 it is stored as INT, so the Presto complex-query code always assumes 
> INT and hits an out-of-bounds exception when an old store is queried.
>  
> Fix: if the INT_LENGTH_COMPLEX_CHILD_BYTE_ARRAY encoding is present, parse 
> the length as INT, else as SHORT, so that both stores can be queried.
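
The compatibility check described above can be illustrated with a minimal 
sketch (not Carbon's actual reader code; the boolean flag is assumed to be 
derived from the column's encoding list):

```java
import java.nio.ByteBuffer;

// Minimal illustration of backward-compatible child-length parsing.
// Pre-2.0 stores wrote each complex-child length as a 2-byte SHORT;
// 2.0+ stores (carrying INT_LENGTH_COMPLEX_CHILD_BYTE_ARRAY) write a 4-byte INT.
public class ChildLengthReader {
  public static int readChildLength(ByteBuffer data, boolean hasIntLengthEncoding) {
    return hasIntLengthEncoding ? data.getInt() : data.getShort();
  }
}
```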



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] asfgit closed pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores

2020-09-18 Thread GitBox


asfgit closed pull request #3937:
URL: https://github.com/apache/carbondata/pull/3937


   







[GitHub] [carbondata] kunal642 commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores

2020-09-18 Thread GitBox


kunal642 commented on pull request #3937:
URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694935778


   LGTM







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3930:
URL: https://github.com/apache/carbondata/pull/3930#issuecomment-694926222


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2387/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3930: [CARBONDATA-3991]Fix the set modified time function on S3 and Alluxio…

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3930:
URL: https://github.com/apache/carbondata/pull/3930#issuecomment-694925178


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4128/
   







[GitHub] [carbondata] ajantha-bhat commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


ajantha-bhat commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694913880


   retest this please







[GitHub] [carbondata] ajantha-bhat commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


ajantha-bhat commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694913664


   The one test failure with spark 3.4 is presumably a random issue, as these 
changes are unrelated and the build passed before the comments were added. We 
need to track this down and fix it.







[GitHub] [carbondata] asfgit closed pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


asfgit closed pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936


   







[GitHub] [carbondata] ajantha-bhat commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


ajantha-bhat commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694895620


   LGTM







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694890983


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2386/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694890026


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4127/
   







[GitHub] [carbondata] akashrn5 commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores

2020-09-18 Thread GitBox


akashrn5 commented on pull request #3937:
URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694878769


   LGTM







[GitHub] [carbondata] ajantha-bhat commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores

2020-09-18 Thread GitBox


ajantha-bhat commented on pull request #3937:
URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694856042


   verified in the cluster also. Tested ok.







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3937:
URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694825616


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2384/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3937:
URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694818288


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4125/
   







[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490872439



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##
@@ -303,8 +321,15 @@ public void commitDropTable(Table table, boolean b) throws 
MetaException {
 List<String> tokens = new ArrayList<>();
 StringBuilder stack = new StringBuilder();
 int openingCount = 0;
-for (int i = 0; i < schema.length(); i++) {
-  if (schema.charAt(i) == '<') {
+int length = schema.length();

Review comment:
   Here I have handled the schema parsing to consider comma-separated 
values.
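
   The idea in the hunk above — tracking `<`/`>` nesting depth so that commas 
inside complex types are not treated as separators — can be sketched as follows 
(simplified illustration, not the exact Carbon code):

```java
import java.util.ArrayList;
import java.util.List;

// Split a Hive type-schema string on top-level separators only, keeping
// anything nested inside <...> (e.g. array<struct<a:int,b:string>>) intact.
public class SchemaSplitter {
  public static List<String> splitTopLevel(String schema, char separator) {
    List<String> tokens = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    int openingCount = 0;                 // current depth inside <...>
    for (int i = 0; i < schema.length(); i++) {
      char c = schema.charAt(i);
      if (c == '<') {
        openingCount++;
      } else if (c == '>') {
        openingCount--;
      }
      if (c == separator && openingCount == 0) {
        tokens.add(current.toString());   // top-level boundary reached
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    if (current.length() > 0) {
      tokens.add(current.toString());
    }
    return tokens;
  }
}
```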









[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490843284



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##
@@ -171,15 +171,33 @@ public static CarbonTable getCarbonTable(Configuration 
tableProperties) throws S
   columns = columns + "," + partitionColumns;
   columnTypes = columnTypes + ":" + partitionColumnTypes;
 }
-String[] columnTypeArray = 
HiveCarbonUtil.splitSchemaStringToArray(columnTypes);
-
+String[][] validatedColumnsAndTypes = validateColumnsAndTypes(columns, 
columnTypes);
 CarbonTable carbonTable = CarbonTable.buildFromTableInfo(
 HiveCarbonUtil.getTableInfo(tableName, databaseName, tablePath,
-sortColumns, columns.split(","), columnTypeArray, new 
ArrayList<>()));
+sortColumns, validatedColumnsAndTypes[0],
+validatedColumnsAndTypes[1], new ArrayList<>()));
 carbonTable.setTransactionalTable(false);
 return carbonTable;
   }
 
+  private static String[][] validateColumnsAndTypes(String columns, String 
columnTypes) {
+String[] columnTypeArray = 
HiveCarbonUtil.splitSchemaStringToArray(columnTypes);

Review comment:
   In the case of an empty table, some additional columns get added in the 
configuration. Here I have validated whether any additional columns are present 
and removed them. Also added comments.









[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490870410



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
##
@@ -154,8 +154,16 @@ private static CarbonTable getCarbonTable(Configuration 
configuration, String pa
 } else {
   carbonInputFormat = new CarbonFileInputFormat<>();
 }
-List<InputSplit> splitList =
-carbonInputFormat.getSplits(jobContext);
+List<InputSplit> splitList;
+try {
+  splitList = carbonInputFormat.getSplits(jobContext);
+} catch (IOException ex) {
+  if (ex.getMessage().contains("No Index files are present in the table 
location :")) {

Review comment:
   The external table is not queried through this getSplits() in this file.









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694803147


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2383/
   







[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490869778



##
File path: 
integration/hive/src/test/java/org/apache/carbondata/hive/HiveTestUtils.java
##
@@ -88,7 +86,6 @@ public boolean checkAnswer(ResultSet actual, ResultSet 
expected) throws SQLExcep
 }
 Collections.sort(expectedValuesList);Collections.sort(actualValuesList);
 Assert.assertArrayEquals(expectedValuesList.toArray(), 
actualValuesList.toArray());
-Assert.assertTrue(rowCountExpected > 0);

Review comment:
   Since we now also check the empty-table case, which may contain zero rows, 
I have removed this.









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694802437


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4124/
   







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


ajantha-bhat commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490853673



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
##
@@ -154,8 +154,16 @@ private static CarbonTable getCarbonTable(Configuration 
configuration, String pa
 } else {
   carbonInputFormat = new CarbonFileInputFormat<>();
 }
-List<InputSplit> splitList =
-carbonInputFormat.getSplits(jobContext);
+List<InputSplit> splitList;
+try {
+  splitList = carbonInputFormat.getSplits(jobContext);
+} catch (IOException ex) {
+  if (ex.getMessage().contains("No Index files are present in the table 
location :")) {

Review comment:
   So test with an external table, and don't skip the exception for external tables.









[GitHub] [carbondata] nihal0107 commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


nihal0107 commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490843284



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##
@@ -171,15 +171,33 @@ public static CarbonTable getCarbonTable(Configuration 
tableProperties) throws S
   columns = columns + "," + partitionColumns;
   columnTypes = columnTypes + ":" + partitionColumnTypes;
 }
-String[] columnTypeArray = 
HiveCarbonUtil.splitSchemaStringToArray(columnTypes);
-
+String[][] validatedColumnsAndTypes = validateColumnsAndTypes(columns, 
columnTypes);
 CarbonTable carbonTable = CarbonTable.buildFromTableInfo(
 HiveCarbonUtil.getTableInfo(tableName, databaseName, tablePath,
-sortColumns, columns.split(","), columnTypeArray, new 
ArrayList<>()));
+sortColumns, validatedColumnsAndTypes[0],
+validatedColumnsAndTypes[1], new ArrayList<>()));
 carbonTable.setTransactionalTable(false);
 return carbonTable;
   }
 
+  private static String[][] validateColumnsAndTypes(String columns, String 
columnTypes) {
+String[] columnTypeArray = 
HiveCarbonUtil.splitSchemaStringToArray(columnTypes);

Review comment:
   In the case of an empty table, some additional columns get added in the 
configuration. Here I have validated whether any additional columns are present 
and removed them.









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3936: [CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


ajantha-bhat commented on a change in pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#discussion_r490834840



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
##
@@ -154,8 +154,16 @@ private static CarbonTable getCarbonTable(Configuration 
configuration, String pa
 } else {
   carbonInputFormat = new CarbonFileInputFormat<>();
 }
-List<InputSplit> splitList =
-carbonInputFormat.getSplits(jobContext);
+List<InputSplit> splitList;
+try {
+  splitList = carbonInputFormat.getSplits(jobContext);
+} catch (IOException ex) {
+  if (ex.getMessage().contains("No Index files are present in the table 
location :")) {

Review comment:
   I see that Hive supports only non-transactional tables. Non-transactional 
tables are often created from a path provided by the user, so when the path 
does not contain carbon files at query time, it is a must to throw an exception.
   
   If an external table is created on a wrong path, the query would now return 
0 rows and the user would not know what went wrong.

##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##
@@ -171,15 +171,33 @@ public static CarbonTable getCarbonTable(Configuration 
tableProperties) throws S
   columns = columns + "," + partitionColumns;
   columnTypes = columnTypes + ":" + partitionColumnTypes;
 }
-String[] columnTypeArray = 
HiveCarbonUtil.splitSchemaStringToArray(columnTypes);
-
+String[][] validatedColumnsAndTypes = validateColumnsAndTypes(columns, 
columnTypes);
 CarbonTable carbonTable = CarbonTable.buildFromTableInfo(
 HiveCarbonUtil.getTableInfo(tableName, databaseName, tablePath,
-sortColumns, columns.split(","), columnTypeArray, new 
ArrayList<>()));
+sortColumns, validatedColumnsAndTypes[0],
+validatedColumnsAndTypes[1], new ArrayList<>()));
 carbonTable.setTransactionalTable(false);
 return carbonTable;
   }
 
+  private static String[][] validateColumnsAndTypes(String columns, String 
columnTypes) {
+String[] columnTypeArray = 
HiveCarbonUtil.splitSchemaStringToArray(columnTypes);

Review comment:
   What are these changes for? I didn't see anything about this in the 
description. Please add comments.

##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##
@@ -303,8 +321,15 @@ public void commitDropTable(Table table, boolean b) throws 
MetaException {
 List<String> tokens = new ArrayList<>();
 StringBuilder stack = new StringBuilder();
 int openingCount = 0;
-for (int i = 0; i < schema.length(); i++) {
-  if (schema.charAt(i) == '<') {
+int length = schema.length();

Review comment:
   same comment as above

##
File path: 
integration/hive/src/test/java/org/apache/carbondata/hive/HiveTestUtils.java
##
@@ -88,7 +86,6 @@ public boolean checkAnswer(ResultSet actual, ResultSet 
expected) throws SQLExcep
 }
 Collections.sort(expectedValuesList);Collections.sort(actualValuesList);
 Assert.assertArrayEquals(expectedValuesList.toArray(), 
actualValuesList.toArray());
-Assert.assertTrue(rowCountExpected > 0);

Review comment:
   Why was this validation removed?









[GitHub] [carbondata] ajantha-bhat commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores

2020-09-18 Thread GitBox


ajantha-bhat commented on pull request #3937:
URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694767622


   retest this please







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3937:
URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694763475


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4122/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3937:
URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694757377


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2381/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694744886


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4123/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3936: [WIP][CARBONDATA-3914] Fixed exception on reading data from carbon-hive empty table.

2020-09-18 Thread GitBox


CarbonDataQA1 commented on pull request #3936:
URL: https://github.com/apache/carbondata/pull/3936#issuecomment-694744293


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2382/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores

2020-09-18 Thread GitBox


ajantha-bhat commented on pull request #3937:
URL: https://github.com/apache/carbondata/pull/3937#issuecomment-694711366


   @kumarvishal09 , @kunal642 , @akashrn5 : please check and merge



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3935: [CARBONDATA-3993] Remove deletePartialLoadData in data loading process

2020-09-18 Thread GitBox


ajantha-bhat commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r490760218



##
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
##
@@ -267,9 +266,8 @@ object CarbonDataRDDFactory {
         throw new Exception("Exception in compaction " + exception.getMessage)
       }
     } finally {
-      executor.shutdownNow()
       try {
-        compactor.deletePartialLoadsInCompaction()

Review comment:
   a) When compaction retries, it reuses the same segment ID; if stale files are not cleaned, this produces duplicate data.
   So, before this change, we need #3934 to be merged, which can use a unique segment ID for compaction retries.
   
   b) Please check and move the logic of deletePartialLoadsInCompaction into the clean files command instead of removing it permanently. If the clean files command does not have this logic, it may not be able to clean stale files.
   
   c) Also, if the purpose of this PR is to avoid accidental data loss, you need to handle `cleanStaleDeltaFiles` in `CarbonUpdateUtil.java` and identify other such places as well. Handling it in only a few places will not guarantee that data loss cannot happen.
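
   Point (a) can be illustrated with a toy model. Everything below (the store layout, method and field names) is hypothetical and only shows why reusing a segment ID after a failed attempt surfaces stale data, and why a fresh ID per attempt avoids it:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model: a failed compaction attempt leaves a stale part file under its
// segment. If the retry reuses the same segment ID, a reader that scans all
// committed segments picks up both the stale and the fresh part files of that
// segment, i.e. duplicate data. This is NOT CarbonData's real layout or API.
public class CompactionRetrySketch {
  // segment id -> part files written under it
  static final Map<Integer, List<String>> store = new HashMap<>();
  static final Set<Integer> validSegments = new HashSet<>();

  static void writeAttempt(int segmentId, String part, boolean fail) {
    store.computeIfAbsent(segmentId, k -> new ArrayList<>()).add(part);
    if (fail) throw new RuntimeException("attempt failed; stale part left behind");
    validSegments.add(segmentId);  // only a successful attempt is committed
  }

  static List<String> readAll() {
    List<String> rows = new ArrayList<>();
    for (int id : validSegments) rows.addAll(store.get(id));
    return rows;
  }

  public static void main(String[] args) {
    // Same-ID retry: the stale part ends up inside a committed segment.
    try { writeAttempt(2, "part-stale", true); } catch (RuntimeException e) { }
    writeAttempt(2, "part-good", false);
    assert readAll().size() == 2;  // duplicate data: stale + good both read

    // Fresh-ID retry: the stale part stays in the never-committed segment 2.
    store.clear(); validSegments.clear();
    try { writeAttempt(2, "part-stale", true); } catch (RuntimeException e) { }
    writeAttempt(3, "part-good", false);
    assert readAll().size() == 1;  // only the good part is visible
  }
}
```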





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat edited a comment on pull request #3935: [CARBONDATA-3993] Remove deletePartialLoadData in data loading process

2020-09-18 Thread GitBox


ajantha-bhat edited a comment on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-694708624


   Agree with @Zhangshunyu and @akashrn5 
   a) When compaction retries, it reuses the same segment ID; if stale files are not cleaned, this produces duplicate data.
   So, before this change, we need #3934 to be merged, which can use a unique segment ID for compaction retries.
   b) Please check and move the logic of `deletePartialLoadsInCompaction` into the clean files command instead of removing it permanently. If the clean files command does not have this logic, it may not be able to clean stale files.
   c) Also, if the purpose of this PR is to avoid accidental data loss, you need to handle `cleanStaleDeltaFiles` in `CarbonUpdateUtil.java` and identify other such places as well. Handling it in only one place will not guarantee that data loss cannot happen. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat edited a comment on pull request #3935: [CARBONDATA-3993] Remove deletePartialLoadData in data loading process

2020-09-18 Thread GitBox


ajantha-bhat edited a comment on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-694708624


   Agree with @Zhangshunyu and @akashrn5 
   a) When compaction retries, it reuses the same segment ID; if stale files are not cleaned, this produces duplicate data.
   So, before this change, we need #3934 to be merged, which can use a unique segment ID for compaction retries.
   b) Please check and move the logic of deletePartialLoadsInCompaction into the clean files command instead of removing it permanently. If the clean files command does not have this logic, it may not be able to clean stale files.
   c) Also, if the purpose of this PR is to avoid accidental data loss, you need to handle `cleanStaleDeltaFiles` in `CarbonUpdateUtil.java` and identify other such places as well. Handling it in only one place will not guarantee that data loss cannot happen. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on pull request #3935: [CARBONDATA-3993] Remove deletePartialLoadData in data loading process

2020-09-18 Thread GitBox


ajantha-bhat commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-694708624


   Agree with @Zhangshunyu and @akashrn5 
   a) When compaction is retried, it uses the same segment ID; if stale files are not cleaned, this produces duplicate data.
   So, before this change, we need #3934 to be merged, which can use a unique segment ID for compaction retries.
   b) Please check and move the logic of deletePartialLoadsInCompaction into the clean files command instead of removing it permanently. If the clean files command does not have this logic, it may not be able to clean stale files.
   c) Also, if the purpose of this PR is to avoid accidental data loss, you need to handle `cleanStaleDeltaFiles` in `CarbonUpdateUtil.java` and identify other such places as well. Handling it in only one place will not guarantee that data loss cannot happen. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on pull request #3935: [CARBONDATA-3993] Remove deletePartialLoadData in data loading process

2020-09-18 Thread GitBox


akashrn5 commented on pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#issuecomment-694704817


   During insert or load it is still fine if you don't clean up, but if you remove `deletePartialLoadsInCompaction` directly, it can cause load or query failures, wrong data in query results, or extra data. This is because we don't add any entry to the table status before compaction, in order to maintain the segment ID logic of compaction. So it is dangerous to remove this blindly.
   
   I remember @ajantha-bhat faced similar issues recently during compaction.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (CARBONDATA-3995) Support presto querying older complex type stores

2020-09-18 Thread Ajantha Bhat (Jira)
Ajantha Bhat created CARBONDATA-3995:


 Summary: Support presto querying older complex type stores
 Key: CARBONDATA-3995
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3995
 Project: CarbonData
  Issue Type: Bug
Reporter: Ajantha Bhat
Assignee: Ajantha Bhat


Before Carbon 2.0, complex child lengths were stored as SHORT for string, varchar,
binary, date, and decimal types.
In 2.0 they are stored as INT, so the Presto complex query code always assumes INT
and throws an out-of-bounds exception when an old store is queried.

 

If the INT_LENGTH_COMPLEX_CHILD_BYTE_ARRAY encoding is present, parse lengths as INT;
otherwise parse them as SHORT,
so that both stores can be queried.
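
The fix can be sketched as a dispatch on the length width. The following is an illustrative model only, not CarbonData's real page layout or reader API; the encoding flag is passed here as a plain boolean and the class name is invented:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: parsing length-prefixed complex child values from a
// page byte array, choosing the length width based on whether the
// INT_LENGTH_COMPLEX_CHILD_BYTE_ARRAY encoding is present in the metadata.
public class ChildLengthParser {

  /** Reads each child value's length prefix, then skips over its bytes. */
  static int[] readChildLengths(byte[] page, boolean intLengthEncoding) {
    ByteBuffer buffer = ByteBuffer.wrap(page);
    List<Integer> lengths = new ArrayList<>();
    while (buffer.hasRemaining()) {
      // Pre-2.0 stores wrote a 2-byte SHORT length; 2.0+ stores write a 4-byte INT.
      int length = intLengthEncoding ? buffer.getInt() : buffer.getShort();
      lengths.add(length);
      buffer.position(buffer.position() + length);  // skip the value bytes
    }
    return lengths.stream().mapToInt(Integer::intValue).toArray();
  }

  public static void main(String[] args) {
    // Build a pre-2.0 style page with SHORT length prefixes: "ab" then "cde".
    ByteBuffer old = ByteBuffer.allocate(2 + 2 + 2 + 3);
    old.putShort((short) 2).put("ab".getBytes());
    old.putShort((short) 3).put("cde".getBytes());
    int[] parsed = readChildLengths(old.array(), false);
    assert parsed.length == 2 && parsed[0] == 2 && parsed[1] == 3;
    // Reading the same bytes with a 4-byte width would mis-parse the first
    // length as a huge value, which is the out-of-bounds failure mode above.
  }
}
```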



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] ajantha-bhat opened a new pull request #3937: [CARBONDATA-3995] Support presto querying older complex type stores

2020-09-18 Thread GitBox


ajantha-bhat opened a new pull request #3937:
URL: https://github.com/apache/carbondata/pull/3937


### Why is this PR needed?
    Before Carbon 2.0, complex child lengths were stored as SHORT for string, varchar, binary, date, and decimal types.
    In 2.0 they are stored as INT, so the Presto complex query code always assumes INT
    and throws an out-of-bounds exception when an old store is queried.

### What changes were proposed in this PR?
   If the INT_LENGTH_COMPLEX_CHILD_BYTE_ARRAY encoding is present, parse lengths as INT; otherwise parse them as SHORT,
   so that both stores can be queried.
   
### Does this PR introduce any user interface change?
- No
   
### Is any new testcase added?
- No [upgrade scenario]
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org