[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine

2020-08-12 Thread GitBox


ajantha-bhat commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r469701326



##
File path: integration/presto/src/main/prestodb/org/apache/carbondata/presto/impl/CarbonTableReader.java
##
@@ -281,7 +287,11 @@ private CarbonTableCacheModel getValidCacheBySchemaTableName(SchemaTableName sch
   createInputFormat(jobConf, carbonTable.getAbsoluteTableIdentifier(),
   new IndexFilter(carbonTable, filters, true), filteredPartitions);
   Job job = Job.getInstance(jobConf);
+  CarbonProperties.getInstance()
+  .addProperty(CarbonCommonConstants.IS_QUERY_FROM_PRESTO, "true");

Review comment:
   @kunal642, the current Carbon and Presto integration covers only the query path;
load and insert are not supported.
   So setting this property only in the query flow should be enough, I guess.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine

2020-08-11 Thread GitBox


ajantha-bhat commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r468503482



##
File path: integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonLocalInputSplit.java
##
@@ -127,7 +128,8 @@ public CarbonLocalInputSplit(@JsonProperty("segmentId") String segmentId,
   @JsonProperty("deleteDeltaFiles") String[] deleteDeltaFiles,
   @JsonProperty("blockletId") String blockletId,
   @JsonProperty("detailInfo") String detailInfo,
-  @JsonProperty("fileFormatOrdinal") int fileFormatOrdinal) {
+  @JsonProperty("fileFormatOrdinal") int fileFormatOrdinal,
+  boolean isDistributedPruningEnabled) {

Review comment:
   Please also keep `@JsonProperty` for this new parameter.
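
   For context, a minimal sketch of why the annotation matters. Jackson, when the parameter-names module is not registered, binds creator-constructor arguments by the name given in `@JsonProperty`, so an unannotated parameter has no property name to bind from. The stdlib-only Java below is an illustration under assumptions, not the connector's code: `Prop` is a hypothetical stand-in for `@JsonProperty`, `SplitSketch` is a hypothetical, trimmed-down class (not the real `CarbonLocalInputSplit`), and the helper only lists constructor parameters that lack a name annotation.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;

class JsonPropertySketch {
  // Returns the zero-based indexes of the parameters of the first declared
  // constructor of `type` that carry no @Prop annotation.
  static List<Integer> unnamedParameters(Class<?> type) {
    Constructor<?> ctor = type.getDeclaredConstructors()[0];
    Annotation[][] perParameter = ctor.getParameterAnnotations();
    List<Integer> missing = new ArrayList<>();
    for (int i = 0; i < perParameter.length; i++) {
      boolean named = false;
      for (Annotation a : perParameter[i]) {
        if (a instanceof Prop) {
          named = true;
        }
      }
      if (!named) {
        missing.add(i);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    System.out.println("Parameters without a name annotation: "
        + unnamedParameters(SplitSketch.class));
  }
}

// Hypothetical stand-in for Jackson's @JsonProperty (not the real annotation).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
@interface Prop {
  String value();
}

// Hypothetical, trimmed-down split; not the real CarbonLocalInputSplit.
class SplitSketch {
  final String segmentId;
  final int fileFormatOrdinal;
  final boolean isDistributedPruningEnabled;

  SplitSketch(@Prop("segmentId") String segmentId,
      @Prop("fileFormatOrdinal") int fileFormatOrdinal,
      boolean isDistributedPruningEnabled) { // unannotated, as in the diff above
    this.segmentId = segmentId;
    this.fileFormatOrdinal = fileFormatOrdinal;
    this.isDistributedPruningEnabled = isDistributedPruningEnabled;
  }
}
```

   Running `main` prints `Parameters without a name annotation: [2]`, which is the position of the new boolean. Annotating that parameter with its property name would empty the list.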









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine

2020-08-11 Thread GitBox


ajantha-bhat commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r468499772



##
File path: docs/prestodb-guide.md
##
@@ -301,3 +303,21 @@ Presto carbon only supports reading the carbon table which is written by spark c
 During reading, it supports the non-distributed indexes like block index and bloom index.
 It doesn't support Materialized View as it needs query plan to be changed and presto does not allow it.
 Also, Presto carbon supports streaming segment read from streaming table created by spark.
+
+## Presto Setup with CarbonData Distributed IndexServer
+
+### Dependency jars
+After copying all the jars from ../integration/presto/target/carbondata-presto-X.Y.Z-SNAPSHOT
+to `plugin/carbondata` directory on all nodes, ensure copying the following jars as well.
+1. Copy ../integration/spark/target/carbondata-spark_X.Y.Z-SNAPSHOT.jar
+2. Copy corresponding Spark dependency jars to the location.
+
+### Configure properties
+Configure IndexServer configurations in carbon.properties file. Refer
+[Configuring IndexServer](https://github.com/apache/carbondata/blob/master/docs/index-server.md#Configurations) for more info.
+Add  `-Dcarbon.properties.filepath=/carbon.properties` in jvm.config file.
+
+### Presto with IndexServer
+Start distributed index server. Launch presto CLI and fire SELECT query and check if the corresponding job
+is triggered in the index server application.

Review comment:
   Also mention that Spark can be used to see the loaded cache, via the
`SHOW METACACHE` command.









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine

2020-08-11 Thread GitBox


ajantha-bhat commented on a change in pull request #3885:
URL: https://github.com/apache/carbondata/pull/3885#discussion_r468499341



##
File path: docs/prestodb-guide.md
##
@@ -301,3 +303,21 @@ Presto carbon only supports reading the carbon table which is written by spark c
 During reading, it supports the non-distributed indexes like block index and bloom index.
 It doesn't support Materialized View as it needs query plan to be changed and presto does not allow it.
 Also, Presto carbon supports streaming segment read from streaming table created by spark.
+
+## Presto Setup with CarbonData Distributed IndexServer

Review comment:
   As `prestosql` is the default profile, add this section to `prestosql-guide.md`,
and in the prestodb guide link to that section, since it is common to both
Presto versions.




