[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] Distributed index server

2019-04-22 Thread GitBox
xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] 
Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#discussion_r277489179
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
 ##
 @@ -1385,4 +1387,14 @@ public static boolean hasMVDataMap(CarbonTable 
carbonTable) throws IOException {
   return SortScopeOptions.getSortScope(sortScope);
 }
   }
+
+  @Override public void write(DataOutput out) throws IOException {
+tableInfo.write(out);
+  }
+
+  @Override public void readFields(DataInput in) throws IOException {
 
 Review comment:
   so, the `in` and `out` here is actually `TableInfo` instead of 
`CarbonTable`? I don't think it's an elegant code


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] Distributed index server

2019-04-22 Thread GitBox
xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] 
Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#discussion_r277489371
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
 ##
 @@ -1601,4 +1601,43 @@ private void validateDetailQueryBatchSize() {
   }
 }
   }
+
+  public boolean isDistributedPruningEnabled(String dbName, String tableName) {
+String configuredValue = getSessionPropertyValue(
+CarbonCommonConstants.CARBON_ENABLE_INDEX_SERVER + "." + dbName + "." 
+ tableName);
+if (configuredValue == null) {
+  configuredValue = 
getProperty(CarbonCommonConstants.CARBON_ENABLE_INDEX_SERVER);
+}
+boolean isServerEnabledByUser = Boolean.parseBoolean(configuredValue);
+if (isServerEnabledByUser) {
+  LOGGER.error("Distributed Index server is enabled for " + dbName + "." + 
tableName);
+}
+return isServerEnabledByUser;
+  }
+
+  public String getIndexServerIP() {
+return 
carbonProperties.getProperty(CarbonCommonConstants.CARBON_INDEX_SERVER_IP, "");
+  }
+
+  public int getIndexServerPort() {
+String configuredPort =
+
carbonProperties.getProperty(CarbonCommonConstants.CARBON_INDEX_SERVER_PORT,
+CarbonCommonConstants.CARBON_INDEX_SERVER_PORT_DEFAULT);
+try {
+  return Integer.parseInt(configuredPort);
+} catch (NumberFormatException e) {
+  LOGGER.info("Configured port for index server is not a valid number, 
using "
 
 Review comment:
   use `warn` level


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] Distributed index server

2019-04-22 Thread GitBox
xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] 
Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#discussion_r277488429
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
 ##
 @@ -144,4 +150,19 @@ public void setColumnSchema(List 
columnSchema) {
 this.inputSplit.setColumnSchema(columnSchema);
   }
 
+
+
+  @Override public void write(DataOutput out) throws IOException {
+out.writeUTF(dataMapUniqueId);
+inputSplit.write(out);
+super.write(out);
+  }
+
+  @Override public void readFields(DataInput in) throws IOException {
+dataMapUniqueId = in.readUTF();
+inputSplit = new CarbonInputSplit();
+inputSplit.readFields(in);
+super.readFields(in);
 
 Review comment:
   this has the same problem as described above


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] Distributed index server

2019-04-22 Thread GitBox
xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] 
Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#discussion_r277288298
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ##
 @@ -2104,4 +2104,34 @@ private CarbonCommonConstants() {
*/
   public static final String CARBON_QUERY_DATAMAP_BLOOM_CACHE_SIZE_DEFAULT_VAL 
= "512";
 
+  /**
+   * The IP on which Index Server will be started.
+   */
+  @CarbonProperty
+  public static final String CARBON_INDEX_SERVER_IP = "carbon.index.server.ip";
+
+  /**
+   * The Port to be used to start Index Server.
+   */
+  @CarbonProperty
+  public static final String CARBON_INDEX_SERVER_PORT = 
"carbon.index.server.port";
+
+  public static final String CARBON_INDEX_SERVER_PORT_DEFAULT = "8990";
+
+  /**
+   * Whether to use index server for caching and pruning or not.
+   * This property can be used for
+   * 1. the whole application(carbon.properties).
+   * 2. the whole session(set carbon.enable.index.server)
+   * 3. a specific table for one session (set 
carbon.enable.index.server..)
+   */
+  @CarbonProperty(dynamicConfigurable = true)
+  public static final String CARBON_ENABLE_INDEX_SERVER = 
"carbon.enable.index.server";
+
+  /**
+   * Property is used to enable/disable fallback for indexserver.
+   * Used for testing purposes only.
+   */
+  public static final String CARBON_DISABLE_INDEX_SERVER_FALLBACK = 
"carbon.disable.index.server"
+  + ".fallback";
 
 Review comment:
   ???
   please keep the parameter in one line


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] Distributed index server

2019-04-22 Thread GitBox
xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] 
Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#discussion_r277487400
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
 ##
 @@ -144,4 +150,19 @@ public void setColumnSchema(List 
columnSchema) {
 this.inputSplit.setColumnSchema(columnSchema);
   }
 
+
+
+  @Override public void write(DataOutput out) throws IOException {
+out.writeUTF(dataMapUniqueId);
+inputSplit.write(out);
+super.write(out);
 
 Review comment:
   ???
   usually during serialization/deserialization, we call the parent's method 
first to ensure the compatibility, so here may contains bugs.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] Distributed index server

2019-04-22 Thread GitBox
xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] 
Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#discussion_r277288005
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ##
 @@ -2104,4 +2104,34 @@ private CarbonCommonConstants() {
*/
   public static final String CARBON_QUERY_DATAMAP_BLOOM_CACHE_SIZE_DEFAULT_VAL 
= "512";
 
+  /**
+   * The IP on which Index Server will be started.
+   */
+  @CarbonProperty
+  public static final String CARBON_INDEX_SERVER_IP = "carbon.index.server.ip";
+
+  /**
+   * The Port to be used to start Index Server.
+   */
+  @CarbonProperty
+  public static final String CARBON_INDEX_SERVER_PORT = 
"carbon.index.server.port";
+
+  public static final String CARBON_INDEX_SERVER_PORT_DEFAULT = "8990";
 
 Review comment:
   Why is this configured?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] Distributed index server

2019-04-22 Thread GitBox
xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] 
Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#discussion_r277489973
 
 

 ##
 File path: 
examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala
 ##
 @@ -48,93 +48,39 @@ object CarbonSessionExample {
 
 val rootPath = new File(this.getClass.getResource("/").getPath
 + "../../../..").getCanonicalPath
+import spark._
+val normalTable = "carbon_normal"
+val bloomDMSampleTable = "carbon_bloom"
+val dataMapName = "bloom_dm"
+//sql(
 
 Review comment:
   Why comment these code?
   If not required, just remove them


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] Distributed index server

2019-04-22 Thread GitBox
xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] 
Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#discussion_r277489260
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
 ##
 @@ -1601,4 +1601,43 @@ private void validateDetailQueryBatchSize() {
   }
 }
   }
+
+  public boolean isDistributedPruningEnabled(String dbName, String tableName) {
+String configuredValue = getSessionPropertyValue(
+CarbonCommonConstants.CARBON_ENABLE_INDEX_SERVER + "." + dbName + "." 
+ tableName);
+if (configuredValue == null) {
+  configuredValue = 
getProperty(CarbonCommonConstants.CARBON_ENABLE_INDEX_SERVER);
+}
+boolean isServerEnabledByUser = Boolean.parseBoolean(configuredValue);
+if (isServerEnabledByUser) {
+  LOGGER.error("Distributed Index server is enabled for " + dbName + "." + 
tableName);
 
 Review comment:
   error level?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] Distributed index server

2019-04-22 Thread GitBox
xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] 
Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#discussion_r277291187
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
 ##
 @@ -467,6 +467,33 @@ public void clearInvalidSegments(CarbonTable carbonTable, 
List segments
 
   }
 
+  public void refreshSegmentCacheIfRequired(CarbonTable carbonTable,
+  SegmentUpdateStatusManager updateStatusManager, List 
filteredSegmentToAccess)
+  throws IOException {
+List toBeCleanedSegments = new ArrayList<>();
+for (Segment filteredSegment : filteredSegmentToAccess) {
+  boolean refreshNeeded =
+  
DataMapStoreManager.getInstance().getTableSegmentRefresher(carbonTable)
+  .isRefreshNeeded(filteredSegment,
+  
updateStatusManager.getInvalidTimestampRange(filteredSegment.getSegmentNo()));
+  if (refreshNeeded) {
+toBeCleanedSegments.add(filteredSegment.getSegmentNo());
+  }
+}
+// Clean segments if refresh is needed
+for (Segment segment : filteredSegmentToAccess) {
+  if 
(DataMapStoreManager.getInstance().getTableSegmentRefresher(carbonTable)
+  .isRefreshNeeded(segment.getSegmentNo())) {
+toBeCleanedSegments.add(segment.getSegmentNo());
+  }
+}
+if (toBeCleanedSegments.size() > 0) {
+  DataMapStoreManager.getInstance()
 
 Review comment:
   please optimize the code format to keep the code more compact


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] Distributed index server

2019-04-22 Thread GitBox
xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] 
Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#discussion_r277287840
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
 ##
 @@ -54,7 +54,7 @@
   /**
* totalSize size of the cache
*/
-  private long currentSize;
+  public long currentSize;
 
 Review comment:
   Why is this required?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] Distributed index server

2019-04-22 Thread GitBox
xuchuanyin commented on a change in pull request #3177: [CARBONDATA-3306] 
Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#discussion_r277487010
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/indexstore/Blocklet.java
 ##
 @@ -21,7 +21,7 @@
 import java.io.IOException;
 import java.io.Serializable;
 
-import org.apache.carbondata.core.metadata.schema.table.Writable;
+import org.apache.hadoop.io.Writable;
 
 Review comment:
   ? why is this modification required?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services