[GitHub] [carbondata] jack86596 commented on a change in pull request #4057: [CARBONDATA-4088] Drop metacache didn't clear some cache information …

2020-12-23 Thread GitBox


jack86596 commented on a change in pull request #4057:
URL: https://github.com/apache/carbondata/pull/4057#discussion_r547918254



##
File path: docs/configuration-parameters.md
##
@@ -149,6 +149,7 @@ This section provides the details of all the configurations 
required for the Car
 | carbon.max.pagination.lru.cache.size.in.mb | -1 | Maximum memory **(in MB)** 
upto which the SDK pagination reader can cache the blocklet rows. Suggest to 
configure as multiple of blocklet size. Default value of -1 means there is no 
memory limit for caching. Only integer values greater than 0 are accepted. |
 | carbon.partition.max.driver.lru.cache.size | -1 | Maximum memory **(in MB)** 
upto which driver can cache partition metadata. Beyond this, least recently 
used data will be removed from cache before loading new set of values.
 | carbon.mapOrderPushDown._.column| empty | If order by 
column is in sort column, specify that sort column here to avoid ordering at 
map task . |
+| carbon.metacache.expiration.seconds | Long.MAX_VALUE | Expiration time **(in 
seconds)** for tableInfo cache in CarbonMetadata and tableModifiedTime in 
CarbonFileMetastore, after the time configured since last access to the cache 
entry, tableInfo and tableModifiedTime will be removed from each cache. Recent 
access will refresh the timer. Default value of Long.MAX_VALUE means the cache 
will not be expired by time. **NOTE:** At the time when cache is being expired, 
queries on the table may fail with NullPointerException. |

Review comment:
   @kunal642 https://issues.apache.org/jira/browse/CARBONDATA-4098 raised 
one jira to tack the null pointer issue.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] jack86596 commented on a change in pull request #4057: [CARBONDATA-4088] Drop metacache didn't clear some cache information …

2020-12-22 Thread GitBox


jack86596 commented on a change in pull request #4057:
URL: https://github.com/apache/carbondata/pull/4057#discussion_r547766036



##
File path: docs/configuration-parameters.md
##
@@ -149,6 +149,7 @@ This section provides the details of all the configurations 
required for the Car
 | carbon.max.pagination.lru.cache.size.in.mb | -1 | Maximum memory **(in MB)** 
upto which the SDK pagination reader can cache the blocklet rows. Suggest to 
configure as multiple of blocklet size. Default value of -1 means there is no 
memory limit for caching. Only integer values greater than 0 are accepted. |
 | carbon.partition.max.driver.lru.cache.size | -1 | Maximum memory **(in MB)** 
upto which driver can cache partition metadata. Beyond this, least recently 
used data will be removed from cache before loading new set of values.
 | carbon.mapOrderPushDown._.column| empty | If order by 
column is in sort column, specify that sort column here to avoid ordering at 
map task . |
+| carbon.metacache.expiration.seconds | Long.MAX_VALUE | Expiration time **(in 
seconds)** for tableInfo cache in CarbonMetadata and tableModifiedTime in 
CarbonFileMetastore, after the time configured since last access to the cache 
entry, tableInfo and tableModifiedTime will be removed from each cache. Recent 
access will refresh the timer. Default value of Long.MAX_VALUE means the cache 
will not be expired by time. **NOTE:** At the time when cache is being expired, 
queries on the table may fail with NullPointerException. |

Review comment:
   I think this feature is just for those customers who care about the 
memory leak. So if the customer doesn't want memory leak, he can choose a 
proper expired time which at the time of cache expired, the table will never be 
queried. For example, customer will have a new table every day, and this table 
will be queried only at that day. So he can choose 1 week as the property value 
so that this table cache will only leak for one week and after one week will be 
cleared.
   @ajantha-bhat the memory leak is not about the index cache, it is table 
metadata cache.
   For those customers who don't care about memory leak, they don't need to do 
anything. All the behavior keep same as before change.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] jack86596 commented on a change in pull request #4057: [CARBONDATA-4088] Drop metacache didn't clear some cache information …

2020-12-22 Thread GitBox


jack86596 commented on a change in pull request #4057:
URL: https://github.com/apache/carbondata/pull/4057#discussion_r547766036



##
File path: docs/configuration-parameters.md
##
@@ -149,6 +149,7 @@ This section provides the details of all the configurations 
required for the Car
 | carbon.max.pagination.lru.cache.size.in.mb | -1 | Maximum memory **(in MB)** 
upto which the SDK pagination reader can cache the blocklet rows. Suggest to 
configure as multiple of blocklet size. Default value of -1 means there is no 
memory limit for caching. Only integer values greater than 0 are accepted. |
 | carbon.partition.max.driver.lru.cache.size | -1 | Maximum memory **(in MB)** 
upto which driver can cache partition metadata. Beyond this, least recently 
used data will be removed from cache before loading new set of values.
 | carbon.mapOrderPushDown._.column| empty | If order by 
column is in sort column, specify that sort column here to avoid ordering at 
map task . |
+| carbon.metacache.expiration.seconds | Long.MAX_VALUE | Expiration time **(in 
seconds)** for tableInfo cache in CarbonMetadata and tableModifiedTime in 
CarbonFileMetastore, after the time configured since last access to the cache 
entry, tableInfo and tableModifiedTime will be removed from each cache. Recent 
access will refresh the timer. Default value of Long.MAX_VALUE means the cache 
will not be expired by time. **NOTE:** At the time when cache is being expired, 
queries on the table may fail with NullPointerException. |

Review comment:
   I think this feature is just for those customers who care about the 
memory leak. So if the customer doesn't want memory leak, he can choose a 
proper expired time which at the time of cache expired, the table will never be 
queried. For example, customer will have a new table every day, and this table 
will be queried only at that day. So he can choose 1 week as the property value 
so that this table cache will only leak for one week and after one week will be 
cleared.
   @ajantha-bhat the memory leak is not about the index cache, it is table 
metadata cache.
   For those customers who don't care about memory leak, they don't need to 
anything. All the behavior keep same as before change.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] jack86596 commented on a change in pull request #4057: [CARBONDATA-4088] Drop metacache didn't clear some cache information …

2020-12-22 Thread GitBox


jack86596 commented on a change in pull request #4057:
URL: https://github.com/apache/carbondata/pull/4057#discussion_r547766036



##
File path: docs/configuration-parameters.md
##
@@ -149,6 +149,7 @@ This section provides the details of all the configurations 
required for the Car
 | carbon.max.pagination.lru.cache.size.in.mb | -1 | Maximum memory **(in MB)** 
upto which the SDK pagination reader can cache the blocklet rows. Suggest to 
configure as multiple of blocklet size. Default value of -1 means there is no 
memory limit for caching. Only integer values greater than 0 are accepted. |
 | carbon.partition.max.driver.lru.cache.size | -1 | Maximum memory **(in MB)** 
upto which driver can cache partition metadata. Beyond this, least recently 
used data will be removed from cache before loading new set of values.
 | carbon.mapOrderPushDown._.column| empty | If order by 
column is in sort column, specify that sort column here to avoid ordering at 
map task . |
+| carbon.metacache.expiration.seconds | Long.MAX_VALUE | Expiration time **(in 
seconds)** for tableInfo cache in CarbonMetadata and tableModifiedTime in 
CarbonFileMetastore, after the time configured since last access to the cache 
entry, tableInfo and tableModifiedTime will be removed from each cache. Recent 
access will refresh the timer. Default value of Long.MAX_VALUE means the cache 
will not be expired by time. **NOTE:** At the time when cache is being expired, 
queries on the table may fail with NullPointerException. |

Review comment:
   I think this feature is just for those customers who care about the 
memory leak. So if the customer doesn't want memory leak, he can choose a 
proper expired time which at the time of cache expired, the table will never be 
queried. For example, customer will have a new table every day, and this table 
will be queried only at that day. So he can choose 1 week as the property value 
so that this table cache will only leak for one week and after one week will be 
cleared.
   @ajantha-bhat the memory leak is not about the index cache, it is table 
metadata cache.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] jack86596 commented on a change in pull request #4057: [CARBONDATA-4088] Drop metacache didn't clear some cache information …

2020-12-22 Thread GitBox


jack86596 commented on a change in pull request #4057:
URL: https://github.com/apache/carbondata/pull/4057#discussion_r547692764



##
File path: docs/configuration-parameters.md
##
@@ -149,6 +149,7 @@ This section provides the details of all the configurations 
required for the Car
 | carbon.max.pagination.lru.cache.size.in.mb | -1 | Maximum memory **(in MB)** 
upto which the SDK pagination reader can cache the blocklet rows. Suggest to 
configure as multiple of blocklet size. Default value of -1 means there is no 
memory limit for caching. Only integer values greater than 0 are accepted. |
 | carbon.partition.max.driver.lru.cache.size | -1 | Maximum memory **(in MB)** 
upto which driver can cache partition metadata. Beyond this, least recently 
used data will be removed from cache before loading new set of values.
 | carbon.mapOrderPushDown._.column| empty | If order by 
column is in sort column, specify that sort column here to avoid ordering at 
map task . |
+| carbon.metacache.expiration.seconds | Long.MAX_VALUE | Expiration time **(in 
seconds)** for tableInfo cache in CarbonMetadata and tableModifiedTime in 
CarbonFileMetastore, after the time configured since last access to the cache 
entry, tableInfo and tableModifiedTime will be removed from each cache. Recent 
access will refresh the timer. Default value of Long.MAX_VALUE means the cache 
will not be expired by time. **NOTE:** At the time when cache is being expired, 
queries on the table may fail with NullPointerException. |

Review comment:
   Yes, if we set carbon.metacache.expiration.seconds to very small like 
1s, UT will fail with NullPointerException.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] jack86596 commented on a change in pull request #4057: [CARBONDATA-4088] Drop metacache didn't clear some cache information …

2020-12-22 Thread GitBox


jack86596 commented on a change in pull request #4057:
URL: https://github.com/apache/carbondata/pull/4057#discussion_r547319531



##
File path: docs/configuration-parameters.md
##
@@ -149,6 +149,7 @@ This section provides the details of all the configurations 
required for the Car
 | carbon.max.pagination.lru.cache.size.in.mb | -1 | Maximum memory **(in MB)** 
upto which the SDK pagination reader can cache the blocklet rows. Suggest to 
configure as multiple of blocklet size. Default value of -1 means there is no 
memory limit for caching. Only integer values greater than 0 are accepted. |
 | carbon.partition.max.driver.lru.cache.size | -1 | Maximum memory **(in MB)** 
upto which driver can cache partition metadata. Beyond this, least recently 
used data will be removed from cache before loading new set of values.
 | carbon.mapOrderPushDown._.column| empty | If order by 
column is in sort column, specify that sort column here to avoid ordering at 
map task . |
+| carbon.metacache.expiration.seconds | Long.MAX_VALUE | Expiration time **(in 
seconds)** for tableInfo cache in CarbonMetadata, after the time configured 
since last access to the cache entry, tableInfo will be removed from cache. 
Recent access will refresh the timer. Default value of Long.MAX_VALUE means the 
cache will not be expired by time. **NOTE:** At the time when cache is being 
expired, queries on the table may fail with NullPointerException. |

Review comment:
   added.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] jack86596 commented on a change in pull request #4057: [CARBONDATA-4088] Drop metacache didn't clear some cache information …

2020-12-22 Thread GitBox


jack86596 commented on a change in pull request #4057:
URL: https://github.com/apache/carbondata/pull/4057#discussion_r547319412



##
File path: 
core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
##
@@ -2086,6 +2086,27 @@ public boolean isSetLenientEnabled() {
 return Boolean.parseBoolean(configuredValue);
   }
 
+  public long getMetaCacheExpirationTime() {
+String configuredValue = CarbonProperties.getInstance()
+
.getProperty(CarbonCommonConstants.CARBON_METACACHE_EXPIRATION_TIME_IN_SECONDS);
+if (configuredValue == null || configuredValue.equalsIgnoreCase("0")) {
+  return 
CarbonCommonConstants.CARBON_METACACHE_EXPIRATION_TIME_IN_SECONDS_DEFAULT;
+}
+try {
+  long expirationTime = Long.parseLong(configuredValue);
+  LOGGER.info("Value for "
+  + CarbonCommonConstants.CARBON_METACACHE_EXPIRATION_TIME_IN_SECONDS 
+ " is "
+  + expirationTime + ".");
+  return expirationTime;
+} catch (NumberFormatException e) {
+  LOGGER.info(configuredValue + " is not a valid input for "

Review comment:
   changed to warning log.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org