selfxp commented on a change in pull request #4885:
URL: https://github.com/apache/openwhisk/pull/4885#discussion_r414902423



##########
File path: common/scala/src/main/scala/org/apache/openwhisk/core/containerpool/logging/SplunkLogStore.scala
##########
@@ -122,13 +125,15 @@ class SplunkLogStore(
         "search" -> search,
         "output_mode" -> "json",
         "earliest_time" -> start
-          .getOrElse(Instant.EPOCH)
+          .getOrElse(Instant.now().minus(splunkConfig.earliestTimeOffsetHours, ChronoUnit.HOURS))
           .minusSeconds(splunkConfig.queryTimestampOffsetSeconds)
           .toString, //assume that activation start/end are UTC zone, and splunk events are the same
         "latest_time" -> end
           .getOrElse(Instant.now())
           .plusSeconds(splunkConfig.queryTimestampOffsetSeconds) //add 5s to avoid a timerange of 0 on short-lived activations
-          .toString)).toEntity
+          .toString,
+        "max_time" -> splunkConfig.finalizeMaxTime.toString //max time for the search query to run in seconds

Review comment:
   `max_time` behaves as follows: once the configured interval elapses, the 
query returns whatever results it has found so far. If it did not have time to 
finish, it returns zero or partial results; it never fails. This setting is a 
good safeguard against querying huge data sets.
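
To make the parameter handling concrete, here is a minimal, self-contained sketch of how the request parameters from the diff above could be assembled. `QueryConfig` and `SplunkQueryParams` are hypothetical stand-ins (only the three field names come from the diff), and `now` is injected purely so the defaults are easy to check; this is not the actual `SplunkLogStore` code.

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit

// Hypothetical stand-in for the splunkConfig fields used in the diff.
final case class QueryConfig(earliestTimeOffsetHours: Long,
                             queryTimestampOffsetSeconds: Long,
                             finalizeMaxTime: Int)

object SplunkQueryParams {
  // Builds the form parameters for the search request.
  // `now` is a parameter (assumption, not in the PR) so the result is deterministic in tests.
  def build(search: String,
            start: Option[Instant],
            end: Option[Instant],
            config: QueryConfig,
            now: Instant = Instant.now()): Map[String, String] =
    Map(
      "search" -> search,
      "output_mode" -> "json",
      // With no activation start, look back a bounded number of hours instead of to the epoch.
      "earliest_time" -> start
        .getOrElse(now.minus(config.earliestTimeOffsetHours, ChronoUnit.HOURS))
        .minusSeconds(config.queryTimestampOffsetSeconds)
        .toString,
      "latest_time" -> end
        .getOrElse(now)
        .plusSeconds(config.queryTimestampOffsetSeconds)
        .toString,
      // Upper bound, in seconds, on how long the search may run.
      "max_time" -> config.finalizeMaxTime.toString)
}
```

Because a query cut off by `max_time` returns zero or partial results rather than an error, a caller would have to treat an empty result as "possibly incomplete" and not as proof that no logs exist.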
   
   Yes, when I first started developing this PR I tested extensively how long 
the new query would take, since no `earliest_time` and `latest_time` 
constraints are available when no activation record is present. The results 
were not promising at first, with query execution taking minutes, but then I 
realized that the query was not correctly formatted: `spath` was mixed in with 
the `search` parameters. Moving `spath` after the `search` filters fixed the 
issue, and the query without time constraints then took almost the same time 
as the query with them.
   
   before: 
   ```
   "search=search index=someindex | spath=log_message | search namespace=guest | search activation_id=a930e5ae4ad4455c8f2505d665aad282 | table log_message"
   ```
   after (huge performance benefits as the search executes in one step and the 
parsing is applied to a small dataset):
   ```
   "search=search index=someindex | search namespace=guest | search activation_id=a930e5ae4ad4455c8f2505d665aad282 | spath=log_message | table log_message"
   ```
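
As a rough sketch of the reordering, the query string could be assembled so that the `search` filters always come before `spath`. `SplunkSearchQuery` and its parameter names are hypothetical, not the PR's API; the stage syntax is copied verbatim from the example above, and the result is only the value of the `search` form parameter (without the `search=` prefix).

```scala
object SplunkSearchQuery {
  // Hypothetical helper: builds the pipeline with filters first and parsing last,
  // so that spath only runs on the events that survived the filters.
  def build(index: String,
            namespace: String,
            activationId: String,
            logMessageField: String): String =
    Seq(
      s"search index=$index",
      s"search namespace=$namespace",
      s"search activation_id=$activationId",
      s"spath=$logMessageField", // stage written as in the comment above
      s"table $logMessageField").mkString(" | ")
}
```

Keeping the stages in a single ordered `Seq` makes it harder to reintroduce the slow ordering by accident when new filters are added.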





