selfxp commented on a change in pull request #4885:
URL: https://github.com/apache/openwhisk/pull/4885#discussion_r414902423
##########
File path:
common/scala/src/main/scala/org/apache/openwhisk/core/containerpool/logging/SplunkLogStore.scala
##########
@@ -122,13 +125,15 @@ class SplunkLogStore(
"search" -> search,
"output_mode" -> "json",
"earliest_time" -> start
- .getOrElse(Instant.EPOCH)
+ .getOrElse(Instant.now().minus(splunkConfig.earliestTimeOffsetHours, ChronoUnit.HOURS))
.minusSeconds(splunkConfig.queryTimestampOffsetSeconds)
.toString, //assume that activation start/end are UTC zone, and splunk events are the same
"latest_time" -> end
.getOrElse(Instant.now())
.plusSeconds(splunkConfig.queryTimestampOffsetSeconds) //add 5s to avoid a timerange of 0 on short-lived activations
- .toString)).toEntity
+ .toString,
+ "max_time" -> splunkConfig.finalizeMaxTime.toString //max time for the search query to run in seconds
Review comment:
`max_time` behaves as follows: once the configured interval elapses, the
query returns whatever results it has found so far. If it did not have time to
finish, it returns zero or partial results; it never fails. This setting is a
good safeguard against querying huge data sets.
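For illustration, here is a minimal, self-contained sketch of how the time window and `max_time` from the diff above could be computed. `SplunkWindowSketch`, `WindowConfig`, `searchParams`, and the injectable `now` parameter are hypothetical names used only for this example; only the three config field names and the form keys come from the diff.

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit

object SplunkWindowSketch {
  // Hypothetical stand-in for the three splunkConfig fields used in the diff.
  final case class WindowConfig(
    earliestTimeOffsetHours: Long,
    queryTimestampOffsetSeconds: Long,
    finalizeMaxTime: Long)

  // Builds the search form parameters. `now` is a parameter (an assumption of
  // this sketch) so that the fallback window can be checked deterministically.
  def searchParams(search: String,
                   start: Option[Instant],
                   end: Option[Instant],
                   cfg: WindowConfig,
                   now: Instant = Instant.now()): Map[String, String] =
    Map(
      "search" -> search,
      "output_mode" -> "json",
      // Without an activation start, look back a bounded number of hours
      // instead of searching from the epoch.
      "earliest_time" -> start
        .getOrElse(now.minus(cfg.earliestTimeOffsetHours, ChronoUnit.HOURS))
        .minusSeconds(cfg.queryTimestampOffsetSeconds)
        .toString,
      // Pad the end so short-lived activations do not get a zero-length range.
      "latest_time" -> end
        .getOrElse(now)
        .plusSeconds(cfg.queryTimestampOffsetSeconds)
        .toString,
      // Upper bound, in seconds, on how long the search may run.
      "max_time" -> cfg.finalizeMaxTime.toString)
}
```

With no activation record (`start` and `end` both `None`), the window falls back to the last `earliestTimeOffsetHours` hours, padded by the offset on both sides.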
Yes, I did extensive testing when I first started developing this PR to see
how long the new query would take, since no `earliest_time` and `latest_time`
constraints are available when no activation record is present. The results
were not promising at first, as the query took minutes to execute, but then I
realized that the query was not correctly structured: `spath` was mixed in with
the `search` filters. Moving `spath` after the `search` filters fixed the issue,
and the search with time constraints took almost the same time as the search
without them.
before:
```
"search=search index=someindex | spath=log_message | search namespace=guest | search activation_id=a930e5ae4ad4455c8f2505d665aad282 | table log_message"
```
after (huge performance benefit, as the search executes in one step and the
parsing is applied to a small data set):
```
"search=search index=someindex | search namespace=guest | search activation_id=a930e5ae4ad4455c8f2505d665aad282 | spath=log_message | table log_message"
```
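As a rough sketch of the "after" ordering, the query string can be assembled so that all `search` filters come before the `spath` stage. `SplunkQuerySketch` and `buildSearch` are hypothetical names for this example, and the stage syntax is copied from the query quoted above rather than taken from the PR's code.

```scala
object SplunkQuerySketch {
  // Assembles the pipeline with every filter stage ahead of spath, so that
  // parsing is applied only to the events that survive the filters.
  def buildSearch(index: String,
                  namespace: String,
                  activationId: String,
                  messageField: String): String =
    Seq(
      s"search index=$index",
      s"search namespace=$namespace",
      s"search activation_id=$activationId",
      s"spath=$messageField", // stage written as in the query quoted above
      s"table $messageField").mkString(" | ")
}
```

Calling `SplunkQuerySketch.buildSearch("someindex", "guest", "a930e5ae4ad4455c8f2505d665aad282", "log_message")` yields the value of the `search` parameter in the "after" query.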