arunramani opened a new pull request, #16140:
URL: https://github.com/apache/druid/pull/16140

   ### Description
   This fixes an issue where MSQ ingestion failed if the first monitor in the 
list is `ServiceStatusMonitor`. Before this change, setting the 
`ServiceStatusMonitor` as the first monitor will fail with errors in `CliPeon` 
like below
   
   ```
   1) Error in custom provider, java.lang.RuntimeException: 
com.fasterxml.jackson.databind.exc.ValueInstantiationException: Cannot 
construct instance of 
`org.apache.druid.segment.virtual.ExpressionVirtualColumn`, problem: Unable to 
provision, see the following errors:
   
   1) Problem parsing object at prefix[druid.lookup]: Cannot construct instance 
of `org.apache.druid.query.lookup.LookupListeningAnnouncerConfig`, problem: 
Unable to provision, see the following errors:
   
   1) Error in custom provider, java.lang.IllegalStateException: This is a 
proxy used to support circular references. The object we're proxying is not 
constructed yet. Please wait until after injection has completed to use this 
object.
     at org.apache.druid.cli.CliPeon$1.getDataSourceFromTask(CliPeon.java:344) 
(via modules: com.google.inject.util.Modules$OverrideModule -> 
com.google.inject.util.Modules$OverrideModule -> org.apache.druid.cli.CliPeon$1)
     at org.apache.druid.cli.CliPeon$1.getDataSourceFromTask(CliPeon.java:344) 
(via modules: com.google.inject.util.Modules$OverrideModule -> 
com.google.inject.util.Modules$OverrideModule -> org.apache.druid.cli.CliPeon$1)
     while locating java.lang.String annotated with 
@com.google.inject.name.Named(value="druidDataSource")
       for field at 
org.apache.druid.server.metrics.DataSourceTaskIdHolder.dataSource(DataSourceTaskIdHolder.java:29)
     at 
org.apache.druid.server.metrics.MetricsModule.configure(MetricsModule.java:92) 
(via modules: com.google.inject.util.Modules$OverrideModule -> 
com.google.inject.util.Modules$OverrideModule -> 
org.apache.druid.server.metrics.MetricsModule)
     while locating org.apache.druid.server.metrics.DataSourceTaskIdHolder
   Caused by: java.lang.IllegalStateException: This is a proxy used to support 
circular references. The object we're proxying is not constructed yet. Please 
wait until after injection has completed to use this object.
        at 
com.google.common.base.Preconditions.checkState(Preconditions.java:512)
        at 
com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:36)
        at com.sun.proxy.$Proxy103.getDataSource(Unknown Source)
        at 
org.apache.druid.cli.CliPeon$1.getDataSourceFromTask(CliPeon.java:344)
        at 
org.apache.druid.cli.CliPeon$1$$FastClassByGuice$$1ae344b1.invoke(<generated>)
        at 
com.google.inject.internal.ProviderMethod$FastClassProviderMethod.doProvision(ProviderMethod.java:264)
        at 
com.google.inject.internal.ProviderMethod$Factory.provision(ProviderMethod.java:401)
        at 
com.google.inject.internal.ProviderMethod$Factory.get(ProviderMethod.java:376)
        at 
com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
        at 
com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)
   ```
   
   The problem is happening when the task file is deserialized in 
`CliPeon.readTask`. This is the sequence of events
   * `CliPeon.readTask` uses Jackson to deserialize the task
   * Jackson uses guice to find some injectable values. One of these is 
`DataSourceTaskIdHolder`
   * To inject that value, guice decides to use the provider annotated by 
`CliPeon.getDataSourceFromTask` and `CliPeon.getTaskIDFromTask`.
   
   This creates a circular ref since `readTask` is the provider for `Task` and 
`CliPeon.getDataSourceFromTask` and `CliPeon.getTaskIDFromTask` use `Task` as 
their inputs.
   
   However if you set any other monitor first such as `JvmMonitor`, then the 
injector doesn’t use the providers from `CliPeon`. Jackson is able to bind both 
datasource and task ID from the task file itself.
   
   For now triggering the binding of `DataSourceTaskIdHolder` before any of the 
monitors are loaded resolves the issue but this does not seem like the right 
fix. I have not been able to figure out what the correct fix is but I suspect 
it's a lot deeper than this one-liner.
   
   ##### Key changed/added classes in this PR
    * `MetricsModule`
   
   This PR has:
   
   - [x] been self-reviewed.
   - [ ] a release note entry in the PR description.
   - [ ] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to