Roland Johann created ZEPPELIN-3765:
---------------------------------------

             Summary: Dependency download via Proxy inefficient
                 Key: ZEPPELIN-3765
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3765
             Project: Zeppelin
          Issue Type: Bug
          Components: Interpreters
    Affects Versions: 0.8.0
            Reporter: Roland Johann


Zeppelin tries to download dependencies from repositories in a fixed order:
 * default maven central
 * local
 * custom/self defined

When Zeppelin must use an HTTP proxy, defining a custom Maven Central 
repository with proxy properties solves that particular issue but introduces 
another one, caused by the order in which the defined repositories are tried: 
the default Maven Central is used first, and when the connection can't be 
established, the underlying libraries (Sonatype Aether and Apache Wagon) wait 
120 seconds until the request fails before trying the subsequent repositories.

In the current implementation one can configure Aether to use shorter timeouts 
with the system properties `aether.connector.connectTimeout` and 
`aether.connector.requestTimeout`. These are read at 
`WagonRepositoryConnector.java:310` but are never passed to the actual HTTP 
client.

Because Apache Wagon is used with its `LightweightHttpWagon` implementation, 
which itself uses the plain Java `sun.net.www.http.HttpClient`, the properties 
that actually set the timeouts are `sun.net.client.defaultConnectTimeout` and 
`sun.net.client.defaultReadTimeout`.

These properties can be used as a workaround, but we shouldn't rely on timeouts 
here.
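
As a rough illustration of the workaround, the two JDK properties named above 
could be set as sketched below. The class name and the timeout values are 
hypothetical and not part of Zeppelin. The JDK typically reads these 
properties only once, so they would need to be set before the first HTTP 
connection is opened, or passed as `-D` flags to the JVM that performs the 
download.

```java
public final class DependencyTimeoutWorkaround {

    // Hypothetical values in milliseconds; choose what suits your network.
    public static final String CONNECT_TIMEOUT_MS = "5000";
    public static final String READ_TIMEOUT_MS = "10000";

    private DependencyTimeoutWorkaround() {
    }

    /**
     * Sets the JVM-wide default timeouts used by the JDK's built-in
     * HTTP protocol handler. Call this as early as possible.
     */
    public static void apply() {
        System.setProperty("sun.net.client.defaultConnectTimeout", CONNECT_TIMEOUT_MS);
        System.setProperty("sun.net.client.defaultReadTimeout", READ_TIMEOUT_MS);
    }

    public static void main(String[] args) {
        apply();
        System.out.println("connect timeout (ms): "
                + System.getProperty("sun.net.client.defaultConnectTimeout"));
        System.out.println("read timeout (ms): "
                + System.getProperty("sun.net.client.defaultReadTimeout"));
    }
}
```

The equivalent JVM flags would be 
`-Dsun.net.client.defaultConnectTimeout=5000` and 
`-Dsun.net.client.defaultReadTimeout=10000`. This only shortens the wait for 
the unreachable default repository; it does not change the resolution order.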

We can solve the problem in several ways:

Change the way default repositories are treated/specified:
 * ordering of repos: first custom, then default
 * don't specify default repositories in code but in the `interpreter.json` 
configuration file
 * if they are still defined in code, the default Maven Central repo should get 
optional proxy/auth at `org.apache.zeppelin.dep.Booter#newCentralRepository`
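
The first option in the list above (custom repositories before the defaults) 
can be sketched as follows. This is a minimal illustration in plain Java, not 
Zeppelin's actual code: the `Repo` class, the `resolutionOrder` method and the 
`nexus.example.com` URL are hypothetical stand-ins for the real repository 
objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public final class RepositoryOrderSketch {

    /** Hypothetical stand-in for a remote repository definition. */
    public static final class Repo {
        public final String id;
        public final String url;

        public Repo(String id, String url) {
            this.id = id;
            this.url = url;
        }
    }

    private RepositoryOrderSketch() {
    }

    /**
     * Returns the repositories in the order they should be tried:
     * user-defined ones first, built-in defaults afterwards.
     */
    public static List<Repo> resolutionOrder(List<Repo> custom, List<Repo> defaults) {
        List<Repo> ordered = new ArrayList<>(custom);
        ordered.addAll(defaults);
        return ordered;
    }

    public static void main(String[] args) {
        List<Repo> defaults = Arrays.asList(
                new Repo("central", "https://repo1.maven.org/maven2/"));
        // Hypothetical proxied mirror that is reachable from inside the network.
        List<Repo> custom = Arrays.asList(
                new Repo("company-mirror", "https://nexus.example.com/repository/maven-public/"));

        for (Repo repo : resolutionOrder(custom, defaults)) {
            System.out.println(repo.id + " -> " + repo.url);
        }
    }
}
```

With this ordering, a reachable custom repository would be tried before the 
default Maven Central, so the 120-second wait would only occur if the custom 
repository cannot supply the artifact.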

I would prefer that there are no repo definitions in code and that the user 
can specify everything in `interpreter.json`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
