This is now building successfully. It is pretty slow on Apache Jenkins (11
mins), but it builds and passes all 2,700 or so tests.

I will look at getting this added back into the main build now that the
issues appear to have been resolved.

Rob
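
A minimal sketch of the fix described in the quoted message below, where each
query execution owns the client it created and shuts it down on close.  The
classes SketchClient, SketchQueryExecution and OwnedClientSketch are
hypothetical stand-ins written for illustration; they are not the Jena or
Apache HttpClient API.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an HTTP client that holds OS resources until closed. */
final class SketchClient implements AutoCloseable {
    /** Number of clients created but not yet closed. */
    static final AtomicInteger OPEN = new AtomicInteger();

    private final int timeoutMillis; // per-client setting, hence one client per query
    private boolean closed;

    SketchClient(int timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
        OPEN.incrementAndGet();
    }

    int timeoutMillis() {
        return timeoutMillis;
    }

    @Override
    public void close() {
        if (!closed) {          // closing twice must not double-count
            closed = true;
            OPEN.decrementAndGet();
        }
    }
}

/** Hypothetical stand-in for a query execution that creates its own client. */
final class SketchQueryExecution implements AutoCloseable {
    private final SketchClient client;

    SketchQueryExecution(int timeoutMillis) {
        this.client = new SketchClient(timeoutMillis);
    }

    String run() {
        return "results (timeout " + client.timeoutMillis() + " ms)";
    }

    /** Closing the execution also shuts down the client it owns. */
    @Override
    public void close() {
        client.close();
    }
}

final class OwnedClientSketch {
    /** Runs n queries, closing each one, and returns how many clients remain open. */
    static int runQueries(int n) {
        for (int i = 0; i < n; i++) {
            try (SketchQueryExecution qe = new SketchQueryExecution(1000)) {
                qe.run();
            }
        }
        return SketchClient.OPEN.get();
    }
}
```

With try-with-resources the client is released even if run() throws, which is
the property that matters when thousands of queries are made in a test run.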


On 8/9/13 10:31 AM, "Rob Vesse" <[email protected]> wrote:

>I just had an ah-ha moment after reading this and looking over your latest
>commits.
>
>The issue is that the code makes a lot of queries, and queries create their
>own HttpClient instances because otherwise they can't apply timeouts to
>remote requests, since timeouts are a global parameter setting on an HTTP
>client.  I am going to try having the query route (HttpQuery) pass the
>HttpClient it creates internally up to the QueryEngineHTTP and have that
>explicitly shut down the client when the user closes the query execution.
>
>QueryEngineHTTP already ensures that it closes the TypedInputStream when
>the query execution is closed.
>
>I will test this out and see if it resolves the issue on Jenkins.
>
>Rob
>
>
>On 8/9/13 6:29 AM, "Andy Seaborne" <[email protected]> wrote:
>
>>The code now in SVN is showing stability for heavy repeated use and for
>>the authentication test cases.  However, the stability test has always
>>failed non-deterministically so it's not proof it's all working.  I have
>>gone through and tracked down response handling and I hope I have
>>ensured streams are closed.  If you use HttpOp to get an
>>(Typed)InputStream, then the caller must close that stream, otherwise it
>>does run out of OS resources after a while (1000's of calls).
>>
>>      Andy
>>
>>On 09/08/13 09:24, Andy Seaborne wrote:
>>> The default SystemDefaultHttpClient has a per-route pool of 5 and a
>>> system maximum of 10.  We do have to be careful of this lock-up
>>> possibility. Using DefaultHttpClient directly and setting it up how we
>>> want is probably a better style.
>>>
>>> I must look more closely at jena-jdbc - how far through its tests does
>>> it get?  How many connections have been and gone?
>>>
>>> The HttpOp in the codebase, when it isn't passed a HttpClient, creates a
>>> new one each time and they don't share a pool.  The
>>> SystemDefaultHttpClient is used once so no chance of a lock-up.
>>>
>>> It's not what's happening in JENA-498 - there, a single-threaded tight
>>> loop is running for a non-deterministic number of times then causing an
>>> exception (seems to be different on different OSes).
>>>
>>> There is a chance that Fuseki is not closing its end properly, or
>>> rather early enough, but when I checked the code, it's all down to
>>> Jetty and that should be pretty well tested.  We run Fuseki for many
>>> months at a time.
>>>
>>>      Andy
>>>
>>> On 09/08/13 00:39, Rob Vesse wrote:
>>>> The following may be the culprit in JDBC's case:
>>>>
>>>> The PoolingClientConnectionManager will allocate connections based on
>>>> its configuration. If all connections for a given route have already
>>>> been leased, a request for a connection will block until a connection
>>>> is released back to the pool. One can ensure the connection manager
>>>> does not block indefinitely in the connection request operation by
>>>> setting 'http.conn-manager.timeout' to a positive value. If the
>>>> connection request cannot be serviced within the given time period
>>>> ConnectionPoolTimeoutException will be thrown.
>>>>
>>>>
>>>> So HttpClient will block indefinitely until a connection is
>>>> available.  We likely want to turn off that behaviour so that when we
>>>> hit this state things get a useful error rather than an infinite hang.
>>>>
>>>> Rob
>>>>
>>>>
>>>> On 8/8/13 4:11 PM, "Andy Seaborne" <[email protected]> wrote:
>>>>
>>>>> Maybe related to JENA-498 (many HttpOps overwhelming the system).
>>>>>
>>>>> But if HttpOp uses a shared HttpClient, I was getting lockups.  It
>>>>> does appear to be HTTP error handling (failing to close the input
>>>>> stream of the response when it's 4xx or 5xx - there may be a body
>>>>> still).
>>>>>
>>>>> The other part of a shared HttpClient is the authenticator.  I
>>>>> haven't checked that yet. I wonder if we need to make it so that only
>>>>> an HttpClient with an HttpAuthenticator already set is passed in.  The
>>>>> DatasetAccessorHttp could do that.  I haven't checked the other uses
>>>>> yet; I doubt it's as clear cut for SPARQL Query etc.
>>>>>
>>>>> With the old code, creating a new SystemDefaultHttpClient was not
>>>>> giving connection pooling and reuse; only a fast loop caused a problem
>>>>> (20k-40k iterations).
>>>>>
>>>>> But I don't know why it works on your internal system and not
>>>>> AFS/Jenkins.  Different versions of ARQ/HttpOp?
>>>>>
>>>>>     Andy
>>>>>
>>>>> On 08/08/13 23:44, Rob Vesse wrote:
>>>>>> Yes, the module that hangs is the driver for remote endpoints.  It
>>>>>> stands up a Fuseki server and communicates with it using HTTP, which
>>>>>> of course now all goes through HttpOp.
>>>>>>
>>>>>> Problem is that I never seem to get an actual exception; it just
>>>>>> hangs on the build server.
>>>>>>
>>>>>> This might also explain why DEBUG level logging makes the build
>>>>>> succeed, because HttpClient is very noisy at DEBUG level and all that
>>>>>> logging likely introduces the delays in the right parts of the code
>>>>>> to allow resources to be freed up.
>>>>>>
>>>>>> Rob
>>>>>>
>>>>>>
>>>>>> On 8/8/13 3:40 PM, "Andy Seaborne" <[email protected]> wrote:
>>>>>>
>>>>>>> On 08/08/13 19:42, Rob Vesse wrote:
>>>>>>>> So I am officially stumped.
>>>>>>>>
>>>>>>>> Even with the delay added the builds still hang, so I really don't
>>>>>>>> understand why the builds fail on Apache Jenkins.  Note that I've
>>>>>>>> been building the JDBC module on our internal Jenkins server for
>>>>>>>> some time and never had an issue there.  Plus the builds run fine
>>>>>>>> on a local machine.
>>>>>>>>
>>>>>>>> If anyone else can take a look or has any suggestions please jump
>>>>>>>> in.
>>>>>>>
>>>>>>> <straw-grasping mode>
>>>>>>>
>>>>>>> Are you using HttpOp?  Apache HttpClient?
>>>>>>>
>>>>>>> I'm fairly certain HttpOp can cause resource starvation by improper
>>>>>>> use of HttpClient.  However, I haven't managed to find out where
>>>>>>> for certain [HTTP Exceptions are my current best guess]. (I can
>>>>>>> perturb the situation by tweaking pooling numbers.)
>>>>>>>
>>>>>>>     Andy
>>>>>>>
>>>>>>>>
>>>>>>>> Rob
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8/8/13 11:12 AM, "Rob Vesse" <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Ok, so turning the log level back down causes the build to go
>>>>>>>>> back to failing.
>>>>>>>>>
>>>>>>>>> This starts to look like some kind of timing issue manifesting on
>>>>>>>>> the build server causing the tests to get into a hung state.
>>>>>>>>> Apparently having the high log level adds sufficient delay into
>>>>>>>>> the process to avoid this.
>>>>>>>>>
>>>>>>>>> My next idea is to simply insert a delay between the tests in
>>>>>>>>> question and see if that solves things.
>>>>>>>>>
>>>>>>>>> Rob
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 8/8/13 10:55 AM, "Rob Vesse" <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Ok, that is very, very weird: after turning up the logging for
>>>>>>>>>> that module the build ran through to success (and generated a
>>>>>>>>>> ridiculously large log file at the same time).
>>>>>>>>>>
>>>>>>>>>> Next step is to try turning down the log level and see if the
>>>>>>>>>> build still succeeds.
>>>>>>>>>>
>>>>>>>>>> Rob
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 8/8/13 10:35 AM, "Rob Vesse" <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> The problem is that nothing is blowing up; the build just gets
>>>>>>>>>>> stuck and hangs until the build timeout plugin steps in and
>>>>>>>>>>> aborts the build.
>>>>>>>>>>>
>>>>>>>>>>> The hang is in the tests for the remote endpoint driver, which
>>>>>>>>>>> are standing up Fuseki instances.  However, if there was some
>>>>>>>>>>> contention for ports in the tests I would expect the tests to
>>>>>>>>>>> just plain fail.
>>>>>>>>>>>
>>>>>>>>>>> I suspect there may be some deadlock of some sort happening
>>>>>>>>>>> when running the tests on the server but it's hard to tell
>>>>>>>>>>> where/what the deadlock is.  I am turning the log level for
>>>>>>>>>>> the tests in question to DEBUG and will re-run a build to see
>>>>>>>>>>> if that yields anything more useful.
>>>>>>>>>>>
>>>>>>>>>>> Rob
>>>>>>>>>>>
>>>>>>>>>>> On 8/8/13 6:53 AM, "Andy Seaborne" <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On 01/08/13 20:56, Rob Vesse wrote:
>>>>>>>>>>>>> I've removed it from the main build for now.  For some reason
>>>>>>>>>>>>> it is getting stuck (but not crashing) on the Apache build
>>>>>>>>>>>>> server.  This is despite it building fine locally and on our
>>>>>>>>>>>>> internal build servers.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Not sure how to proceed on this - is it worth setting up a
>>>>>>>>>>>>> separate build for JDBC on the Apache build servers to help
>>>>>>>>>>>>> try and isolate the problem?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Rob
>>>>>>>>>>>>
>>>>>>>>>>>> What exactly is blowing up?
>>>>>>>>>>>>
>>>>>>>>>>>> The Apache build servers have all sorts of things on them and
>>>>>>>>>>>> a wide range of plugins, which itself can be a problem.
>>>>>>>>>>>>
>>>>>>>>>>>>     Andy
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 8/1/13 11:45 AM, "Rob Vesse" <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've moved Jena JDBC from Experimental into Trunk and added
>>>>>>>>>>>>>> it to the main build.  The builds are a little noisier than
>>>>>>>>>>>>>> some of the other modules so may want some tweaking to avoid
>>>>>>>>>>>>>> spurious build output.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I haven't attempted to figure out how to add it to the
>>>>>>>>>>>>>> distro because I know nothing about the Maven Assembly plugin.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Rob
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
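
The pool behaviour discussed in this thread (a fixed number of connections per
route, with lease requests blocking until one is released) can be sketched
with a plain semaphore.  This is an illustration under stated assumptions, not
how HttpClient is implemented: BoundedPoolSketch and LeaseTimeoutException are
hypothetical names, and a pool size of 5 simply mirrors the per-route default
mentioned above.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical bounded pool: each permit stands for one leasable connection. */
final class BoundedPoolSketch {

    /** Hypothetical exception raised when no connection frees up in time. */
    static final class LeaseTimeoutException extends Exception {
        LeaseTimeoutException(String message) {
            super(message);
        }
    }

    private final Semaphore permits;

    BoundedPoolSketch(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    /** Leases a connection, waiting at most timeoutMillis instead of forever. */
    void lease(long timeoutMillis) throws LeaseTimeoutException, InterruptedException {
        if (!permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new LeaseTimeoutException(
                "no connection available within " + timeoutMillis + " ms");
        }
    }

    /** Returns a connection to the pool; a caller that forgets this leaks a permit. */
    void release() {
        permits.release();
    }

    /**
     * Leases connections without ever releasing them, as leaky response
     * handling would.  Returns the zero-based index of the first lease that
     * timed out, or -1 if every lease succeeded.
     */
    static int firstFailedLease(int poolSize, int attempts, long timeoutMillis) {
        BoundedPoolSketch pool = new BoundedPoolSketch(poolSize);
        for (int i = 0; i < attempts; i++) {
            try {
                pool.lease(timeoutMillis);
            } catch (LeaseTimeoutException e) {
                return i;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a lease", e);
            }
        }
        return -1;
    }
}
```

Without the timeout the sixth lease would wait indefinitely, which matches the
silent hang seen on the build server; with it, the caller gets an error that
points at the exhausted pool.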
