This note applies to applications running in Query Console *outside* of the
cluster, using the ELB as the entry point and running long-lived requests.
This is actually true regardless of AWS or ELBs. In nearly any networking
environment, especially if there are routers involved, a long-lived idle
connection can be aborted ...

But your description doesn’t mention running QC or how you connect to your app.
Could you provide details on where your app is running, in what language, on
what system, and whether you are connecting through the ELB or not?
Are you using the REST Client or XCC? Or is this a locally running task job?

The general solution to this kind of problem - AWS independent - is to not run
queries (over TCP of any sort) that are "idle" for long periods.
No matter what you set the request timeout to be, there are many cases where
the networking infrastructure will just kill your connection after a while.
For example, on Windows it's about 3 minutes max, then you're dead. This
happens even when you set the keep-alive flags on the socket layer!
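
For reference, this is roughly what "setting the keep-alive flags on the socket layer" looks like from a client. It is a minimal sketch that assumes a Python client (the thread does not say what the client is written in), and the helper name make_keepalive_socket is made up for illustration. As noted above, an intermediate device may still drop an idle connection even with this flag set.

```python
import socket

def make_keepalive_socket():
    """Create a TCP socket with the SO_KEEPALIVE option turned on."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the OS to send periodic keep-alive probes on an idle connection.
    # The probe interval is an OS-level setting and is not changed here.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock
```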

So the trick is that you need to run shorter-running queries *between the time
of the request and response* ...
If your query is going to take over, say, 1 minute, then run it as a spawn()
and return to your app quickly.
This will free up the connection and keep networking IT from killing your
"idle" connection.
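
Seen from the client side, the "spawn and return quickly" idea amounts to a submit-then-poll pattern. The sketch below is only an illustration in Python: the thread pool is an in-process stand-in for the server, and submit, status, wait_for and long_running_work are hypothetical names, not MarkLogic APIs. In a real setup, submit and status would each be a separate short request.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-process stand-in for the server. A real deployment would
# expose "submit" and "status" as two separate short requests.
_pool = ThreadPoolExecutor(max_workers=4)
_jobs = {}

def long_running_work(n):
    """Placeholder for the slow query; sleeps briefly to simulate work."""
    time.sleep(0.2)
    return n * 2

def submit(n):
    """Start the work in the background and return a job id immediately."""
    job_id = str(uuid.uuid4())
    _jobs[job_id] = _pool.submit(long_running_work, n)
    return job_id

def status(job_id):
    """Short request: report whether the job has finished, and its result."""
    future = _jobs[job_id]
    if not future.done():
        return {"done": False}
    return {"done": True, "result": future.result()}

def wait_for(job_id, poll_interval=0.05, timeout=5.0):
    """Poll with many short requests instead of holding one idle connection."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        reply = status(job_id)
        if reply["done"]:
            return reply["result"]
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

No single request stays open while the work runs, so there is no long idle connection for the network to drop.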

You can also run smaller batches. If your batches are taking more than a few
seconds each, then running more, smaller batches will add little overhead
but will return to the client more frequently, keeping the connection alive.
Smaller batches are also more reliable and recoverable because if you do run
into an error, there are fewer outstanding requests to investigate or retry.
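
A minimal sketch of the smaller-batches idea, again assuming a Python client. The names batches, run_in_batches and process_batch are hypothetical; process_batch stands in for whatever single short request your client sends to the server.

```python
def batches(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

def run_in_batches(items, size, process_batch):
    """Send each batch as its own short request and collect failed batches.

    `process_batch` is a hypothetical callable standing in for one request
    to the server. It is expected to raise an exception on failure.
    """
    failed = []
    for batch in batches(items, size):
        try:
            process_batch(batch)
        except Exception:
            # Only this batch needs investigating or retrying.
            failed.append(batch)
    return failed
```

The returned list contains just the batches that failed, so a retry only has to resend those.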

-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
[email protected]
Phone: +1 812-482-5224
Cell:  +1 812-630-7622
www.marklogic.com

From: [email protected] 
[mailto:[email protected]] On Behalf Of Prasanth N V R
Sent: Thursday, October 09, 2014 2:24 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] SVC-CANCELLED - EC2 - S3 Object access

Hi,
We have created an application, deployed in EC2 ML, which reads data from AWS
S3, processes it, and loads it into the ML database.

We stored 5 million documents in a bucket.
An HTTP POST request hits an XQuery, which reads a document from S3, applies
some logic, and stores the result in the ML db.
This process is executed every 2 minutes to start processing 200 documents
one by one. So this creates 200 HTTP POST calls every 2 minutes.

An XDMP-CANCELLED or SVC-CANCELLED error is thrown in the middle of the process
and stops a few POST calls, which in turn stops processing of all the documents
from S3.

We read in the forum article below that the load balancer configured in EC2
will cancel some requests.
https://help.marklogic.com/Knowledgebase/Article/View/151/0/running-query-console-queries-on-amazon-web-services

Are there any options to tune the EC2 ML to avoid getting this error?

Thanks,
Prasanth