Hey everyone!

In the past, I have performed long-running tasks, such as importing graph
data from OSM and other sources, by running Java code while the Neo4j server
was taken offline. However, this is very inefficient when the use case
requires running these tasks frequently (for example, updating spatial data
from the freshest OSM source on a weekly basis).

I intend to achieve the following flow:

Send parameters to a REST API Endpoint -> Queue a job for background
processing -> Track and report progress

I'd like to have a generic worker for long-running tasks that takes
parameters and performs a specific job. So far, I have not found any
ready-made resources for this. If any exist, I'd be happy to hear about
them; that could save me a lot of work.
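
To make the "generic worker" idea concrete, here is a rough sketch of how a
job type could be mapped to a handler factory. All names (JobHandler,
JobDispatcher, the "osm-import" job type) are hypothetical and only meant for
illustration; nothing here is part of Neo4j's API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical contract: one long-running job, driven only by parameters. */
interface JobHandler {
    /** Runs the job and returns a short, human-readable result summary. */
    String run(Map<String, String> params);
}

/** Maps a job type (e.g. "osm-import") to a factory that creates its handler. */
class JobDispatcher {
    private final Map<String, Supplier<JobHandler>> factories = new ConcurrentHashMap<>();

    void register(String jobType, Supplier<JobHandler> factory) {
        factories.put(jobType, factory);
    }

    /** Creates a fresh handler for the given type and runs it with the parameters. */
    String dispatch(String jobType, Map<String, String> params) {
        Supplier<JobHandler> factory = factories.get(jobType);
        if (factory == null) {
            throw new IllegalArgumentException("No handler registered for job type: " + jobType);
        }
        return factory.get().run(params);
    }
}
```

A new kind of task would then only need its own JobHandler and one
register() call; the worker itself would stay unchanged.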

Broadly, I have an unmanaged extension that:

   1. Exposes a REST API endpoint to accept parameters such as a URL to an
   OSM source file (.pbf, .zip, or .xml)
   2. Pushes the URL, plus some metadata for logging and monitoring
   progress, to a message queue such as Amazon SQS or RabbitMQ (or any other
   AMQP broker or queue that can be implemented)

I am looking for ways to implement a background worker that:

   1. Wakes up periodically (maybe with geometric or Fibonacci backoff that
   resets whenever a message is processed and/or a threshold is reached) to
   check for messages in the queue.
   2. On finding a new message, invokes the relevant factory class to
   perform the corresponding action (e.g. import OSM data), and logs progress
   checkpoints to a temporary sub-graph in Neo4j. I could even implement
   pub-sub to report real-time progress, but that is not a priority at the
   moment.
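
The polling behaviour described above could look roughly like the sketch
below. It makes several assumptions: an in-memory BlockingQueue stands in for
SQS/RabbitMQ, messages are plain strings, and the delay is capped at a
maximum instead of being reset when the threshold is reached. The class names
(FibonacciBackoff, PollingWorker) are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Fibonacci backoff: 1, 1, 2, 3, 5, ... time units, capped at a maximum. */
class FibonacciBackoff {
    private final long unitMillis;
    private final long maxMillis;
    private long previous = 0;
    private long current = 1;

    FibonacciBackoff(long unitMillis, long maxMillis) {
        this.unitMillis = unitMillis;
        this.maxMillis = maxMillis;
    }

    /** Returns the next delay in milliseconds and advances the sequence. */
    long nextDelayMillis() {
        long delay = Math.min(current * unitMillis, maxMillis);
        if (delay < maxMillis) { // stop growing once the cap is reached
            long next = previous + current;
            previous = current;
            current = next;
        }
        return delay;
    }

    void reset() {
        previous = 0;
        current = 1;
    }
}

/** Polls the queue; backs off while it is empty, resets after each message. */
class PollingWorker implements Runnable {
    private final BlockingQueue<String> queue;
    private final Consumer<String> handler;
    private final FibonacciBackoff backoff;
    private volatile boolean running = true;

    PollingWorker(BlockingQueue<String> queue, Consumer<String> handler, FibonacciBackoff backoff) {
        this.queue = queue;
        this.handler = handler;
        this.backoff = backoff;
    }

    void stop() {
        running = false;
    }

    @Override
    public void run() {
        while (running && !Thread.currentThread().isInterrupted()) {
            String message = queue.poll(); // non-blocking check for work
            if (message != null) {
                try {
                    handler.accept(message);
                } catch (RuntimeException e) {
                    // A failed job should not kill the worker; real code would log this.
                }
                backoff.reset();
            } else {
                try {
                    Thread.sleep(backoff.nextDelayMillis());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // exit the loop on interrupt
                }
            }
        }
    }
}
```

With a broker that supports long polling or push delivery, the explicit
backoff might not be needed at all; the sketch only mirrors the wake-up
scheme as described.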

Finally, I'd expose another REST API endpoint to retrieve the statuses of
currently running jobs. That would let me build a separate, lean front-end
client to display and manage jobs, with actions such as cancellation and
setting priority exposed via REST API endpoints that modify or delete queued
jobs. The front-end client could also take parameters (such as an OSM file
URL) to queue a job with Neo4j.
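
As a rough sketch of the bookkeeping behind such status and cancellation
endpoints, the registry below keeps job statuses in memory. This is an
assumption made for brevity: in the setup described here the statuses would
live in a temporary sub-graph in Neo4j, and the REST layer is left out
entirely. JobState, JobStatus and JobRegistry are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

enum JobState { QUEUED, RUNNING, COMPLETED, FAILED, CANCELLED }

/** Immutable snapshot of one job's status. */
final class JobStatus {
    final String id;
    final String sourceUrl;
    final JobState state;
    final int percentDone;

    JobStatus(String id, String sourceUrl, JobState state, int percentDone) {
        this.id = id;
        this.sourceUrl = sourceUrl;
        this.state = state;
        this.percentDone = percentDone;
    }

    boolean isTerminal() {
        return state == JobState.COMPLETED || state == JobState.FAILED || state == JobState.CANCELLED;
    }

    JobStatus with(JobState newState, int newPercentDone) {
        return new JobStatus(id, sourceUrl, newState, newPercentDone);
    }
}

/** Thread-safe, in-memory stand-in for the job status store. */
class JobRegistry {
    private final Map<String, JobStatus> jobs = new ConcurrentHashMap<>();

    /** Registers a new job in the QUEUED state and returns its id. */
    String submit(String sourceUrl) {
        String id = UUID.randomUUID().toString();
        jobs.put(id, new JobStatus(id, sourceUrl, JobState.QUEUED, 0));
        return id;
    }

    /** Records a progress checkpoint; finished or cancelled jobs are left untouched. */
    void report(String id, JobState state, int percentDone) {
        jobs.computeIfPresent(id, (key, old) -> old.isTerminal() ? old : old.with(state, percentDone));
    }

    /** Marks a queued or running job as cancelled; returns false if nothing changed. */
    boolean cancel(String id) {
        boolean[] changed = {false};
        jobs.computeIfPresent(id, (key, old) -> {
            if (old.isTerminal()) {
                return old;
            }
            changed[0] = true;
            return old.with(JobState.CANCELLED, old.percentDone);
        });
        return changed[0];
    }

    /** Workers can poll this between checkpoints to stop cooperatively. */
    boolean isCancelled(String id) {
        JobStatus status = jobs.get(id);
        return status != null && status.state == JobState.CANCELLED;
    }

    JobStatus get(String id) {
        return jobs.get(id);
    }

    /** Read-only copy of all statuses, e.g. for a "list jobs" endpoint. */
    Map<String, JobStatus> snapshot() {
        return Collections.unmodifiableMap(new HashMap<>(jobs));
    }
}
```

Cancellation here is cooperative: the registry only flips the state, and a
running import would have to check isCancelled() at its checkpoints and stop
on its own.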

As mentioned above, implementing a background worker to process queued jobs
is currently a blind spot for me.

I'd be very happy and grateful to learn of a better way to run long-running
tasks via a REST API without taking the server offline.

