Hey everyone!

In the past, I have performed long-running tasks, such as importing graph
data from OSM and other sources, by running Java code while the Neo4j
server was offline. However, that approach is very inefficient when the
use case involves running these tasks frequently (such as updating spatial
data from the freshest OSM source on a weekly basis).

I intend to achieve the following flow:

Send parameters to a REST API Endpoint -> Queue a job for background
processing -> Track and report progress

I'd like to have a generic worker for long-running tasks that takes
parameters and performs a specific job. So far, I have not found any
ready-made resources for this. I'd be happy to know if any are available;
that could save me a lot of work.

Broadly, I have an unmanaged extension that:

   1. Exposes a REST API endpoint to accept parameters such as a URL to an
   OSM source file (.pbf, .zip, or .xml)
   2. Pushes the URL, plus some meta information for logging and
   monitoring progress, to a message queue such as Amazon SQS (or RabbitMQ,
   another AMQP broker, or anything else that can be implemented)
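To make the enqueue step concrete, here is a rough sketch of what I have in
mind, not working extension code. ImportJob and JobSubmitter are
hypothetical names, and an in-process BlockingQueue stands in for SQS or
RabbitMQ; the real endpoint would call submit() from the unmanaged
extension's resource method.

```java
import java.net.URI;
import java.time.Instant;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.BlockingQueue;

/** Hypothetical message describing one queued import job. */
final class ImportJob {
    final String id;
    final URI source;
    final Instant queuedAt; // meta information for logging and monitoring

    ImportJob(String id, URI source, Instant queuedAt) {
        this.id = id;
        this.source = source;
        this.queuedAt = queuedAt;
    }
}

/** Hypothetical helper that validates the URL and pushes a job to the queue. */
final class JobSubmitter {
    private static final List<String> EXTENSIONS = List.of(".pbf", ".zip", ".xml");

    // Assumption: an in-process queue stands in for the real message broker.
    private final BlockingQueue<ImportJob> queue;

    JobSubmitter(BlockingQueue<ImportJob> queue) {
        this.queue = queue;
    }

    /** Returns the new job id, or throws IllegalArgumentException for a bad URL. */
    String submit(String sourceUrl) {
        URI source = URI.create(sourceUrl); // rejects malformed URLs
        String path = source.getPath();
        if (path == null || EXTENSIONS.stream().noneMatch(path::endsWith)) {
            throw new IllegalArgumentException("Unsupported OSM source: " + sourceUrl);
        }
        ImportJob job = new ImportJob(UUID.randomUUID().toString(), source, Instant.now());
        queue.add(job);
        return job.id;
    }
}
```

The endpoint would return the job id to the caller, so that it can be used
later to look up progress.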

I am looking for ways to implement a background worker that:

   1. Wakes up periodically (maybe with geometric or Fibonacci backoff that
   resets with every message that is processed and/or upon reaching a
   threshold) to check for any messages in the queue.
   2. When it finds a new message, invokes the relevant factory class to
   perform a specific action (e.g. import OSM data), and logs progress
   checkpoints for the running task on a temporary sub-graph in Neo4j. I
   could even implement pub-sub to report real-time progress, but that is
   not a priority at the moment.
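The polling behaviour described above could look roughly like this. It is
only a sketch under my own assumptions: FibonacciBackoff and QueueWorker are
hypothetical names, messages are plain strings, the queue is an in-process
BlockingQueue rather than a real broker, and the delay is capped at a
maximum instead of being reset when the threshold is reached.

```java
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Fibonacci delays (base, base, 2*base, 3*base, 5*base, ...) capped at maxMillis. */
final class FibonacciBackoff {
    private final long baseMillis;
    private final long maxMillis;
    private long previous;
    private long current;

    FibonacciBackoff(long baseMillis, long maxMillis) {
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
        reset();
    }

    /** Returns the delay to wait now and advances the sequence up to the cap. */
    long nextDelayMillis() {
        long delay = Math.min(current, maxMillis);
        if (current < maxMillis) {
            long next = previous + current;
            previous = current;
            current = next;
        }
        return delay;
    }

    /** Called after every processed message so polling becomes frequent again. */
    void reset() {
        previous = 0;
        current = baseMillis;
    }
}

/** Polls the queue, sleeps with backoff when it is empty, runs the handler otherwise. */
final class QueueWorker implements Runnable {
    private final BlockingQueue<String> queue;
    private final Consumer<String> handler; // e.g. the factory that starts the OSM import
    private final FibonacciBackoff backoff;

    QueueWorker(BlockingQueue<String> queue, Consumer<String> handler, FibonacciBackoff backoff) {
        this.queue = queue;
        this.handler = handler;
        this.backoff = backoff;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                String message = queue.poll(); // non-blocking check
                if (message == null) {
                    Thread.sleep(backoff.nextDelayMillis());
                    continue;
                }
                handler.accept(message);
                backoff.reset();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop cleanly on shutdown
            } catch (RuntimeException e) {
                System.err.println("Job failed: " + e); // keep the worker alive
            }
        }
    }
}
```

The worker could be started on a daemon thread when the extension is
initialized and interrupted on shutdown. Whether such a thread is safe to
run inside the server process is exactly the part I am unsure about.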

Finally, I'd expose another REST API endpoint to retrieve the statuses of
currently running jobs. That would let me build a separate, lean front-end
client to display and manage jobs, with actions such as cancellation and
setting priority handled by further REST API endpoints that modify or
delete queued jobs. The front-end client could also take parameters (such
as an OSM file URL) to queue a job with Neo4j.
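For the status and cancellation endpoints, I imagine something like the
following bookkeeping behind them. Again, this is a sketch: JobRegistry and
JobState are hypothetical names, and the in-memory map is a stand-in for the
temporary sub-graph in Neo4j. It also assumes that only jobs that are still
queued can be cancelled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

enum JobState { QUEUED, RUNNING, DONE, FAILED, CANCELLED }

/** Hypothetical in-memory job tracker shared by the worker and the REST endpoints. */
final class JobRegistry {
    private final Map<String, JobState> states = new ConcurrentHashMap<>();

    /** Called by the submit endpoint once the job is on the queue. */
    void queued(String jobId) {
        states.put(jobId, JobState.QUEUED);
    }

    /** Called by the worker; returns false if the job was cancelled in the meantime. */
    boolean start(String jobId) {
        return states.replace(jobId, JobState.QUEUED, JobState.RUNNING);
    }

    /** Called by the worker when the job ends. */
    void finish(String jobId, boolean success) {
        states.replace(jobId, JobState.RUNNING, success ? JobState.DONE : JobState.FAILED);
    }

    /** Called by the cancel endpoint; only jobs that are still queued can be cancelled. */
    boolean cancel(String jobId) {
        return states.replace(jobId, JobState.QUEUED, JobState.CANCELLED);
    }

    /** Called by the status endpoint; returns a copy that is safe to serialize. */
    Map<String, JobState> snapshot() {
        return new HashMap<>(states);
    }
}
```

The atomic replace calls mean a cancellation and the worker picking up the
same job cannot both succeed, so a cancelled job is simply skipped when its
message is eventually read from the queue.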

As mentioned above, implementing a background worker to process queued jobs
is currently a blind spot for me.

I'd be very grateful to know of a better way to run long-running tasks
through REST API extensions without taking the server offline.

--
Cheers,
Nikhil
