Yes, you can use the task framework, which hasn't been released yet but will be soon. For more on the task framework, you can read this blog post: http://engineering.linkedin.com/distributed-systems/ad-hoc-task-management-apache-helix

You can submit a job with 1000 tasks using either Java or YAML. The YAML specification of this job would look something like:

name: MyWorkflow
jobs:
  - name: RunQueries
    command: RunQuery  # The command corresponding to Task callbacks
    jobConfigMap: {    # Arbitrary key-value pairs to pass to all tasks in this job
      k1: "v1",
      k2: "v2"
    }
    numConcurrentTasksPerInstance: 200  # Max parallelism per instance
    tasks:  # Schedule 1000 tasks, one per query
      - taskConfigMap: {  # Arbitrary key-value pairs to pass to this task
          query: "query1"
        }
      - taskConfigMap: {
          query: "query2"
        }
      - taskConfigMap: {
          query: "query3"
        }
      # Repeat for the remaining 997 tasks
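If you prefer Java, here is a minimal sketch of how the same job could be built and submitted. This assumes the task framework classes (TaskConfig, JobConfig, Workflow, TaskDriver); exact builder signatures may differ between Helix versions, and SubmitQueries is just a hypothetical wrapper class:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.helix.HelixManager;
import org.apache.helix.task.JobConfig;
import org.apache.helix.task.TaskConfig;
import org.apache.helix.task.TaskDriver;
import org.apache.helix.task.Workflow;

import com.google.common.collect.ImmutableMap;

public class SubmitQueries {
  public static void submit(HelixManager manager) {
    // One TaskConfig per query; its config map is handed to the Task callback.
    List<TaskConfig> taskConfigs = new ArrayList<TaskConfig>();
    for (int i = 1; i <= 1000; i++) {
      Map<String, String> taskConfigMap = ImmutableMap.of("query", "query" + i);
      taskConfigs.add(new TaskConfig("RunQuery", taskConfigMap));
    }

    JobConfig.Builder job = new JobConfig.Builder()
        .setCommand("RunQuery") // matched against the Task callbacks registered on each instance
        .setJobCommandConfigMap(ImmutableMap.of("k1", "v1", "k2", "v2"))
        .setNumConcurrentTasksPerInstance(200) // max parallelism per instance
        .addTaskConfigs(taskConfigs);

    Workflow workflow = new Workflow.Builder("MyWorkflow")
        .addJob("RunQueries", job)
        .build();

    // The TaskDriver submits the workflow to the cluster for execution.
    new TaskDriver(manager).start(workflow);
  }
}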
You can also see this class for an example of how to build jobs in Java: https://github.com/apache/helix/blob/master/helix-core/src/test/java/org/apache/helix/integration/task/TestIndependentTaskRebalancer.java

Then you just need to implement a Task callback and register it on each of the instances, and Helix will take care of assignment and retries.
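For reference, a rough sketch of what that callback could look like. Task, TaskResult, TaskFactory, TaskCallbackContext, and TaskStateModelFactory are task framework classes; QueryTask and executeQuery are hypothetical names, and returning TaskResult.Status.ERROR signals a retryable failure:

import org.apache.helix.task.Task;
import org.apache.helix.task.TaskCallbackContext;
import org.apache.helix.task.TaskResult;

// Hypothetical Task implementation that runs the query named in its taskConfigMap.
class QueryTask implements Task {
  private final String query;

  QueryTask(TaskCallbackContext context) {
    this.query = context.getTaskConfig().getConfigMap().get("query");
  }

  @Override
  public TaskResult run() {
    try {
      executeQuery(query); // your actual query execution logic
      return new TaskResult(TaskResult.Status.COMPLETED, null);
    } catch (Exception e) {
      // ERROR lets Helix retry the task, possibly on another instance.
      return new TaskResult(TaskResult.Status.ERROR, e.getMessage());
    }
  }

  @Override
  public void cancel() {
    // Interrupt the running query here if possible.
  }

  private void executeQuery(String query) throws Exception {
    // placeholder for the real work
  }
}

Registration would then happen on each instance before connecting the participant's HelixManager (here named manager), mapping the job's command to a TaskFactory:

  Map<String, TaskFactory> factories = new HashMap<String, TaskFactory>();
  factories.put("RunQuery", new TaskFactory() { // key must match the job's command
    @Override
    public Task createNewTask(TaskCallbackContext context) {
      return new QueryTask(context);
    }
  });
  StateMachineEngine engine = manager.getStateMachineEngine();
  engine.registerStateModelFactory("Task", new TaskStateModelFactory(manager, factories));
  manager.connect();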
Date: Thu, 21 Aug 2014 09:07:11 -0700
Subject: Helix parallelism
From: [email protected]
To: [email protected]

Hi,

I just started looking at Helix's capability to execute tasks in parallel, spread evenly across the cluster's instances and resources. I have a requirement to execute a number of different queries in parallel. Can Helix help in this case? For example:

1. I have some 1000 different queries to be executed.
2. I have 5 nodes configured in the Helix cluster, each capable of executing a set of queries.
3. I need Helix to distribute these 1000 different queries evenly across the 5 nodes (200 per node), take care of re-executing any failed queries, and notify the controller when the job is done.

Can someone help me understand how Helix can solve this kind of issue?

Regards,
Maha