Hi Neeraj,

1. You can run as many client nodes as you need to mitigate the SPOF. By the way, it would be better to use a WatchService [1] to detect file system changes, if possible.
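The JDK WatchService mentioned in [1] can be sketched roughly as below. This is only an illustrative helper, not code from the thread: the method name `pollNewFiles`, the timeout handling, and the single-poll structure are my own choices.

```java
import java.nio.file.*;
import java.util.*;
import java.util.concurrent.TimeUnit;

public class DirWatcher {
    // Registers a watcher on `dir` and waits up to `timeoutSeconds` for the
    // next batch of newly created files, returning their paths.
    public static List<Path> pollNewFiles(Path dir, long timeoutSeconds) throws Exception {
        List<Path> created = new ArrayList<>();
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS); // blocks until an event or timeout
            if (key != null) {
                for (WatchEvent<?> evt : key.pollEvents()) {
                    // Ignore OVERFLOW events; only ENTRY_CREATE carries a Path context.
                    if (evt.kind() == StandardWatchEventKinds.ENTRY_CREATE)
                        created.add(dir.resolve((Path) evt.context()));
                }
                key.reset(); // re-arm the key so further events are delivered
            }
        }
        return created;
    }
}
```

In the real pipeline this poll would sit in a loop, and each detected file would be parsed into compute jobs.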
Why do you want only one client at a time to send jobs?

2. All 5 listeners will get these events, so it is better to use some kind of event filter. Better still, use an affinity call from the client [2].

[1] https://docs.oracle.com/javase/tutorial/essential/io/notification.html
[2] https://apacheignite.readme.io/docs/collocate-compute-and-data

On Fri, Feb 24, 2017 at 3:08 AM, Neeraj Vaidya <[email protected]> wrote:

> Thanks Andrey,
>
> 1) If I understand correctly, the failover feature of the Compute Grid is
> there to mitigate the SPOF for the compute jobs, i.e. the
> Callables/Runnables/Closures which are distributed to multiple nodes. But
> my goal was also to mitigate the failure of the client node which is
> responsible for reading external files and creating the compute job
> collection. Can the FailoverSpi handle that as well? My pseudo-code for
> the client node is as follows:
> - Check for files in the filesystem.
> - If present, then for each line in the file, create a compute job.
>   (If I understand correctly, this is the piece of code which falls under
>   the scope of the FailoverSpi.)
> - Finally, loop back to wait for any more files.
>
> 2) Coming to my second question: let's say I cache the CDR file
> records/entries into a certain cache, e.g. "CDRFileCache". I then run 5
> nodes, each with a listener waiting for new entries to be added to this
> cache.
> - If I stream 3 entries into this cache, one after another, will all
> listeners process all 3 entries? I.e. will entries 1, 2 and 3 be processed
> by listeners 1, 2, 3, 4 and 5?
> - Or is it that if listener 1 is processing entry 1, then entry 1 will not
> be processed by any other listener, because listener 1 has already started
> processing it?
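The affinity call suggested in [2] could look roughly like the sketch below. The cache name "CDRFileCache" is taken from the thread, but everything else is an assumption: `readCdrLines`, `extractKey`, and `process` are hypothetical placeholders, and the sketch presumes an already-running Ignite cluster with that cache deployed, so it is not a tested, runnable program.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;

public class AffinityDispatch {
    public static void main(String[] args) {
        Ignition.setClientMode(true);                 // join the cluster as a client node
        try (Ignite ignite = Ignition.start()) {
            for (String line : readCdrLines()) {
                String key = extractKey(line);        // affinity key for this record
                // The closure runs on the server node that owns `key` in
                // "CDRFileCache", so exactly one node processes each record.
                ignite.compute().affinityRun("CDRFileCache", key, () -> process(line));
            }
        }
    }

    // Hypothetical placeholders, not part of any Ignite API:
    static Iterable<String> readCdrLines() { return java.util.Collections.emptyList(); }
    static String extractKey(String line) { return line; }
    static void process(String line) { /* record-level computation */ }
}
```

Because the job is routed to the primary node for the key, this sidesteps the "which of the 5 listeners handles the entry" question entirely: data and computation are collocated.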
>
> Regards,
> Neeraj
>
> --------------------------------------------
> On Fri, 24/2/17, Andrey Mashenkov <[email protected]> wrote:
>
> Subject: Re: Real-time computing
> To: [email protected], "Neeraj Vaidya" <[email protected]>
> Date: Friday, 24 February, 2017, 2:20 AM
>
> Hi Neeraj,
>
> 1. Why do you want to use Zookeeper to mitigate an SPOF instead of the
> Ignite Compute Grid failover features?
>
> 2. If you need to reuse the data, then caching makes sense. For processing
> new entries you can use Events or Continuous Queries. You are free to
> choose the number of nodes for your grid, and which nodes will hold data
> and which will be used for computations.
>
> I'm not sure I understand the last question. Would you please detail the
> last use case?
>
> On Thu, Feb 23, 2017 at 3:23 AM, Neeraj Vaidya <[email protected]> wrote:
> > Hi,
> >
> > I have a use case where I need to perform computation on records in
> > files (specifically, files containing telecom CDRs).
> >
> > To this end, I have a few questions:
> >
> > 1) Should I have just one client node which reads these records and
> > creates Callable compute jobs for each record? With just 1 client node,
> > I suppose this will be a single point of failure. I could use Zookeeper
> > to manage a cluster of such nodes, thus possibly mitigating an SPOF.
> >
> > 2) Or should I stream/load these records using a client into a cache,
> > and then have other cluster nodes read this cache for new entries and
> > let them perform the computation? In this case, is there a way by which
> > I can have only one node get hold of computing each record?
> >
> > Regards,
> > Neeraj
>
> --
> Best regards,
> Andrey V. Mashenkov

--
Best regards,
Andrey V. Mashenkov
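Neeraj's first option, one Callable compute job per CDR record, can be sketched locally as below. A plain ExecutorService stands in for Ignite's compute grid here, and the field-count computation is just a placeholder for the real per-record logic; none of this comes from the thread itself.

```java
import java.util.*;
import java.util.concurrent.*;

public class CdrJobs {
    // Builds one job per CDR line; each job here just counts the
    // comma-separated fields, as a stand-in for real CDR processing.
    public static List<Callable<Integer>> jobsFor(List<String> lines) {
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (String line : lines)
            jobs.add(() -> line.split(",").length);
        return jobs;
    }

    // Runs all jobs on a local thread pool (an IgniteCompute instance
    // would take this role in the distributed setup) and collects the
    // results in submission order.
    public static List<Integer> runAll(List<Callable<Integer>> jobs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : pool.invokeAll(jobs))
                results.add(f.get());
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

The failover concern in the thread is exactly what this local version cannot address: if the single job-creating process dies, nothing resubmits the work, which is why the discussion turns to multiple clients and Ignite's failover features.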
