HDHT is for *embedded* state management, fully encapsulated by the
operator. It cannot be used like a central database.

Thanks,
Thomas


On Thu, Sep 1, 2016 at 9:49 AM, Tushar Gosavi <[email protected]>
wrote:

> Hi Jim,
>
> Currently HDHT is accessible only to a single operator in a DAG. A
> single HDHT store cannot be managed by two different operators at a
> time, as that could cause metadata corruption. Theoretically an HDHT
> bucket could be read from multiple operators, but only one writer is
> allowed.
>
> In your case each stage of a transaction is processed completely by a
> different operator, and only then can the next stage start. This could
> still be achieved by using a single operator which manages the HDHT
> state, with a loop in the DAG to send completed transaction ids back
> to the sequencer.
>
> - The sequence operator will emit transactions to the transaction
> processing operator.
> - If it receives an out-of-order transaction, it will note it down in HDHT.
> - The processing operator will send the completed transaction id on a
> port which is connected back to the sequence operator.
> - On receiving data on this loopback port, the sequence operator will
> update HDHT, search for the next transaction in order (which may be
> stored in HDHT), and emit it to the next processing operator.
>
> - Tushar.
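A minimal in-memory sketch of the loopback design described above. This is
not an actual Apex operator: a HashMap stands in for the HDHT bucket, the
two methods stand in for the input port and the loopback port, and all
names (`Sequencer`, `onTransaction`, `onCompleted`) are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// In a real operator the maps below would be HDHT state and the two
// methods would be process() callbacks on input ports.
class Sequencer {
    private final Map<String, Integer> nextStep = new HashMap<>();            // order id -> next expected step
    private final Map<String, Map<Integer, String>> parked = new HashMap<>(); // out-of-order txns noted down
    private final Set<String> inFlight = new HashSet<>();                     // orders with a step being processed

    /** A transaction arrives: emit it downstream if it is in order, else park it. */
    public String onTransaction(String orderId, int step, String payload) {
        int expected = nextStep.getOrDefault(orderId, 1);
        if (step == expected && !inFlight.contains(orderId)) {
            inFlight.add(orderId);
            return payload;                          // emit to the processing operator
        }
        parked.computeIfAbsent(orderId, k -> new HashMap<>()).put(step, payload);
        return null;                                 // out of order: note it down
    }

    /** Loopback port: a processing operator reports that a step completed. */
    public String onCompleted(String orderId, int step) {
        inFlight.remove(orderId);
        nextStep.put(orderId, step + 1);
        Map<Integer, String> waiting = parked.get(orderId);
        String next = (waiting == null) ? null : waiting.remove(step + 1);
        if (next != null) {
            inFlight.add(orderId);                   // release the next parked transaction
        }
        return next;                                 // null if the next step has not arrived yet
    }
}
```

The key property is that a step is emitted only after the previous step's
completion comes back on the loopback, never merely because it arrived.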
>
>
> On Sat, Aug 27, 2016 at 1:31 AM, Jim <[email protected]> wrote:
> > Good afternoon,
> >
> >
> >
> > I have an Apex application where I may receive EDI transactions, but
> > sometimes they arrive out of order, and I want to hold any
> > out-of-sequence transactions until the correct time in the flow to
> > process them.
> >
> >
> >
> > For example for a standard order, we will receive from the remote vendor:
> >
> >
> >
> > 1.)    General Acknowledgement
> >
> > 2.)    Detailed Acknowledgement
> >
> > 3.)    Ship Notification
> >
> > 4.)    Invoice
> >
> >
> >
> > They are supposed to be sent and received in that order.
> >
> >
> >
> > However, sometimes vendors' systems have problems, so they send all
> > of these at the same time, and then we can receive them out of
> > sequence. Data packets for these are very small, say from 1 to 512
> > bytes, and the only time they will be out of sequence is when we
> > receive them very close together.
> >
> >
> >
> > I am trying to think of the best way to do this with my DataTorrent /
> > Hadoop / YARN facilities, instead of creating a data table in
> > PostgreSQL and using that.
> >
> >
> >
> > Can I create a flow that works like this (I am not sure if this makes
> sense,
> > or is the best way to solve my problem, while keeping state, etc.
> maintained
> > for all the operators):
> >
> >
> >
> > 1.)    In the inbound transaction router, check the HDHT store for the
> > order number. If it doesn’t exist, this is a new order: if the
> > transaction being processed is the general acknowledgment, emit the
> > data to the general acknowledgement operator; if it is not, store the
> > transaction data into the correct bucket, recording which order it is
> > for, and record in HDHT by order number that the next expected step
> > is the general acknowledgement.
> >
> > 2.)    Say the next transaction is the ship notification. In the
> > router, we would check the HDHT store and see this is not the next
> > expected transaction (say it is supposed to be the detailed
> > acknowledgement), so we would just post the data for the ship
> > notification into the HDHT store and be done.
> >
> > 3.)    Say we now receive the detailed acknowledgement for an order
> > whose next step IS the detailed acknowledgement. We would see this is
> > the correct next transaction, emit it to the detailed acknowledgement
> > operator, and update the HDHT store to show that the next transaction
> > should be the ship notification.  NOTE:  we can’t emit the ship
> > notification yet, until we have confirmed that the detailed
> > acknowledgment has been completed.
> >
> > 4.)    In each of the 4 transaction operators, at the end of
> > processing we would update the HDHT store to show the next expected
> > step. If we have already received data for that step, we pull it from
> > the HDHT store and write the transaction into our SQS queue, which is
> > the input into the inbound transaction router at the beginning of the
> > application, so it processes through the system.
> >
> >
> >
> > I believe HDHT can be used to pass data throughout an entire
> > application, and is not limited to a per-operator basis, correct?
> >
> >
> >
> > Any comments / feedback?
> >
> >
> >
> > Thanks,
> >
> >
> >
> > Jim
>
