HDHT is for *embedded* state management, fully encapsulated by the operator. It cannot be used like a central database.
Thanks,
Thomas

On Thu, Sep 1, 2016 at 9:49 AM, Tushar Gosavi <[email protected]> wrote:
> Hi Jim,
>
> Currently HDHT is accessible only to a single operator in a DAG. A
> single HDHT store cannot be managed by two different operators at a
> time, which could cause metadata corruption. Theoretically an HDHT
> bucket could be read from multiple operators, but only one writer is
> allowed.
>
> In your case a stage in a transaction is processed completely by a
> different operator, and only then can the next stage start. It could
> still be achieved by using a single operator which manages the HDHT
> state, and having a loop in the DAG to send completed transaction ids
> back to the sequencer.
>
> - The sequence operator will emit transactions to the transaction
> processing operator.
> - If it receives an out-of-order transaction, it will note it down in
> HDHT.
> - The processing operator will send the completed transaction id on a
> port which is connected back to the sequence operator.
> - On receiving data on this loopback port, the sequence operator will
> update HDHT and search for the next transaction in order, which could
> be stored in HDHT, and will emit it to the next processing operator.
>
> - Tushar.
>
>
> On Sat, Aug 27, 2016 at 1:31 AM, Jim <[email protected]> wrote:
> > Good afternoon,
> >
> > I have an Apex application where I may receive EDI transactions, but
> > sometimes they arrive out of order, and I want to hold any
> > out-of-sequence transactions till the correct time in the flow to
> > process them.
> >
> > For example, for a standard order, we will receive from the remote
> > vendor:
> >
> > 1.) General Acknowledgement
> > 2.) Detailed Acknowledgement
> > 3.) Ship Notification
> > 4.) Invoice
> >
> > They are supposed to be sent and received in that order.
> >
> > However, sometimes vendors' systems have problems, etc., so they
> > send all of these at the same time, and then we can receive them out
> > of sequence.
> >
> > Data packets for these are very small, say from 1 to 512 bytes, and
> > the only time they will be out of sequence, we will receive them
> > very closely together.
> >
> > I am trying to think of the best way to do this with my DataTorrent
> > / Hadoop / YARN facilities, instead of creating a data table in
> > PostgreSQL and using that.
> >
> > Can I create a flow that works like this (I am not sure if this
> > makes sense, or is the best way to solve my problem, while keeping
> > state, etc., maintained for all the operators):
> >
> > 1.) In the inbound transaction router, check the HDHT store for the
> > order number. If it doesn't exist, this means it is a new order: if
> > the transaction we are trying to process is the general
> > acknowledgement, emit the data to the general acknowledgement
> > operator; if it is not, store the transaction data into the correct
> > bucket identifying the transaction it is for, and record in HDHT by
> > order number that the next step is the general acknowledgement.
> >
> > 2.) Say the next transaction is the ship notification. In the
> > router, we would check the HDHT store, see that this is not the next
> > expected transaction (say it is supposed to be the detailed
> > acknowledgement), so we would just post the data for the ship
> > notification into the HDHT store and say we are done.
> >
> > 3.) Say we now receive the detailed acknowledgement for an order
> > whose next step IS the detailed acknowledgement. We would see this
> > is the correct next transaction, emit it to the detailed
> > acknowledgement operator, and update the HDHT store to show that the
> > next transaction should be the ship notification. NOTE: we can't
> > emit the ship notification yet, till we have confirmed that the
> > detailed acknowledgement has been completed.
> >
> > 4.) In each of the 4 transaction operators, at the end of the
> > processing, we would update the HDHT store to show the next expected
> > step, and if we have already received data for the next expected
> > step, pull it from the HDHT store and write the transaction into our
> > SQS queue, which is the input into the inbound transaction router at
> > the beginning of the application, so it processes through the
> > system.
> >
> > I believe HDHT can be used to pass data throughout an entire
> > application, and is not limited to just a per-operator basis,
> > correct?
> >
> > Any comments / feedback?
> >
> > Thanks,
> >
> > Jim
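[Editor's sketch] The sequencer-with-loopback pattern Tushar describes, combined with the per-order "next expected step" bookkeeping Jim proposes, can be condensed into plain Java. This is a minimal, framework-free sketch, not real Apex code: the class and method names are hypothetical, an in-memory HashMap stands in for the single-writer HDHT store, and the loopback port is modeled as a direct call to onCompleted().

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Sketch of the single sequencer operator holding all ordering state.
// In a real Apex DAG, nextStage and held would live in HDHT, and
// onCompleted() would be driven by the loopback port from downstream
// processing operators.
public class OrderSequencer {

    // The fixed expected order of EDI transaction types per the thread.
    static final List<String> STAGES = List.of(
        "GENERAL_ACK", "DETAILED_ACK", "SHIP_NOTICE", "INVOICE");

    // order number -> index of the next expected stage (HDHT stand-in)
    private final Map<String, Integer> nextStage = new HashMap<>();
    // order number -> buffered out-of-order transactions (HDHT stand-in)
    private final Map<String, Map<String, String>> held = new HashMap<>();
    // transactions released downstream, in the enforced order
    final Queue<String> emitted = new ArrayDeque<>();

    // Called for each inbound transaction from the router.
    public void onTransaction(String orderNo, String type, String payload) {
        int expected = nextStage.getOrDefault(orderNo, 0);
        if (STAGES.indexOf(type) == expected) {
            // In order: emit to the processing operator for this stage.
            emitted.add(orderNo + ":" + type + ":" + payload);
        } else {
            // Out of order: hold it in the store until its turn comes.
            held.computeIfAbsent(orderNo, k -> new HashMap<>())
                .put(type, payload);
        }
    }

    // Called from the loopback port when a stage finishes processing.
    public void onCompleted(String orderNo, String type) {
        int next = STAGES.indexOf(type) + 1;
        nextStage.put(orderNo, next);
        // If the next stage already arrived out of order, release it now.
        if (next < STAGES.size()) {
            Map<String, String> pending = held.get(orderNo);
            if (pending != null) {
                String payload = pending.remove(STAGES.get(next));
                if (payload != null) {
                    emitted.add(orderNo + ":" + STAGES.get(next) + ":" + payload);
                }
            }
        }
    }
}
```

Note that an emitted stage advances the order's state only when its completion arrives on the loopback, which matches the constraint in step 3 of Jim's flow: a held ship notification is not released until the detailed acknowledgement is confirmed complete.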
