For atomic transactions, the change was merged yesterday. For entity level transactions, it should be a very small change.
Cheers,
Murtadha

> On Nov 17, 2017, at 6:07 PM, abdullah alamoudi <[email protected]> wrote:
>
> I understand that is not the case right now, but is that what you're working on?
>
> Cheers,
> Abdullah.
>
>
>> On Nov 17, 2017, at 7:04 AM, Murtadha Hubail <[email protected]> wrote:
>>
>> A transaction context can register multiple primary indexes.
>> Since each entity commit log contains the dataset id, you can decrement the
>> active operations on the operation tracker associated with that dataset id.
>>
>> On 17/11/2017, 5:52 PM, "abdullah alamoudi" <[email protected]> wrote:
>>
>> Can you illustrate how a deadlock can happen? I am anxious to know.
>> Moreover, the reason for the multiple transaction ids in feeds is not
>> simply because we compile them differently.
>>
>> How would a commit operator know which dataset's active operation counter to
>> decrement if they share the same id, for example?
>>
>>> On Nov 16, 2017, at 9:46 PM, Xikui Wang <[email protected]> wrote:
>>>
>>> Yes. That deadlock could happen. Currently, we have one-to-one mappings for
>>> the jobs and transactions, except for the feeds.
>>>
>>> @Abdullah, after some digging into the code, I think probably we can use a
>>> single transaction id for the job which feeds multiple datasets? See if I
>>> can convince you. :)
>>>
>>> The reason we have multiple transaction ids in feeds is that we compile
>>> each connection job separately and combine them into a single feed job. A
>>> new transaction id is created and assigned to each connection job, thus for
>>> the combined job, we have to handle the different transactions as they
>>> are embedded in the connection job specifications. But, what if we create a
>>> single transaction id for the combined job? That transaction id will be
>>> embedded into each connection so they can write logs freely, but the
>>> transaction will be started and committed only once as there is only one
>>> feed job.
>>> In this way, we won't need multiTransactionJobletEventListener
>>> and the transaction id can be removed from the job specification easily as
>>> well (for Steven's change).
>>>
>>> Best,
>>> Xikui
>>>
>>>
>>>> On Thu, Nov 16, 2017 at 4:26 PM, Mike Carey <[email protected]> wrote:
>>>>
>>>> I worry about deadlocks. The waits-for graph may not understand that
>>>> making t1 wait will also make t2 wait since they may share a thread -
>>>> right? Or do we have jobs and transactions separately represented there
>>>> now?
>>>>
>>>>> On Nov 16, 2017 3:10 PM, "abdullah alamoudi" <[email protected]> wrote:
>>>>>
>>>>> We are using multiple transactions in a single job in the case of feeds,
>>>>> and I think that this is the correct way.
>>>>> Having a single job for a feed that feeds into multiple datasets is a good
>>>>> thing since job resources/feed resources are consolidated.
>>>>>
>>>>> Here are some points:
>>>>> - We can't use the same transaction id to feed multiple datasets. The only
>>>>> other option is to have multiple jobs each feeding a different dataset.
>>>>> - Having multiple jobs (in addition to the extra resources used, memory
>>>>> and CPU) would then force us either to read data from external sources
>>>>> multiple times, parse records multiple times, etc.,
>>>>> or to have synchronization between the different jobs and the
>>>>> feed source within asterixdb. IMO, this is far more complicated than having
>>>>> multiple transactions within a single job and the costs far outweigh the
>>>>> benefits.
>>>>>
>>>>> P.S,
>>>>> We are also using this for bucket connections in Couchbase Analytics.
>>>>>
>>>>>> On Nov 16, 2017, at 2:57 PM, Till Westmann <[email protected]> wrote:
>>>>>>
>>>>>> If there are a number of issues with supporting multiple transaction ids
>>>>>> and no clear benefits/use-cases, I’d vote for simplification :)
>>>>>> Also, code that’s not being used has a tendency to "rot" and so I think
>>>>>> that its usefulness might be limited by the time we’d find a use for
>>>>>> this functionality.
>>>>>>
>>>>>> My 2c,
>>>>>> Till
>>>>>>
>>>>>>> On 16 Nov 2017, at 13:57, Xikui Wang wrote:
>>>>>>>
>>>>>>> I'm separating the connections into different jobs in some of my
>>>>>>> experiments... but that was intended to be used for the experimental
>>>>>>> settings (i.e., not for master now)...
>>>>>>>
>>>>>>> I think the interesting question here is whether we want to allow one
>>>>>>> Hyracks job to carry multiple transactions. I personally think that should
>>>>>>> be allowed as the transaction and job are two separate concepts, but I
>>>>>>> couldn't find such use cases other than the feeds. Does anyone have a good
>>>>>>> example of this?
>>>>>>>
>>>>>>> Another question is, if we do allow multiple transactions in a single
>>>>>>> Hyracks job, how do we enable the commit runtime to obtain the correct TXN id
>>>>>>> without having that embedded as part of the job specification?
>>>>>>>
>>>>>>> Best,
>>>>>>> Xikui
>>>>>>>
>>>>>>> On Thu, Nov 16, 2017 at 1:01 PM, abdullah alamoudi <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I am curious as to how feeds will work without this?
>>>>>>>>
>>>>>>>> ~Abdullah.
>>>>>>>>> On Nov 16, 2017, at 12:43 PM, Steven Jacobs <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>> We currently have MultiTransactionJobletEventListenerFactory, which allows
>>>>>>>>> for one Hyracks job to run multiple Asterix transactions together.
>>>>>>>>>
>>>>>>>>> This class is only used by feeds, and feeds are in the process of changing to
>>>>>>>>> no longer need this feature.
>>>>>>>>> As part of the work in pre-deploying job
>>>>>>>>> specifications to be used by multiple hyracks jobs, I've been working on
>>>>>>>>> removing the transaction id from the job specifications, as we use a new
>>>>>>>>> transaction for each invocation of a deployed job.
>>>>>>>>>
>>>>>>>>> There is currently no clear way to remove the transaction id from the job
>>>>>>>>> spec and keep the option for MultiTransactionJobletEventListenerFactory.
>>>>>>>>>
>>>>>>>>> The question for the group is, do we see a need to maintain this class that
>>>>>>>>> will no longer be used by any current code? Or, in other words, is there a
>>>>>>>>> strong possibility that in the future we will want multiple transactions to
>>>>>>>>> share a single Hyracks job, meaning that it is worth figuring out how to
>>>>>>>>> maintain this class?
>>>>>>>>>
>>>>>>>>> Steven
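
To make the point about dataset ids in entity commit logs concrete, here is a minimal Java sketch of the idea discussed in this thread: one transaction context registers several datasets, and each entity commit decrements only the counter of the dataset named in its log record. All class and method names below (FeedTxnSketch, OperationTracker, EntityCommitLog, TransactionContext, write, onEntityCommit) are hypothetical stand-ins chosen for illustration. They are not the actual AsterixDB classes or signatures, and the real bookkeeping is certainly more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: these are NOT the real AsterixDB classes.
final class FeedTxnSketch {

    /** Stand-in for a per-dataset operation tracker: counts in-flight operations. */
    static final class OperationTracker {
        private int activeOperations;

        void beforeOperation() {
            activeOperations++;
        }

        void afterEntityCommit() {
            if (activeOperations == 0) {
                throw new IllegalStateException("no active operation to complete");
            }
            activeOperations--;
        }

        int getActiveOperations() {
            return activeOperations;
        }
    }

    /** Stand-in for an entity commit log record; it carries the dataset id. */
    static final class EntityCommitLog {
        final long txnId;
        final int datasetId;

        EntityCommitLog(long txnId, int datasetId) {
            this.txnId = txnId;
            this.datasetId = datasetId;
        }
    }

    /** One transaction (single id) that has registered several datasets. */
    static final class TransactionContext {
        private final long txnId;
        private final Map<Integer, OperationTracker> trackers = new HashMap<>();

        TransactionContext(long txnId) {
            this.txnId = txnId;
        }

        /** Registers a dataset's primary index; assumed to be idempotent here. */
        void register(int datasetId) {
            trackers.computeIfAbsent(datasetId, id -> new OperationTracker());
        }

        /** Starts one operation on a dataset and returns its commit log record. */
        EntityCommitLog write(int datasetId) {
            trackerFor(datasetId).beforeOperation();
            return new EntityCommitLog(txnId, datasetId);
        }

        /** Uses the dataset id in the log, not the txn id, to pick the tracker. */
        void onEntityCommit(EntityCommitLog log) {
            trackerFor(log.datasetId).afterEntityCommit();
        }

        int activeOperations(int datasetId) {
            return trackerFor(datasetId).getActiveOperations();
        }

        private OperationTracker trackerFor(int datasetId) {
            OperationTracker tracker = trackers.get(datasetId);
            if (tracker == null) {
                throw new IllegalArgumentException("dataset not registered: " + datasetId);
            }
            return tracker;
        }
    }

    public static void main(String[] args) {
        TransactionContext ctx = new TransactionContext(1L);
        ctx.register(10); // first dataset fed by the job
        ctx.register(20); // second dataset fed by the same job

        EntityCommitLog first = ctx.write(10);
        ctx.write(10);
        ctx.write(20);

        // Committing one entity of dataset 10 leaves dataset 20 untouched.
        ctx.onEntityCommit(first);
        System.out.println("dataset 10 active: " + ctx.activeOperations(10));
        System.out.println("dataset 20 active: " + ctx.activeOperations(20));
    }
}
```

The only design choice the sketch is meant to show is that the lookup key is the dataset id carried by the log record, so sharing one transaction id across datasets does not, by itself, make the counters ambiguous. It says nothing about the deadlock concern raised earlier in the thread.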
