I understand that is not the case right now, but is that what you're working on?

Cheers,
Abdullah.


> On Nov 17, 2017, at 7:04 AM, Murtadha Hubail <[email protected]> wrote:
> 
> A transaction context can register multiple primary indexes.
> Since each entity commit log contains the dataset id, you can decrement the
> active operations on the operation tracker associated with that dataset id.
> 
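The suggestion above can be sketched roughly as follows. This is only an illustration in Python, not AsterixDB code: the names EntityCommitLog, OperationTracker, TransactionContext, and the dataset ids 101 and 202 are all hypothetical stand-ins. The point it shows is that a single transaction id can cover several datasets as long as each entity commit log carries the dataset id that selects the counter to decrement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityCommitLog:
    """Hypothetical stand-in for an entity commit log record."""
    txn_id: int
    dataset_id: int


class OperationTracker:
    """Hypothetical per-dataset counter of active operations."""

    def __init__(self):
        self.active_operations = 0

    def before_operation(self):
        self.active_operations += 1

    def complete_operation(self):
        if self.active_operations == 0:
            raise RuntimeError("no active operation to complete")
        self.active_operations -= 1


class TransactionContext:
    """One transaction id that registers several primary indexes (datasets)."""

    def __init__(self, txn_id):
        self.txn_id = txn_id
        self.trackers = {}  # dataset id -> OperationTracker

    def register(self, dataset_id, tracker):
        self.trackers[dataset_id] = tracker

    def on_entity_commit(self, log):
        # The dataset id carried by the log selects the tracker to decrement.
        self.trackers[log.dataset_id].complete_operation()


# Usage: one transaction feeding two datasets (ids are made up).
ctx = TransactionContext(txn_id=1)
tracker_a, tracker_b = OperationTracker(), OperationTracker()
ctx.register(101, tracker_a)
ctx.register(202, tracker_b)

tracker_a.before_operation()
tracker_b.before_operation()
tracker_b.before_operation()

# Only the tracker for dataset 202 is decremented by this commit log.
ctx.on_entity_commit(EntityCommitLog(txn_id=1, dataset_id=202))
```

In this sketch the commit path never needs a per-dataset transaction id; the lookup key is the dataset id in the log record.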
> On 17/11/2017, 5:52 PM, "abdullah alamoudi" <[email protected]> wrote:
> 
>    Can you illustrate how a deadlock can happen? I am anxious to know.
>    Moreover, the reason for the multiple transaction ids in feeds is not
>    simply because we compile them differently.
> 
>    How would a commit operator know which dataset's active operation counter to
>    decrement if they share the same id, for example?
> 
>> On Nov 16, 2017, at 9:46 PM, Xikui Wang <[email protected]> wrote:
>> 
>> Yes. That deadlock could happen. Currently, we have one-to-one mappings for
>> the jobs and transactions, except for the feeds.
>> 
>> @Abdullah, after some digging into the code, I think probably we can use a
>> single transaction id for the job which feeds multiple datasets? See if I
>> can convince you. :)
>> 
>> The reason we have multiple transaction ids in feeds is that we compile
>> each connection job separately and combine them into a single feed job. A
>> new transaction id is created and assigned to each connection job, thus for
>> the combined job, we have to handle the different transactions as they
>> are embedded in the connection job specifications. But, what if we create a
>> single transaction id for the combined job? That transaction id will be
>> embedded into each connection so they can write logs freely, but the
>> transaction will be started and committed only once as there is only one
>> feed job. In this way, we won't need MultiTransactionJobletEventListener
>> and the transaction id can be removed from the job specification easily as
>> well (for Steven's change).
>> 
>> Best,
>> Xikui
>> 
>> 
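A rough sketch of the proposal above, under stated assumptions: the TransactionManager class, the build_feed_job and run_feed_job helpers, and the dataset names "Tweets" and "Users" are all hypothetical and do not correspond to AsterixDB classes. It only illustrates the idea that every connection embeds the same transaction id while the transaction is begun and committed once for the combined feed job.

```python
import itertools


class TransactionManager:
    """Hypothetical manager that records begin/commit calls per transaction id."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.begun = []
        self.committed = []

    def new_txn_id(self):
        return next(self._ids)

    def begin(self, txn_id):
        self.begun.append(txn_id)

    def commit(self, txn_id):
        self.committed.append(txn_id)


def build_feed_job(manager, datasets):
    """Combine one connection per dataset under a single shared transaction id."""
    txn_id = manager.new_txn_id()
    connections = [{"dataset": name, "txn_id": txn_id} for name in datasets]
    return {"txn_id": txn_id, "connections": connections}


def run_feed_job(manager, job, records):
    """Begin once, let every connection write logs, then commit once."""
    manager.begin(job["txn_id"])
    log = []
    for record in records:
        for conn in job["connections"]:
            # Each connection logs with the id embedded in it at build time.
            log.append((conn["txn_id"], conn["dataset"], record))
    manager.commit(job["txn_id"])
    return log


manager = TransactionManager()
job = build_feed_job(manager, ["Tweets", "Users"])
log = run_feed_job(manager, job, ["r1", "r2"])
```

With this shape there is one begin and one commit per feed job, regardless of how many datasets the job feeds.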
>> On Thu, Nov 16, 2017 at 4:26 PM, Mike Carey <[email protected]> wrote:
>> 
>>> I worry about deadlocks.  The waits-for graph may not understand that
>>> making t1 wait will also make t2 wait since they may share a thread -
>>> right?  Or do we have jobs and transactions separately represented there
>>> now?
>>> 
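One way to picture the concern above is the following sketch. It is a made-up scenario, not an observed AsterixDB deadlock: the transactions t1, t2, t3 and the helpers has_cycle and with_thread_edges are hypothetical. A waits-for graph built only from lock waits shows no cycle, but once the fact that t2 cannot run while t1 blocks their shared thread is added as an extra edge, a cycle appears.

```python
def has_cycle(graph):
    """Depth-first search for a cycle in a directed graph {node: successors}."""
    WHITE, GREY, BLACK = 0, 1, 2
    nodes = set(graph) | {n for succ in graph.values() for n in succ}
    color = {n: WHITE for n in nodes}

    def visit(node):
        color[node] = GREY
        for nxt in graph.get(node, ()):
            if color[nxt] == GREY:
                return True  # back edge: a cycle exists
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)


def with_thread_edges(graph, blocked_on_thread):
    """Add an edge from each stalled transaction to the one blocking its thread."""
    extended = {node: set(succ) for node, succ in graph.items()}
    for blocked, stalled in blocked_on_thread.items():
        for txn in stalled:
            extended.setdefault(txn, set()).add(blocked)
    return extended


# Lock waits only: t1 waits for t3, and t3 waits for t2.
lock_waits = {"t1": {"t3"}, "t3": {"t2"}}

# Assumption: t1 and t2 share a thread, so t2 is stalled while t1 is blocked.
thread_stalls = {"t1": ["t2"]}

lock_only_deadlock = has_cycle(lock_waits)
thread_aware_deadlock = has_cycle(with_thread_edges(lock_waits, thread_stalls))
```

Here lock_only_deadlock is False and thread_aware_deadlock is True, which is the kind of case a detector that tracks transactions but not threads would miss.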
>>> On Nov 16, 2017 3:10 PM, "abdullah alamoudi" <[email protected]> wrote:
>>> 
>>>> We are using multiple transactions in a single job in the case of feeds, and I
>>>> think that this is the correct way.
>>>> Having a single job for a feed that feeds into multiple datasets is a good
>>>> thing since job resources/feed resources are consolidated.
>>>> 
>>>> Here are some points:
>>>> - We can't use the same transaction id to feed multiple datasets. The only
>>>> other option is to have multiple jobs, each feeding a different dataset.
>>>> - Having multiple jobs (in addition to the extra resources used, memory
>>>> and CPU) would then force us either to read data from external sources
>>>> multiple times, parse records multiple times, etc.,
>>>> or to synchronize the different jobs with the feed source within
>>>> AsterixDB. IMO, this is far more complicated than having multiple
>>>> transactions within a single job, and the costs far outweigh the
>>>> benefits.
>>>> 
>>>> P.S.
>>>> We are also using this for bucket connections in Couchbase Analytics.
>>>> 
>>>>> On Nov 16, 2017, at 2:57 PM, Till Westmann <[email protected]> wrote:
>>>>> 
>>>>> If there are a number of issues with supporting multiple transaction ids
>>>>> and no clear benefits/use-cases, I’d vote for simplification :)
>>>>> Also, code that’s not being used has a tendency to "rot" and so I think
>>>>> that its usefulness might be limited by the time we’d find a use for
>>>>> this functionality.
>>>>> 
>>>>> My 2c,
>>>>> Till
>>>>> 
>>>>> On 16 Nov 2017, at 13:57, Xikui Wang wrote:
>>>>> 
>>>>>> I'm separating the connections into different jobs in some of my
>>>>>> experiments... but that was intended to be used for the experimental
>>>>>> settings (i.e., not for master now)...
>>>>>> 
>>>>>> I think the interesting question here is whether we want to allow one
>>>>>> Hyracks job to carry multiple transactions. I personally think that should
>>>>>> be allowed as the transaction and job are two separate concepts, but I
>>>>>> couldn't find such use cases other than the feeds. Does anyone have a good
>>>>>> example of this?
>>>>>> 
>>>>>> Another question is, if we do allow multiple transactions in a single
>>>>>> Hyracks job, how do we enable the commit runtime to obtain the correct TXN id
>>>>>> without having that embedded as part of the job specification?
>>>>>> 
>>>>>> Best,
>>>>>> Xikui
>>>>>> 
>>>>>> On Thu, Nov 16, 2017 at 1:01 PM, abdullah alamoudi <[email protected]> wrote:
>>>>>> 
>>>>>>> I am curious as to how feeds will work without this.
>>>>>>> 
>>>>>>> ~Abdullah.
>>>>>>>> On Nov 16, 2017, at 12:43 PM, Steven Jacobs <[email protected]> wrote:
>>>>>>>> 
>>>>>>>> Hi all,
>>>>>>>> We currently have MultiTransactionJobletEventListenerFactory, which allows
>>>>>>>> for one Hyracks job to run multiple Asterix transactions together.
>>>>>>>> 
>>>>>>>> This class is only used by feeds, and feeds are in the process of changing to
>>>>>>>> no longer need this feature. As part of the work in pre-deploying job
>>>>>>>> specifications to be used by multiple Hyracks jobs, I've been working on
>>>>>>>> removing the transaction id from the job specifications, as we use a new
>>>>>>>> transaction for each invocation of a deployed job.
>>>>>>>> 
>>>>>>>> There is currently no clear way to remove the transaction id from the job
>>>>>>>> spec and keep the option for MultiTransactionJobletEventListenerFactory.
>>>>>>>> 
>>>>>>>> The question for the group is, do we see a need to maintain this class that
>>>>>>>> will no longer be used by any current code? Or, in other words, is there a
>>>>>>>> strong possibility that in the future we will want multiple transactions to
>>>>>>>> share a single Hyracks job, meaning that it is worth figuring out how to
>>>>>>>> maintain this class?
>>>>>>>> 
>>>>>>>> Steven
>>>>>>> 
>>>>>>> 
>>>> 
>>>> 
>>> 
> 
> 
> 
> 
