We are using multiple transactions in a single job for feeds, and I think 
that this is the correct way.
Having a single job for a feed that feeds into multiple datasets is a good 
thing, since job resources and feed resources are consolidated.

Here are some points:
- We can't use the same transaction id to feed multiple datasets. The only 
other option is to have multiple jobs, each feeding a different dataset.
- Having multiple jobs (in addition to the extra resources used, memory and 
CPU) would then force us either to read data from external sources multiple 
times, parse records multiple times, etc., or to synchronize the different 
jobs and the feed source within AsterixDB. IMO, this is far more complicated 
than having multiple transactions within a single job, and the cost far 
outweighs the benefits.
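
To make the comparison concrete, here is a minimal sketch of the single-job 
approach: the source is read and parsed once, each record is fanned out to 
every dataset connection, and each connection commits under its own 
transaction id. Everything in it (MultiTxnFeedSketch, Source, Connection, 
runSingleJob) is a hypothetical name made up for illustration; none of it is 
AsterixDB or Hyracks code, and the real feed runtime is certainly more 
involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are AsterixDB or Hyracks classes.
public class MultiTxnFeedSketch {

    /** One dataset connection with its own (hypothetical) transaction id. */
    static final class Connection {
        final String dataset;
        final long txnId;
        final List<String> written = new ArrayList<>();
        boolean committed = false;

        Connection(String dataset, long txnId) {
            this.dataset = dataset;
            this.txnId = txnId;
        }
    }

    /** Stand-in for an external source; counts reads and parses. */
    static final class Source {
        final List<String> raw;
        int reads = 0;
        int parses = 0;

        Source(List<String> raw) {
            this.raw = raw;
        }

        List<String> readAndParse() {
            reads++;
            List<String> parsed = new ArrayList<>();
            for (String r : raw) {
                parses++;
                parsed.add(r.trim()); // "parsing" is just trimming here
            }
            return parsed;
        }
    }

    /** Single job: read and parse once, fan out to every connection, commit each txn. */
    static Map<Long, Connection> runSingleJob(Source source, List<String> datasets) {
        Map<Long, Connection> byTxn = new LinkedHashMap<>();
        long nextTxnId = 1;
        for (String dataset : datasets) {
            // One transaction id per dataset; ids are never shared.
            byTxn.put(nextTxnId, new Connection(dataset, nextTxnId));
            nextTxnId++;
        }
        for (String record : source.readAndParse()) {
            for (Connection c : byTxn.values()) {
                c.written.add(record);
            }
        }
        for (Connection c : byTxn.values()) {
            c.committed = true; // each transaction commits independently
        }
        return byTxn;
    }

    public static void main(String[] args) {
        Source source = new Source(List.of(" a ", " b "));
        Map<Long, Connection> result = runSingleJob(source, List.of("ds1", "ds2", "ds3"));
        System.out.println("reads=" + source.reads + " parses=" + source.parses
                + " txns=" + result.size());
    }
}
```

With one job per dataset, the readAndParse() call in this sketch would run 
once per job, which is the duplicated reading and parsing described above.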

P.S.
We are also using this for bucket connections in Couchbase Analytics.

> On Nov 16, 2017, at 2:57 PM, Till Westmann <[email protected]> wrote:
> 
> If there are a number of issues with supporting multiple transaction ids
> and no clear benefits/use-cases, I’d vote for simplification :)
> Also, code that’s not being used has a tendency to "rot" and so I think
> that its usefulness might be limited by the time we’d find a use for
> this functionality.
> 
> My 2c,
> Till
> 
> On 16 Nov 2017, at 13:57, Xikui Wang wrote:
> 
>> I'm separating the connections into different jobs in some of my
>> experiments... but that was intended to be used for the experimental
>> settings (i.e., not for master now)...
>> 
>> I think the interesting question here is whether we want to allow one
>> Hyracks job to carry multiple transactions. I personally think that should
>> be allowed as the transaction and job are two separate concepts, but I
>> couldn't find such use cases other than the feeds. Does anyone have a good
>> example on this?
>> 
>> Another question is, if we do allow multiple transactions in a single
>> Hyracks job, how do we enable commit runtime to obtain the correct TXN id
>> without having that embedded as part of the job specification.
>> 
>> Best,
>> Xikui
>> 
>> On Thu, Nov 16, 2017 at 1:01 PM, abdullah alamoudi <[email protected]>
>> wrote:
>> 
>>> I am curious as to how feed will work without this?
>>> 
>>> ~Abdullah.
>>>> On Nov 16, 2017, at 12:43 PM, Steven Jacobs <[email protected]> wrote:
>>>> 
>>>> Hi all,
>>>> We currently have MultiTransactionJobletEventListenerFactory, which allows
>>>> for one Hyracks job to run multiple Asterix transactions together.
>>>> 
>>>> This class is only used by feeds, and feeds are in process of changing to
>>>> no longer need this feature. As part of the work in pre-deploying job
>>>> specifications to be used by multiple hyracks jobs, I've been working on
>>>> removing the transaction id from the job specifications, as we use a new
>>>> transaction for each invocation of a deployed job.
>>>> 
>>>> There is currently no clear way to remove the transaction id from the job
>>>> spec and keep the option for MultiTransactionJobletEventListenerFactory.
>>>> 
>>>> The question for the group is, do we see a need to maintain this class that
>>>> will no longer be used by any current code? Or, in other words, is there a
>>>> strong possibility that in the future we will want multiple transactions to
>>>> share a single Hyracks job, meaning that it is worth figuring out how to
>>>> maintain this class?
>>>> 
>>>> Steven
>>> 
>>> 
