Just done. https://issues.apache.org/jira/browse/FALCON-511
Thanks!

John

2014-07-15 22:22 GMT-07:00 Shwetha GS <[email protected]>:

> Hi John,
>
> We didn't have any use case for this kind of replication, hence we didn't
> think about it. It's a valid use case; can you file a JIRA for tracking this?
>
> Thanks,
> Shwetha
>
>
> On Wed, Jul 16, 2014 at 1:00 AM, John Yu <[email protected]> wrote:
>
> > Hey Satish,
> >
> > Thanks for your reply!
> >
> > I can see how setting it up that way would definitely work.
> > Also, it is probably technically more correct, as data generated by
> > different processes should be considered different.
> >
> > However, we are thinking along the lines of data discovery, in which a
> > critical dataset might be computed on different colos simultaneously for
> > both DR and load balancing purposes. In this scenario, we would somehow
> > like the end users to know that feed1 and feed2 are logically the same
> > data, and they are free to pick one to use.
> >
> > Just wondering whether it makes sense to support multiple sources and
> > multiple targets without specifying a partition (and maybe the target
> > cluster has to specify the order of sources from which to copy). Also, I am
> > guessing that this "multiple sources and multiple targets without
> > specifying partition" requirement must have come up before. What was
> > the thought process behind not supporting it in the end?
> >
> > Thanks a lot!
> > John
> >
> >
> > 2014-07-14 21:34 GMT-07:00 Satish Mittal <[email protected]>:
> >
> > > Hi,
> > >
> > > Given that both ETL clusters are producing the same data-set independently
> > > of each other and the aim is to replicate the data-set within a colo (to
> > > avoid any cross-colo data movement), you could simply have 2 instances of
> > > the same feed, one per colo:
> > >
> > > feed1:
> > > <cluster name="colo1ETL" type="source">
> > > <cluster name="colo1A" type="target">
> > >
> > > feed2:
> > > <cluster name="colo2ETL" type="source">
> > > <cluster name="colo2A" type="target">
> > >
> > > The 1st error occurred because multiple-source replication was configured
> > > (which needs partition expressions to be specified). Also, that
> > > configuration would have ended up moving data across colos, which is
> > > against your desired goal.
> > >
> > > Thanks,
> > > Satish
> > >
> > >
> > > On Mon, Jul 14, 2014 at 11:52 PM, John Yu <[email protected]> wrote:
> > >
> > > > Hey all,
> > > >
> > > > We currently have the following use case:
> > > > Colo1 has 1 ETL cluster (Colo1-ETL) and 1 adhoc cluster (Colo1-A)
> > > > Colo2 has 1 ETL cluster (Colo2-ETL) and 1 adhoc cluster (Colo2-A)
> > > >
> > > > Due to the bandwidth constraint between the two colos, we are thinking of
> > > > having the 2 ETL clusters perform the same computation to generate the
> > > > same dataset, and having the 2 adhoc clusters pull from their respective
> > > > colo-local ETL cluster.
> > > >
> > > > What would be a good way to configure this feed?
> > > >
> > > > I've tried the following:
> > > > <cluster name="colo1ETL" type="source">
> > > > <cluster name="colo2ETL" type="source">
> > > > <cluster name="colo1A" type="target">
> > > > <cluster name="colo2A" type="target">
> > > > Error: Partition expression has to be specified for cluster colo1ETL as
> > > > there are more than one source clusters
> > > >
> > > > <cluster name="colo1ETL">
> > > > <cluster name="colo2ETL">
> > > > <cluster name="colo1A" type="target">
> > > > <cluster name="colo2A" type="target">
> > > > Error: Feed: pve-intermediate should have atleast one source cluster
> > > > defined
> > > >
> > > >
> > > > Thanks!
> > > >
> > > > John
> > > >
> > >
> > > --
> > > _____________________________________________________________
> > > The information contained in this communication is intended solely for the
> > > use of the individual or entity to whom it is addressed and others
> > > authorized to receive it. It may contain confidential or legally privileged
> > > information. If you are not the intended recipient you are hereby notified
> > > that any disclosure, copying, distribution or taking any action in reliance
> > > on the contents of this information is strictly prohibited and may be
> > > unlawful. If you have received this communication in error, please notify
> > > us immediately by responding to this email and then delete it from your
> > > system. The firm is neither liable for the proper and complete transmission
> > > of the information contained in this communication nor for any delay in its
> > > receipt.
> > >
> >
> >
> > --
> > 余守中 John Yu (Yu, Shoou-Jong)
> > Mobile: 650-691-3314
> >
>

--
余守中 John Yu (Yu, Shoou-Jong)
Mobile: 650-691-3314
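
For reference, the per-colo setup Satish suggests in this thread might look roughly like the sketch below when written out as a fuller feed definition. Only the cluster names and types come from the thread. The feed name, frequency, validity dates, retention, path, ACL, and schema values are hypothetical placeholders, and the element layout is assumed from the Falcon feed entity format, so it should be checked against the schema of the Falcon version in use.

```xml
<!-- Sketch only: all values other than the cluster names/types are placeholders. -->
<!-- "pve-intermediate-colo1" is a hypothetical name for the colo1 instance of the feed. -->
<feed name="pve-intermediate-colo1" xmlns="uri:falcon:feed:0.1">
    <frequency>hours(1)</frequency>
    <clusters>
        <!-- Single source per feed, so no partition expression should be needed. -->
        <cluster name="colo1ETL" type="source">
            <validity start="2014-07-01T00:00Z" end="2099-12-31T00:00Z"/>
            <retention limit="days(7)" action="delete"/>
        </cluster>
        <!-- The colo-local adhoc cluster receives the replicated data. -->
        <cluster name="colo1A" type="target">
            <validity start="2014-07-01T00:00Z" end="2099-12-31T00:00Z"/>
            <retention limit="days(7)" action="delete"/>
        </cluster>
    </clusters>
    <locations>
        <location type="data" path="/data/pve-intermediate/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
    </locations>
    <ACL owner="etl" group="users" permission="0755"/>
    <schema location="/none" provider="none"/>
</feed>
```

The colo2 instance would be the same definition with a different feed name and with colo2ETL and colo2A in place of colo1ETL and colo1A. Since each feed then has exactly one source, replication stays within its colo, which is the goal stated in the original question.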
