We can create a join op that uses the incoming rows as the left side easily enough. And any contribution is obviously welcome.
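As a rough illustration of that idea, here is a minimal, self-contained sketch of a nested-loop join that treats the incoming rows as the left side. It is not Rhino ETL's actual JoinOperation: `LeftInputJoin`, the `Row` alias, the `match` and `merge` delegates, and the sample data are all made-up names and values for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
// Hypothetical stand-in for a row; this is NOT Rhino ETL's Row class.
using Row = System.Collections.Generic.Dictionary<string, object>;

public static class LeftInputJoin
{
    // Nested-loop inner join: the incoming rows form the left side,
    // the right side comes from a separate source.
    public static IEnumerable<Row> Join(
        IEnumerable<Row> incoming,
        IEnumerable<Row> right,
        Func<Row, Row, bool> match,
        Func<Row, Row, Row> merge)
    {
        // Buffer the right side once so it can be rescanned for every left row.
        var rightRows = right.ToList();
        foreach (var leftRow in incoming)
        {
            foreach (var rightRow in rightRows)
            {
                if (match(leftRow, rightRow))
                    yield return merge(leftRow, rightRow);
            }
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        // Sample data, invented for this sketch.
        var parents = new List<Row>
        {
            new Row { { "Id", 1 }, { "Name", "A" } },
            new Row { { "Id", 2 }, { "Name", "B" } },
        };
        var children = new List<Row>
        {
            new Row { { "ParentId", 1 }, { "Child", "x" } },
        };

        var joined = LeftInputJoin.Join(
            parents,
            children,
            (l, r) => object.Equals(l["Id"], r["ParentId"]),
            (l, r) =>
            {
                var merged = new Row(l);      // copy the left row
                merged["Child"] = r["Child"]; // add the matching right value
                return merged;
            });

        foreach (var row in joined)
            Console.WriteLine(row["Name"] + " -> " + row["Child"]); // prints: A -> x
    }
}
```

Because the left side is streamed lazily with `yield return`, such a join would not need to be a root operation; only the right side is buffered.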
On Mon, Jan 11, 2010 at 8:13 PM, G. Richard Bellamy <rbell...@pteradigm.com> wrote:

> One of the things I'm finding is that:
> 1. Input Operations often need input to help them decide what to pull from a
>    source, so that the data they deliver to the Output is correct.
> 2. Output Operations often need to act as input to the next Operation (e.g.
>    when returning the last inserted id).
>
> This brings me to the Join Operation - if the first operation of either side
> of the join is an Input Operation of the type mentioned above, it would be
> helpful if a Join didn't HAVE to be a root.
>
> I've got a couple of PassThroughOutputOperations I could contribute:
> -o- ConventionPassThroughOutputOperation : AbstractCommandOperation
> -o- ConventionPassThroughWithScalarOutputOperation<T> : ConventionPassThroughOutputOperation
>
> -rb
>
> -----Original Message-----
> From: rhino-tools-dev@googlegroups.com
> [mailto:rhino-tools-...@googlegroups.com] On Behalf Of Jason Meckley
> Sent: Monday, January 11, 2010 9:42 AM
> To: Rhino Tools Dev
> Subject: [rhino-tools-dev] Re: Questions on ETL: Follow up
>
> So joining is similar to InputCommandOperations in that the IEnumerable<Row>
> parameter is ignored. The rows are generated by the partial process within
> the JoinOperation. And this is purely a design choice. Ok, makes sense.
>
> "It looks like JoinOp is vestigial remains from NestedLoopJoinOp refactoring."
> I had to define vestigial (http://www.google.com/search?q=define%3Avestigial).
> So is NestedLoop the preferred operation, or Join :) ?
>
> "The reason that you can join after a branch is that you can, the syntax is
> just plain ugly. Basically, you need to register a join op and then register
> the final stage of each branch."
> I can join after branching, albeit with ugly syntax. Not a problem.
>
> "I think that in order to make your process happen you need an intermediary
> operation that would drain the previous operation rows (executing them) and
> then execute the next in line."
> So instead of using the provided Database Operations I should define my own
> [Abstract/Database]Operation that would process the row and then yield it?
> Something like:
>
> class PassThroughOutputOperation : AbstractOperation
> {
>     // Returns the row stream, so the method cannot also be declared void.
>     public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
>     {
>         foreach (var row in rows)
>         {
>             // save row to database
>             yield return row; // hand the row on to the next operation
>         }
>     }
> }
>
> On Jan 11, 12:13 pm, Ayende Rahien <aye...@ayende.com> wrote:
> > Wow, so many questions.
> > The reason that join op is ignoring the source input is that it is
> > joining its left & right ops, what would it do with additional input?
> > You could build it (and now that I think about it, I could argue that
> > way) that the input is the "left" side, but that isn't how it is designed.
> > A join op is always a root operation.
> > It looks like JoinOp is vestigial remains from NestedLoopJoinOp refactoring.
> >
> > Your branching reasoning is solid.
> > The reason that you can join after a branch is that you can, the
> > syntax is just plain ugly.
> > Basically, you need to register a join op and then register the final
> > stage of each branch.
> >
> > I think that in order to make your process happen you need an
> > intermediary operation that would drain the previous operation rows
> > (executing them) and then execute the next in line.
> >
> > On Mon, Jan 11, 2010 at 6:51 PM, Jason Meckley <jasonmeck...@gmail.com> wrote:
> >
> > > I'm digging more into ETL and I have come across
> > > NestedLoopsJoinOperation and JoinOperation. I cannot tell what the
> > > difference is, or why I would use one over the other? Also, why are
> > > the rows ignored rather than passed to the left and right operations?
> > >
> > > I'm also trying to understand the PartialEtlProcess.
> > > Is the idea of Partial to load multiple subsets? Like if I wanted to
> > > branch my operations with each branch containing multiple processes?
> > >
> > > Register(new GetData())
> > >     .Register(new BranchingOperation()
> > >         .Add(Partial
> > >             .Register(OperationA)
> > >             .Register(OperationB))
> > >         .Add(Partial
> > >             .Register(OperationC)
> > >             .Register(OperationD)));
> > >
> > > In this scenario the operations would run:
> > > [Branch 1] A1, B1, A2, B2
> > > [Branch 2] C1, D1, C2, D2
> > > where A & C would be intermediate logical operations
> > > (transformations) and B & D are output operations?
> > >
> > > Trying to follow thread
> > > http://groups.google.com/group/rhino-tools-dev/browse_thread/thread/e...,
> > > why isn't it possible to join after a branch? Is this because the
> > > left and right operations are passed null?
> > >
> > > Here is what I have:
> > > a single flat DBF file. I need to import this into a Sql Database.
> > > Sql has 2 tables Parent 1-N Children 1-N GrandChildren. When the data
> > > is imported I need to perform 3 different operations:
> > > 1. group the source by field
> > > 2. insert new groups into Parent
> > > 3. insert children & grand children in DBF that are not in SQL (left
> > >    join)
> > > 4. update existing children and grand children (inner join, there
> > >    will always be the same number of grand children)
> > > 5. delete Parents from SQL that do not have any children
> > >
> > > Operation 1 only applies to operation 2, so I figure this could be a
> > > branch.
> > > Operations 3 and 4 can also be done independently of one another,
> > > again branching.
> > > Operations 3 and 4 are dependent on the completion of operation 2.
> > > Operation 5 must be executed after 3 and 4 are complete.
> > >
> > > It looks like Operations 3 & 4 are actually 2 output commands, in
> > > which case I must break this logic into two operations and branch together.
> > >
> > > I would like this to occur within a single transaction/ETL process,
> > > but I'm not sure if that's possible, or reasonable?
> > >
> > > --
> > > You received this message because you are subscribed to the Google Groups "Rhino Tools Dev" group.
> > > To post to this group, send email to rhino-tools-...@googlegroups.com.
> > > To unsubscribe from this group, send email to rhino-tools-dev+unsubscr...@googlegroups.com.
> > > For more options, visit this group at http://groups.google.com/group/rhino-tools-dev?hl=en.