We can easily create a join op that uses the incoming rows as the left side.
And any contribution is obviously welcome.
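
For illustration, here is a minimal sketch of a join that treats the incoming rows as the left side. It is plain C# and does not use Rhino ETL's types: `SketchRow`, `LeftInputJoinSketch`, and the `key` parameter are hypothetical names, and the single-column inner join is an assumption, not how JoinOperation is actually implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an ETL row; this is NOT Rhino ETL's Row class.
public sealed class SketchRow : Dictionary<string, object> { }

public static class LeftInputJoinSketch
{
    // Inner join on a single column: the incoming rows act as the left side,
    // rightRows supplies the right side (assumed behaviour for this sketch).
    public static IEnumerable<SketchRow> Join(
        IEnumerable<SketchRow> incomingRows,
        IEnumerable<SketchRow> rightRows,
        string key)
    {
        // Index the right side once so each left row is a single lookup.
        var rightByKey = rightRows.ToLookup(r => r[key]);

        foreach (var left in incomingRows)
        {
            foreach (var right in rightByKey[left[key]])
            {
                // Merge both sides; right-side values win on name clashes.
                var merged = new SketchRow();
                foreach (var pair in left) merged[pair.Key] = pair.Value;
                foreach (var pair in right) merged[pair.Key] = pair.Value;
                yield return merged;
            }
        }
    }
}
```

Because the method is an iterator, the left rows are streamed lazily; only the right side is buffered in the lookup.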

On Mon, Jan 11, 2010 at 8:13 PM, G. Richard Bellamy
<rbell...@pteradigm.com> wrote:

> One of the things I'm finding is that:
> 1. Input Operations often need input to help them decide what to pull from a
> source, so that the data they deliver to the Output is correct.
> 2. Output Operations often need to act as input to the next Operation (e.g.
> when returning the last inserted id).
>
> This brings me to the Join Operation - if the first operation of either side
> of the join is an Input Operation of the type mentioned above, it would be
> helpful if a Join didn't HAVE to be a root.
>
> I've got a couple of PassThroughOutputOperations I could contribute:
> -o- ConventionPassThroughOutputOperation : AbstractCommandOperation
> -o- ConventionPassThroughWithScalarOutputOperation<T> : ConventionPassThroughOutputOperation
>
> -rb
>
>
> -----Original Message-----
> From: rhino-tools-dev@googlegroups.com
> [mailto:rhino-tools-...@googlegroups.com] On Behalf Of Jason Meckley
> Sent: Monday, January 11, 2010 9:42 AM
> To: Rhino Tools Dev
> Subject: [rhino-tools-dev] Re: Questions on ETL: Follow up
>
> So joining is similar to InputCommandOperations in that the IEnumerable<Row>
> parameter is ignored. The rows are generated by the partial process within
> the JoinOperation. And this is purely a design choice. Ok, makes sense.
>
> "It looks like JoinOp is vestigial remains from NestedLoopJoinOp
> refactoring."
> I had to define vestigial
> (http://www.google.com/search?q=define%3Avestigial). So is NestedLoop the
> preferred operation, or Join? :)
>
> "The reason that you can join after a branch is that you can, the syntax is
> just plain ugly. Basically, you need to register a join op and then register
> the final stage of each branch."
> I can join after branching, albeit ugly syntax. Not a problem.
>
> "I think that in order to make your process happen you need an intermediary
> operation that would drain the previous operation rows (executing them) and
> then execute the next in line."
> So instead of using the provided Database Operations I should define my own
> [Abstract/Database]Operation that would process the row and then yield it?
> something like:
> class PassThroughOutputOperation : AbstractOperation {
>     public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
>     {
>         foreach (var row in rows)
>         {
>             // save row to database
>             yield return row;
>         }
>     }
> }
>
>
> On Jan 11, 12:13 pm, Ayende Rahien <aye...@ayende.com> wrote:
> > Wow, so many questions.
> > The reason that join op is ignoring the source input is that it is
> > joining its left & right ops, what would it do with additional input?
> > You could build it (and now that I think about it, I could argue that
> > way) that the input is the "left" side, but that isn't how it is
> > designed.
> > A join op is always a root operation.
> > It looks like JoinOp is vestigial remains from NestedLoopJoinOp
> > refactoring.
> >
> > Your branching reasoning is solid.
> > The reason that you can join after a branch is that you can, the
> > syntax is just plain ugly.
> > Basically, you need to register a join op and then register the final
> > stage of each branch.
> >
> > I think that in order to make your process happen you need an
> > intermediary operation that would drain the previous operation rows
> > (executing them) and then execute the next in line.
> >
> > On Mon, Jan 11, 2010 at 6:51 PM, Jason Meckley
> > <jasonmeck...@gmail.com> wrote:
> >
> > > I'm digging more into ETL and I have come across
> > > NestedLoopsJoinOperation and JoinOperation. I cannot tell what the
> > > difference is, or why I would use one over the other? Also, why are
> > > the rows ignored rather than passed to the left and right operations?
> >
> > > I'm also trying to understand the PartialEtlProcess. Is the idea of
> > > Partial to load multiple subsets? Like if I wanted to branch my
> > > operations with each branch containing multiple processes?
> > > Register(new GetData())
> > > .Register(new BranchingOperation()
> > >        .Add(Partial
> > >                      .Register(OperationA)
> > >                      .Register(OperationB))
> > >        .Add(Partial
> > >                      .Register(OperationC)
> > >                      .Register(OperationD)) );
> >
> > > In this scenario the operations would run:
> > > [Branch 1] A1, B1, A2, B2
> > > [Branch 2] C1, D1, C2, D2
> > > where A & C would be intermediate logical operations
> > > (transformations) and B & D are output operations?
> >
> > > Trying to follow thread
> > > http://groups.google.com/group/rhino-tools-dev/browse_thread/thread/e...,
> > > why isn't it possible to join after a branch? Is this because the
> > > left and right operations are passed null?
> >
> > > Here is what I have:
> > > a single flat DBF file. I need to import this into a Sql Database.
> > > Sql has 2 tables Parent 1-N Children 1-N GrandChildren. When the data
> > > is imported I need to perform 5 different operations:
> > > 1. group the source by field
> > > 2. insert new groups into Parent
> > > 3. insert children & grand children in DBF that are not in SQL (left
> > > join)
> > > 4. update existing children and grand children (inner join, there
> > > will always be the same number of grand children)
> > > 5. delete Parents from SQL that do not have any children
> >
> > > Operation 1 only applies to operation 2, so I figure this could be a
> > > branch.
> > > Operations 3 and 4 can also be done independently of one another,
> > > again branching.
> > > Operations 3 and 4 are dependent on the completion of operation 2.
> > > Operation 5 must be executed after 3 and 4 are complete.
> >
> > > It looks like Operations 3 & 4 are actually 2 output commands, in
> > > which case I must break this logic into two operations and branch them
> > > together.
> >
> > > I would like this to occur within a single transaction/ETL process,
> > > but I'm not sure if that's possible, or reasonable?
> >
> > > --
> > > You received this message because you are subscribed to the Google
> > > Groups "Rhino Tools Dev" group.
> > > To post to this group, send email to rhino-tools-...@googlegroups.com.
> > > To unsubscribe from this group, send email to
> > > rhino-tools-dev+unsubscr...@googlegroups.com.
> > > For more options, visit this group at
> > > http://groups.google.com/group/rhino-tools-dev?hl=en.
>