GitHub user dmainou edited a comment on the discussion: Question regarding Metadata Injection (MDI)
Hey, I'm not sure what the question is. I built the following on Friday, in case it helps you figure out what you need.

Situation: I just migrated a client from Pentaho to Hop and need to prove that things work.

Task: Create a validation framework that compares every table on the prod server (Pentaho) with every table on my test server (Hop).

Actions: In a completely new project I created the following:

1. Pipeline 1
   - Gets a list of files ending in *.hpl from a directory (the target project)
   - Loads the files into memory
   - Parses the XML to find the transform blocks
   - Filters transforms of type TableOutput
   - Extracts the database connection and table name
   - Using a pipeline executor, passes those two elements to the next pipeline
2. Pipeline 2 (metadata injector)
   - Gets the list of columns and their metadata for the connection and table
   - Excludes anything containing an SK
   - Uppercases all text
   - Casts all dates to text as yyyyMMdd
   - Builds an SQL SELECT statement using the above
   - Figures out a sort order list
   - Figures out a merge-diff plan
   - Figures out a filename and output fields for the differing rows
   - Creates a placeholder for the table name and connection
   - Injects all of the above into a blank template
3. Pipeline 3 (blank template)
   - 2x empty table input steps, one pointing at prod and the other at test
   - 2x empty sort steps, one pointing at prod and the other at test
   - 1 empty merge diff step
   - A filter splitting rows into identical and everything else
   - Everything else is written to a file with the differences
   - Everything else is also copied back to the identical side
   - The identical side then sorts and executes a group by to sum the counts of identical, new, deleted and modified rows, which are appended to a CSV

I did not output a hardcoded populated template, as I don't really need it. I simply executed the job and validated ~50 tables and ~40M rows in about 30 minutes. Today I have moved to a new project.
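For anyone who wants to sanity-check the first pipeline outside Hop, here is a rough Python sketch of the same idea: scan a directory for *.hpl files, parse the XML, keep the TableOutput transforms, and pull out the connection and table name. It is not the pipeline I built, just an illustration. It assumes each transform is a `<transform>` element with `<type>`, `<connection>` and `<table>` children; check that against your own .hpl files. The function names `find_table_outputs` and `scan_project` are made up for this example, and the scan is non-recursive.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_table_outputs(hpl_xml: str) -> list[tuple[str, str]]:
    """Return (connection, table) pairs for every TableOutput transform.

    Assumption: transforms are <transform> elements carrying <type>,
    <connection> and <table> children. Verify against your own files.
    """
    root = ET.fromstring(hpl_xml)
    pairs = []
    for transform in root.iter("transform"):
        # Keep only transforms whose type is TableOutput.
        if transform.findtext("type", default="").strip() != "TableOutput":
            continue
        connection = transform.findtext("connection", default="").strip()
        table = transform.findtext("table", default="").strip()
        pairs.append((connection, table))
    return pairs


def scan_project(project_dir: str) -> list[tuple[str, str, str]]:
    """Return (file name, connection, table) rows for all *.hpl files
    directly inside project_dir (subdirectories are not searched)."""
    rows = []
    for path in sorted(Path(project_dir).glob("*.hpl")):
        xml_text = path.read_text(encoding="utf-8")
        for connection, table in find_table_outputs(xml_text):
            rows.append((path.name, connection, table))
    return rows
```

Each row returned by `scan_project` corresponds to what pipeline 1 hands to the pipeline executor: one connection and one table name per TableOutput transform.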
GitHub link: https://github.com/apache/hop/discussions/5486#discussioncomment-13680260