Hi Elliot, Thanks for letting me know. HPL-SQL sounded particularly interesting. But in the documentation I could not see any way to pass output generated by one Hive query to the next one. The tool looks good as a homogeneous PL-SQL platform for multiple big-data systems (http://www.hplsql.org/about).
However in order to break single complex hive query, DDLs look to be only way in HPL-SQL too. Or is there any alternate way that I might have missed? -- Saumitra S. Shahapure On Thu, Dec 15, 2016 at 6:21 PM, Elliot West <tea...@gmail.com> wrote: > I notice that HPL/SQL is not mentioned on the page I referenced, however I > expect that is another approach that you could use to modularise: > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=59690156 > http://www.hplsql.org/doc > > On 15 December 2016 at 17:17, Elliot West <tea...@gmail.com> wrote: > >> Some options are covered here, although there is no definitive guidance >> as far as I know: >> >> https://cwiki.apache.org/confluence/display/Hive/Unit+Testin >> g+Hive+SQL#UnitTestingHiveSQL-Modularisation >> >> On 15 December 2016 at 17:08, Saumitra Shahapure < >> saumitra.offic...@gmail.com> wrote: >> >>> Hello, >>> >>> We are running and maintaining quite big and complex Hive SELECT query >>> right now. It's basically a single SELECT query which performs JOIN of >>> about ten other SELECT query outputs. >>> >>> A simplest way to refactor that we can think of is to break this query >>> down into multiple views and then join the views. There is similar >>> possibility to create intermediate tables. >>> >>> However creating multiple DDLs in order to maintain a single DML is not >>> very smooth. We would end up polluting metadata database by creating views >>> / intermediate tables which are used in just this ETL. >>> >>> What are the other efficient ways to maintain complex SQL queries >>> written in Hive? Are there better ways to break Hive query into multiple >>> modules? >>> >>> -- Saumitra S. Shahapure >>> >> >> >