This is not currently possible, but there is a Jira case about adding Hive support to QueryDatabaseTable [1].
The comments mention that paging of results in Hive is not possible, but I've seen some examples using ROW_NUMBER() OVER() and such, and although complicated and messy (like the PL/SQL statement to do pagination), it may indeed be possible. Please feel free to write an Improvement Jira to add Hive support to GenerateTableFetch if you like; we can continue the discussion there.

Also, in an upcoming release it will be possible to use UpdateAttribute to keep track of certain variables in state, so that you could loop using SelectHiveQL, keeping track of a timestamp value for example, and use that value in the HiveQL query, thereby emulating this capability.

Regards,
Matt

[1] https://issues.apache.org/jira/browse/NIFI-3093

On Mon, Jan 9, 2017 at 12:23 PM, Provenzano Nicolas <[email protected]> wrote:
> Hi all,
>
> The GenerateTableFetch processor allows defining a « max value column » to
> get only recent rows (for example).
>
> Is there any way of doing the same with SelectHiveQL? Currently, it
> seems the SelectHiveQL processor always gets all the rows each time it is
> run, while I would like to get only the rows added or updated since the last
> run.
>
> Thanks in advance
>
> Nicolas
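To make the two ideas in the reply a bit more concrete, here is a minimal sketch of the HiveQL that such a flow might generate. It is not NiFi code and not how any processor is implemented. The function names, the table name `events`, and the column `updated_at` are hypothetical, and the last-seen value is assumed to be a trusted timestamp string kept in state between runs.

```python
from typing import Optional


def build_incremental_query(table: str, max_value_column: str,
                            last_value: Optional[str]) -> str:
    """Build a HiveQL query that returns only rows newer than last_value.

    last_value is the value remembered from the previous run (for example
    a timestamp held in state); None means this is the first run.
    """
    query = f"SELECT * FROM {table}"
    if last_value is not None:
        # Assumption: last_value is a trusted string, so no escaping is done.
        query += f" WHERE {max_value_column} > '{last_value}'"
    return query + f" ORDER BY {max_value_column}"


def build_page_query(table: str, order_column: str,
                     page: int, page_size: int) -> str:
    """Build a HiveQL query for one zero-based page via ROW_NUMBER() OVER."""
    first = page * page_size + 1
    last = first + page_size - 1
    return (
        f"SELECT * FROM (SELECT t.*, "
        f"ROW_NUMBER() OVER (ORDER BY {order_column}) AS rn "
        f"FROM {table} t) paged WHERE rn BETWEEN {first} AND {last}"
    )


if __name__ == "__main__":
    # First run fetches everything; later runs pass the stored maximum.
    print(build_incremental_query("events", "updated_at", None))
    print(build_incremental_query("events", "updated_at", "2017-01-09 12:23:00"))
    print(build_page_query("events", "updated_at", page=1, page_size=100))
```

In a looping flow, the value passed as `last_value` would be the largest `updated_at` seen in the previous result set, which is the role the "max value column" plays in GenerateTableFetch.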
