Harish has done some good work on the popular use case of windowing in
https://issues.apache.org/jira/browse/HIVE-7062, which is available from
0.14 onwards. Would that be useful in your scenario? Or are you targeting
non-windowing PTFs?

Thanks,
Ashutosh

On Thu, May 7, 2015 at 6:43 AM, Sivaramakrishnan Narayanan <
tarb...@gmail.com> wrote:

> Hi,
>
> I was reading through the PTFOperator and related code and was wondering if
> there is an opportunity to optimize this function in
> WindowingTableFunction.java
>
>   public void execute(PTFPartitionIterator<Object> pItr, PTFPartition outP) throws HiveException {
>
> This guy iterates over the input partition once to compute outputColumns.
> This causes a full read of the input partition.
>
> It then iterates over the input partition again to append the newly computed
> values. This causes another read of the input partition and a write to the
> output partition.
>
> I was wondering if it may be more efficient to append to the output
> partition as soon as window expressions have been computed. This will avoid
> one scan of the input partition.
>
> FYI - I've been looking at Hive 0.13 code mostly, but a glance at trunk
> suggests this logic is the same there.
>
> Thanks,
>
> Siva
>
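
For readers of this thread, the two shapes discussed in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions: WindowingSketch, twoPass, onePass and rowsRead are hypothetical names, the partition is a plain in-memory list, and the window expression is assumed to be a running sum whose value is known as soon as the current row is seen. It is not Hive's PTFPartition or WindowingTableFunction code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not Hive's actual classes.
final class WindowingSketch {
    // Counts how many input rows have been read, to compare the two shapes.
    static int rowsRead = 0;

    // Two-pass shape: compute the window column first, then re-read the
    // input to append each computed value to its row.
    static List<long[]> twoPass(List<Long> input) {
        List<Long> runningSums = new ArrayList<>();
        long sum = 0;
        for (long v : input) {                      // first full read of input
            rowsRead++;
            sum += v;
            runningSums.add(sum);
        }
        List<long[]> output = new ArrayList<>();
        for (int i = 0; i < input.size(); i++) {    // second full read of input
            rowsRead++;
            output.add(new long[] {input.get(i), runningSums.get(i)});
        }
        return output;
    }

    // One-pass shape: append to the output as soon as the value for the
    // current row has been computed.
    static List<long[]> onePass(List<Long> input) {
        List<long[]> output = new ArrayList<>();
        long sum = 0;
        for (long v : input) {                      // single read of input
            rowsRead++;
            sum += v;
            output.add(new long[] {v, sum});
        }
        return output;
    }
}
```

Both methods return the same rows; the one-pass version reads each input row once instead of twice. The single-pass shape depends on the assumption above: a window frame that looks at following rows would need some buffering before a row could be emitted.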
