[ 
https://issues.apache.org/jira/browse/PIG-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078971#comment-14078971
 ] 

Shiwei Guo commented on PIG-3204:
---------------------------------

Is the executeBatch() call that this patch introduces in 'processDescribe' 
necessary? Could it be replaced by the following?
{code}
if (mPigServer.isBatchOn()) {
    mPigServer.parseAndBuild();
}
{code}

We have a use case, which is probably quite common, like this:
{code}
describe aliaseA;
store aliaseA;

describe aliaseB;
store aliaseB;

describe aliaseC;
store aliaseC;
...
{code}

The executeBatch() call in describe forces the pending batch to run at every 
describe, so Pig has no chance to execute the store operations in parallel or 
to optimize across them.
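
To illustrate the concern, here is a toy Java model of batching. It is only a sketch and does not use Pig's real classes: BatchSketch and all of its methods are hypothetical names, and "describe without a flush" stands in for the parse-only alternative suggested above. It counts how many separate jobs the stores end up in under each behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of batch mode. Hypothetical; not Pig's actual implementation. */
public class BatchSketch {
    // Stores queued since the last flush.
    private final List<String> pending = new ArrayList<>();
    // Each inner list is one "job": the stores that were executed together.
    private final List<List<String>> jobs = new ArrayList<>();

    void store(String alias) {
        pending.add(alias);
    }

    /** Runs every queued store as a single job, then empties the queue. */
    void executeBatch() {
        if (!pending.isEmpty()) {
            jobs.add(new ArrayList<>(pending));
            pending.clear();
        }
    }

    /** describe that flushes the batch first (the behaviour being questioned). */
    void describeWithFlush(String alias) {
        executeBatch();
    }

    /** describe that only needs the parsed plan and leaves queued stores alone. */
    void describeParseOnly(String alias) {
        // Nothing is executed here in this model.
    }

    /** Replays the describe/store script from the comment and returns the jobs. */
    public static List<List<String>> run(boolean flushOnDescribe) {
        BatchSketch s = new BatchSketch();
        for (String alias : new String[] {"aliaseA", "aliaseB", "aliaseC"}) {
            if (flushOnDescribe) {
                s.describeWithFlush(alias);
            } else {
                s.describeParseOnly(alias);
            }
            s.store(alias);
        }
        s.executeBatch(); // end of script: flush whatever is still queued
        return s.jobs;
    }

    public static void main(String[] args) {
        System.out.println("flush on describe: " + run(true).size() + " jobs");
        System.out.println("parse only:        " + run(false).size() + " jobs");
    }
}
```

In this model, flushing on every describe leaves each store in its own job, while the parse-only variant keeps all three stores in one batch that could then be planned together.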

> Change script parsing to parse entire script instead of line by line
> --------------------------------------------------------------------
>
>                 Key: PIG-3204
>                 URL: https://issues.apache.org/jira/browse/PIG-3204
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.10.1
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.12.0
>
>         Attachments: PIG-3204-1.patch, PIG-3204-2.patch, PIG-3204-3.patch, 
> PIG-3204-4.patch, PIG-3204-5.patch, PIG-3204-6.patch
>
>
>   Currently there are a lot of NN calls made to determine if there is a 
> schema file for a path in a LOAD statement. When there is a slow NN (caused by 
> a whole bunch of other issues), it takes a lot of time for this and we found 
> the scripts spending anywhere from 5 mins to 40 mins depending upon the 
> script. It seems to be a good place for optimization. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
