Looks like it's on automatically.

Code below is from trunk, but I don't think this changed recently. I got rid
of exception handling for conciseness.

In PigServer:

    public void registerScript(String fileName) throws IOException {
        GruntParser grunt = new GruntParser(new FileReader(new File(fileName)));
        grunt.setInteractive(false);   // <-- the script is parsed in non-interactive mode
        grunt.setParams(this);
        grunt.parseStopOnError(true);
    }


In GruntParser:

    public int[] parseStopOnError(boolean sameBatch) throws IOException, ParseException {
        // <-- batch mode is switched on here for non-interactive runs
        if (!mInteractive && !sameBatch) {
            setBatchOn();
        }
        prompt();
        mDone = false;
        while (!mDone) {
            parse();
        }
        if (!sameBatch) {
            executeBatch();
        }
        int[] res = { mNumSucceededJobs, mNumFailedJobs };
        return res;
    }
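
To make the branching easier to follow, here is a toy model of that method.
It is only a sketch, not Pig code: the class name BatchFlowSketch and the
call log are made up for illustration, and the only parts taken from the
snippet above are the two if conditions.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the control flow quoted above; this is NOT Pig code. */
class BatchFlowSketch {
    // Stands in for GruntParser's mInteractive field.
    private final boolean interactive;
    // Hypothetical call log: records which steps would have run.
    private final List<String> calls = new ArrayList<>();

    BatchFlowSketch(boolean interactive) {
        this.interactive = interactive;
    }

    // Mirrors the branch structure of parseStopOnError(boolean sameBatch).
    List<String> parseStopOnError(boolean sameBatch) {
        if (!interactive && !sameBatch) {
            calls.add("setBatchOn");
        }
        calls.add("parse");            // the real method loops until mDone
        if (!sameBatch) {
            calls.add("executeBatch");
        }
        return calls;
    }

    public static void main(String[] args) {
        // Non-interactive, own batch: the batch is opened and executed here.
        System.out.println(new BatchFlowSketch(false).parseStopOnError(false));
        // Non-interactive, sameBatch == true (the argument registerScript passes).
        System.out.println(new BatchFlowSketch(false).parseStopOnError(true));
        // Interactive, own batch: no setBatchOn, but executeBatch still runs.
        System.out.println(new BatchFlowSketch(true).parseStopOnError(false));
    }
}
```

In this model the sameBatch == true case logs only "parse", so with that
argument the method leaves opening and executing the batch to whoever called
it. It is worth checking that against the Pig version you actually run.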


On Thu, Mar 4, 2010 at 10:00 AM, Rohan Rai <rohan....@inmobi.com> wrote:

> Thanks Dmitriy
>
> Just one more question
>
> registerScript lets you register a Pig script in embedded mode.
> So the confusion was: does it try to optimize the script internally,
> or does setBatchOn have to be called explicitly?
>
> Regards
> Rohan
>
>
> Dmitriy Ryaboy wrote:
>
>> 1) Automatically, if you call it right.  Look for the setBatchOn and
>> executeBatch methods (I may be slightly off on the method names, going off
>> memory)
>>
>> 2) The optimizer moves stuff around and may execute things in a slightly
>> different order than what you tell it. This can mean pushing up
>> projections, filters, and limits, inserting casts, and doing all kinds of
>> other manipulations. The logical plan shows you what's going to happen
>> without breaking it down into the MR plan. There are further optimizations
>> at the MR level, so both are worth checking. In practice I usually look at
>> the logical plan for order-of-operations and general sanity checking, and
>> at the MR plan for the number of jobs and whether things like the algebraic
>> and accumulative interfaces are kicking in.
>>
>> 3) Yes. Roughly speaking, one map per block will be generated. The bigger
>> the block, the more work per mapper. The smaller the block, the more
>> mappers. Depending on the workload, there's an optimal value.
>>
>> 4) Playing with the logical plan -- don't :-). It's exposed so that you
>> can look at what's going on; it's not intended to let you change execution
>> plans, unless you actually want to hack Pig guts. If that's the case, look
>> at the optimizer and the MRCompiler classes to see how it's getting
>> modified and used.
>>
>> -D
>>
>> On Thu, Mar 4, 2010 at 9:14 AM, Rohan Rai <rohan....@inmobi.com> wrote:
>>
>>
>>> When using the embedded PigServer and registering a Pig script for
>>> execution:
>>>
>>> 1) Does multi-query optimization happen automatically, or does it have
>>> to be enabled explicitly?
>>>
>>> 2) Logical plan: what can one infer from it?
>>>
>>> 3) Does the block size (defined in Hadoop) have an effect on performance
>>> or on the number of map jobs that get launched?
>>>
>>> Regards
>>> Rohan
>>>
>>> The information contained in this communication is intended solely for
>>> the use of the individual or entity to whom it is addressed and others
>>> authorized to receive it. It may contain confidential or legally
>>> privileged information. If you are not the intended recipient you are
>>> hereby notified that any disclosure, copying, distribution or taking any
>>> action in reliance on the contents of this information is strictly
>>> prohibited and may be unlawful. If you have received this communication
>>> in error, please notify us immediately by responding to this email and
>>> then delete it from your system. The firm is neither liable for the
>>> proper and complete transmission of the information contained in this
>>> communication nor for any delay in its receipt.
