[ 
https://issues.apache.org/jira/browse/PARQUET-365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

li xiang updated PARQUET-365:
-----------------------------
    Description: 
In the Pig code, 
https://github.com/apache/pig/blob/trunk/src/org/apache/pig/EvalFunc.java, a 
private member "inputSchemaInternal" represents the schema. A setter and a 
getter are also provided:
{code}
    private Schema inputSchemaInternal=null;

    /**
     * This method is for internal use. It is called by Pig core in both front-end
     * and back-end to setup the right input schema for EvalFunc
     */
    public void setInputSchema(Schema input){
        this.inputSchemaInternal=input;
    }

    /**
     * This method is intended to be called by the user in {@link EvalFunc} to get the input
     * schema of the EvalFunc
     */
    public Schema getInputSchema(){
        return this.inputSchemaInternal;
    }
{code}

In parquet-mr/parquet-pig/src/main/java/parquet/pig/summary/Summary.java, class 
Summary extends EvalFunc. It uses a new member called inputSchema (vs. 
inputSchemaInternal used in class EvalFunc in Pig) to represent the schema and 
overrides setInputSchema(), but the class does not override getInputSchema() to 
return inputSchema.

{code}
public class Summary extends EvalFunc<String> implements Algebraic {

  private Schema inputSchema;

  @Override
  public void setInputSchema(Schema input) {
    try {
      // relation.bag.tuple
      this.inputSchema=input.getField(0).schema.getField(0).schema;
      saveSchemaToUDFContext();
    } catch (FrontendException e) {
      throw new RuntimeException("Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from " + input, e);
    } catch (RuntimeException e) {
      throw new RuntimeException("Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from "+input, e);
    }
  }
{code}

If setInputSchema() of class Summary is called, inputSchema is set, but the 
override never calls super.setInputSchema(), so inputSchemaInternal in EvalFunc 
is left untouched. If we call getInputSchema() afterwards, it returns the value 
of inputSchemaInternal, which may still be null.
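
The behavior described above can be reproduced without Pig on the classpath. The following is only a minimal sketch: Schema and EvalFunc here are simplified stand-ins written for illustration (not the real Pig classes), the class names SummaryWithoutGetter, SummaryWithGetter and InputSchemaGetterSketch are hypothetical, and the setter stores the schema directly instead of extracting the nested tuple schema as Summary does. It shows the inherited getter returning null, and one possible fix, which is to override getInputSchema().

```java
// Simplified stand-in for Pig's Schema class (assumption, for illustration only).
final class Schema {
    private final String description;

    Schema(String description) {
        this.description = description;
    }

    @Override
    public String toString() {
        return description;
    }
}

// Stand-in that mirrors only the EvalFunc members quoted in this issue.
abstract class EvalFunc<T> {
    private Schema inputSchemaInternal = null;

    public void setInputSchema(Schema input) {
        this.inputSchemaInternal = input;
    }

    public Schema getInputSchema() {
        return this.inputSchemaInternal;
    }
}

// Mirrors the reported pattern: the setter is overridden and writes to a
// separate field, while the getter is inherited and reads the parent's field.
class SummaryWithoutGetter extends EvalFunc<String> {
    private Schema inputSchema;

    @Override
    public void setInputSchema(Schema input) {
        this.inputSchema = input; // parent's inputSchemaInternal is never set
    }
}

// One possible fix (a suggestion, not a committed patch): also override the getter.
class SummaryWithGetter extends EvalFunc<String> {
    private Schema inputSchema;

    @Override
    public void setInputSchema(Schema input) {
        this.inputSchema = input;
    }

    @Override
    public Schema getInputSchema() {
        return this.inputSchema;
    }
}

class InputSchemaGetterSketch {
    public static void main(String[] args) {
        Schema schema = new Schema("a:int");

        EvalFunc<String> withoutGetter = new SummaryWithoutGetter();
        withoutGetter.setInputSchema(schema);
        System.out.println("without override: " + withoutGetter.getInputSchema()); // prints "without override: null"

        EvalFunc<String> withGetter = new SummaryWithGetter();
        withGetter.setInputSchema(schema);
        System.out.println("with override: " + withGetter.getInputSchema()); // prints "with override: a:int"
    }
}
```

Overriding the getter keeps the getter and setter consistent with each other. Whether Summary should instead (or additionally) call super.setInputSchema(input) depends on which schema callers expect back, the full input schema or the nested tuple schema, and that is left open here.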

> Class Summary does not provide a getter to return inputSchema
> -------------------------------------------------------------
>
>                 Key: PARQUET-365
>                 URL: https://issues.apache.org/jira/browse/PARQUET-365
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.6.0, 1.7.0, 1.8.0
>            Reporter: li xiang
>            Priority: Critical
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
