[jira] Commented: (PIG-1602) The .classpath of eclipse template still use hbase-0.20.0

2010-09-06 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906451#action_12906451
 ] 

Dmitriy V. Ryaboy commented on PIG-1602:


+1

 The .classpath of eclipse template still use hbase-0.20.0
 -

 Key: PIG-1602
 URL: https://issues.apache.org/jira/browse/PIG-1602
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Jeff Zhang
Assignee: Jeff Zhang
Priority: Minor
 Fix For: 0.8.0

 Attachments: PIG_1602.patch


 The .classpath of the Eclipse template still uses hbase-0.20.0; it should be 
 updated to hbase-0.20.6.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1602) The .classpath of eclipse template still use hbase-0.20.0

2010-09-06 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906467#action_12906467
 ] 

Jeff Zhang commented on PIG-1602:
-

Patch committed to both trunk and branch-0.8






[jira] Resolved: (PIG-1602) The .classpath of eclipse template still use hbase-0.20.0

2010-09-06 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang resolved PIG-1602.
-

Resolution: Fixed




[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with

2010-09-06 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906592#action_12906592
 ] 

Daniel Dai commented on PIG-1178:
-

Patch PIG-1178-10.patch committed.

 LogicalPlan and Optimizer are too complex and hard to work with
 ---

 Key: PIG-1178
 URL: https://issues.apache.org/jira/browse/PIG-1178
 Project: Pig
  Issue Type: Improvement
Reporter: Alan Gates
Assignee: Daniel Dai
 Fix For: 0.8.0

 Attachments: expressions-2.patch, expressions.patch, lp.patch, 
 lp.patch, PIG-1178-10.patch, PIG-1178-4.patch, PIG-1178-5.patch, 
 PIG-1178-6.patch, PIG-1178-7.patch, PIG-1178-8.patch, PIG-1178-9.patch, 
 pig_1178.patch, pig_1178.patch, PIG_1178.patch, pig_1178_2.patch, 
 pig_1178_3.2.patch, pig_1178_3.3.patch, pig_1178_3.4.patch, pig_1178_3.patch


 The current implementation of the logical plan and the logical optimizer in 
 Pig has proven to not be easily extensible. Developer feedback has indicated 
 that adding new rules to the optimizer is quite burdensome. In addition, the 
 logical plan has been an area of numerous bugs, many of which have been 
 difficult to fix. Developers also feel that the logical plan is difficult to 
 understand and maintain. The root cause for these issues is that a number of 
 design decisions that were made as part of the 0.2 rewrite of the front end 
 have now proven to be sub-optimal. The heart of this proposal is to revisit a 
 number of those proposals and rebuild the logical plan with a simpler design 
 that will make it much easier to maintain the logical plan as well as extend 
 the logical optimizer. 
 See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full 
 details.




[jira] Resolved: (PIG-1594) NullPointerException in new logical planner

2010-09-06 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-1594.
-

Resolution: Fixed

This issue is fixed by PIG-1178-10.patch.

 NullPointerException in new logical planner
 ---

 Key: PIG-1594
 URL: https://issues.apache.org/jira/browse/PIG-1594
 Project: Pig
  Issue Type: Bug
Reporter: Andrew Hitchcock
Assignee: Daniel Dai
 Fix For: 0.8.0


 I've been testing the trunk version of Pig on Elastic MapReduce against our 
 log processing sample application(1). When I try to run the query it throws a 
 NullPointerException and suggests I disable the new logical plan. Disabling 
 it works and the script succeeds. Here is the query I'm trying to run:
 {code}
 register file:/home/hadoop/lib/pig/piggybank.jar
   DEFINE EXTRACT org.apache.pig.piggybank.evaluation.string.EXTRACT();
   RAW_LOGS = LOAD '$INPUT' USING TextLoader as (line:chararray);
   LOGS_BASE = foreach RAW_LOGS generate FLATTEN(EXTRACT(line, '^(\\S+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] "(.+?)" (\\S+) (\\S+) "([^"]*)" "([^"]*)"')) as (remoteAddr:chararray, remoteLogname:chararray, user:chararray, time:chararray, request:chararray, status:int, bytes_string:chararray, referrer:chararray, browser:chararray);
   REFERRER_ONLY = FOREACH LOGS_BASE GENERATE referrer;
   FILTERED = FILTER REFERRER_ONLY BY referrer matches '.*bing.*' OR referrer matches '.*google.*';
   SEARCH_TERMS = FOREACH FILTERED GENERATE FLATTEN(EXTRACT(referrer, '.*[&\\?]q=([^&]+).*')) as terms:chararray;
   SEARCH_TERMS_FILTERED = FILTER SEARCH_TERMS BY NOT $0 IS NULL;
   SEARCH_TERMS_COUNT = FOREACH (GROUP SEARCH_TERMS_FILTERED BY $0) GENERATE $0, COUNT($1) as num;
   SEARCH_TERMS_COUNT_SORTED = LIMIT(ORDER SEARCH_TERMS_COUNT BY num DESC) 50;
   STORE SEARCH_TERMS_COUNT_SORTED into '$OUTPUT';
 {code}
 And here is the stack trace that results:
 {code}
 ERROR 2042: Error in new logical plan. Try -Dpig.usenewlogicalplan=false.
 org.apache.pig.backend.executionengine.ExecException: ERROR 2042: Error in new logical plan. Try -Dpig.usenewlogicalplan=false.
 at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:285)
 at org.apache.pig.PigServer.compilePp(PigServer.java:1301)
 at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1154)
 at org.apache.pig.PigServer.execute(PigServer.java:1148)
 at org.apache.pig.PigServer.access$100(PigServer.java:123)
 at org.apache.pig.PigServer$Graph.execute(PigServer.java:1464)
 at org.apache.pig.PigServer.executeBatchEx(PigServer.java:350)
 at org.apache.pig.PigServer.executeBatch(PigServer.java:324)
 at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:111)
 at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
 at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
 at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
 at org.apache.pig.Main.run(Main.java:491)
 at org.apache.pig.Main.main(Main.java:107)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Caused by: java.lang.NullPointerException
 at org.apache.pig.EvalFunc.getSchemaName(EvalFunc.java:76)
 at org.apache.pig.piggybank.impl.ErrorCatchingBase.outputSchema(ErrorCatchingBase.java:76)
 at org.apache.pig.newplan.logical.expression.UserFuncExpression.getFieldSchema(UserFuncExpression.java:111)
 at org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:175)
 at org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:143)
 at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:55)
 at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:69)
 at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
 at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:87)
 at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:149)
 at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:74)
 at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:76)
 at 
 

[jira] Commented: (PIG-794) Use Avro serialization in Pig

2010-09-06 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906671#action_12906671
 ] 

Jeff Zhang commented on PIG-794:


Dmitriy,

In my patch I represent InternalMap as an Avro array whose elements are 
records holding two datums (one is the key and the other is the value).
However, it throws a strange exception, and I can't tell what is wrong with my code:


{code}
Exception in thread "main" java.lang.NullPointerException
at org.apache.avro.io.parsing.Parser.advance(Parser.java:86)
at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:121)
at org.apache.pig.impl.io.avro.PigDataRecordReader.readRecord(PigDataRecordReader.java:77)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:106)
at org.apache.pig.impl.io.avro.PigDataRecordReader.readRecord(PigDataRecordReader.java:66)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:106)
at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:184)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:108)
at org.apache.pig.impl.io.avro.PigDataRecordReader.readRecord(PigDataRecordReader.java:81)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:106)
at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:184)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:108)
at org.apache.pig.impl.io.avro.PigDataRecordReader.readRecord(PigDataRecordReader.java:83)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:106)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:97)
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:198)
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:185)
at org.apache.pig.impl.io.avro.PigData.main(PigData.java:224)

{code}
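
For readers following the thread, the encoding described in this comment (a map stored as an array of key/value records) can be sketched roughly as below. This is only an illustration, not the code from the patch: the record name MapEntry, the field names key and value, the restriction to nullable string values, and the helper functions are all assumptions made for the example, and the schema is written as a plain dictionary rather than built with the Avro library.

```python
import json

# Hypothetical Avro schema: a map encoded as an array of key/value records.
# The record and field names are illustrative, not taken from the PIG-794 patch,
# and values are simplified to nullable strings.
MAP_AS_ARRAY_SCHEMA = {
    "type": "array",
    "items": {
        "type": "record",
        "name": "MapEntry",
        "fields": [
            {"name": "key", "type": "string"},
            {"name": "value", "type": ["null", "string"]},
        ],
    },
}


def map_to_entries(mapping):
    """Flatten a dict into a list of {"key", "value"} records."""
    return [{"key": k, "value": v} for k, v in mapping.items()]


def entries_to_map(entries):
    """Rebuild the dict from a list of {"key", "value"} records."""
    return {e["key"]: e["value"] for e in entries}


if __name__ == "__main__":
    original = {"a": "1", "b": None}
    entries = map_to_entries(original)
    # The round trip should give back the original mapping.
    assert entries_to_map(entries) == original
    print(json.dumps(MAP_AS_ARRAY_SCHEMA, indent=2))
```

With this layout, a reader and a writer must agree on the field order of the entry record, which may be worth checking given that the trace above fails while resolving field order.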

 Use Avro serialization in Pig
 -

 Key: PIG-794
 URL: https://issues.apache.org/jira/browse/PIG-794
 Project: Pig
  Issue Type: Improvement
  Components: impl
Affects Versions: 0.2.0
Reporter: Rakesh Setty
Assignee: Dmitriy V. Ryaboy
 Attachments: avro-0.1-dev-java_r765402.jar, AvroStorage.patch, 
 AvroStorage_2.patch, AvroStorage_3.patch, AvroStorage_4.patch, AvroTest.java, 
 jackson-asl-0.9.4.jar, PIG-794.patch


 We would like to use Avro serialization in Pig to pass data between MR jobs 
 instead of the current BinStorage. Attached is an implementation of 
 AvroBinStorage which performs significantly better compared to BinStorage on 
 our benchmarks.
