pig trunk build

2009-08-26 Thread Giridharan Kesavan
The Pig trunk build has moved from minerva.apache.org to the
h7.grid.sp2.yahoo.net machine.

tnx,
Giri


[jira] Created: (PIG-933) broken link in pig-latin reference manual to hadoop file glob pattern documentation

2009-08-26 Thread Thejas M Nair (JIRA)
broken link in pig-latin reference manual to hadoop file glob pattern 
documentation
---

 Key: PIG-933
 URL: https://issues.apache.org/jira/browse/PIG-933
 Project: Pig
  Issue Type: Bug
  Components: documentation
Reporter: Thejas M Nair
Priority: Minor


http://wiki.apache.org/pig-data/attachments/FrontPage/attachments/plrm.htm#_LOAD
has a link to
http://lucene.apache.org/hadoop/api/org/apache/hadoop/fs/FileSystem.html#globPaths(org.apache.hadoop.fs.Path)
It should be
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/FileSystem.html#globStatus(org.apache.hadoop.fs.Path)




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



RE: pig trunk build

2009-08-26 Thread Pradeep Kamath
What is the URL for the Hudson UI? I tried
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net
but that did not work.

Pradeep

-Original Message-
From: Giridharan Kesavan [mailto:gkesa...@yahoo-inc.com] 
Sent: Wednesday, August 26, 2009 7:41 AM
To: pig-dev@hadoop.apache.org
Subject: pig trunk build

Pig trunk build is now moved from minerva.apache.org to
h7.grid.sp2.yahoo.net machine.

tnx,
Giri


RE: pig trunk build

2009-08-26 Thread Giridharan Kesavan
It's only the trunk build that I moved to h7, NOT the patch builds.
This does not change anything from the dev point of view.
The trunk build URL remains the same:
http://hudson.zones.apache.org/hudson/view/Pig/job/Pig-trunk/

Patch builds still run on minerva and have the same URL as before.
I will keep you posted as I move the patch builds.





[jira] Updated: (PIG-922) Logical optimizer: push up project

2009-08-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Attachment: PIG-922-p1_4.patch

Address findbug warnings.

 Logical optimizer: push up project
 --

 Key: PIG-922
 URL: https://issues.apache.org/jira/browse/PIG-922
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
 PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch


 This is a continuation of
 [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add
 another rule to the logical optimizer: push up project, i.e., prune columns as
 early as possible.
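
As a rough illustration of what "prune columns as early as possible" means, here is a minimal standalone sketch. It is not taken from the PIG-922 patches: the class and method names are hypothetical, and the "plan" is reduced to a list of the column indexes each downstream operator reads. The idea is to take the union of those indexes and drop every other column right at the load.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical sketch; none of these names exist in Pig.
final class PushUpProjectSketch {

    /** Union of the column indexes referenced by every downstream operator. */
    static SortedSet<Integer> requiredColumns(List<int[]> columnsUsedPerOperator) {
        SortedSet<Integer> needed = new TreeSet<>();
        for (int[] used : columnsUsedPerOperator) {
            for (int column : used) {
                needed.add(column);
            }
        }
        return needed;
    }

    /** Keeps only the needed columns of one row, in ascending index order. */
    static List<String> prune(List<String> row, SortedSet<Integer> needed) {
        List<String> pruned = new ArrayList<>();
        for (int column : needed) {
            pruned.add(row.get(column));
        }
        return pruned;
    }

    public static void main(String[] args) {
        // Assumed plan: a filter reads column 2, the final projection reads column 0.
        SortedSet<Integer> needed =
                requiredColumns(Arrays.asList(new int[] {2}, new int[] {0}));
        List<String> row = Arrays.asList("alice", "unused-a", "30", "unused-b");
        System.out.println(needed + " -> " + prune(row, needed));
    }
}
```

A real optimizer rule would also have to renumber the column references in the downstream operators after pruning; the sketch leaves that out.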




[jira] Updated: (PIG-922) Logical optimizer: push up project

2009-08-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Status: Open  (was: Patch Available)

 Logical optimizer: push up project
 --

 Key: PIG-922
 URL: https://issues.apache.org/jira/browse/PIG-922
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.4.0

 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
 PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch





[jira] Updated: (PIG-922) Logical optimizer: push up project

2009-08-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Fix Version/s: (was: 0.4.0)
   Status: Patch Available  (was: Open)




[jira] Created: (PIG-934) Merge join implementation currently does not seek to right point on the right side input based on the offset provided by the index

2009-08-26 Thread Pradeep Kamath (JIRA)
Merge join implementation currently does not seek to right point on the right 
side input based on the offset provided by the index
--

 Key: PIG-934
 URL: https://issues.apache.org/jira/browse/PIG-934
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.3.1
Reporter: Pradeep Kamath


We use POLoad to seek into the right-side file; it has the following code:
{noformat}
    public void setUp() throws IOException {
        String filename = lFile.getFileName();
        loader = (LoadFunc) PigContext.instantiateFuncFromSpec(lFile.getFuncSpec());
        is = FileLocalizer.open(filename, pc);
        loader.bindTo(filename, new BufferedPositionedInputStream(is),
                this.offset, Long.MAX_VALUE);
    }
{noformat}

Between opening the stream and calling bindTo, we do not seek to the correct
offset. bindTo itself does not perform any seek.
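
To illustrate the effect outside of Pig, here is a minimal standalone sketch using only java.io. Everything in it is hypothetical: `bindAndReadRecord` merely stands in for a loader that is told an offset but reads from wherever the stream currently is, and `seek` is a plain skip loop, not a Pig or Hadoop API.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; none of these names exist in Pig.
final class SeekBeforeBindSketch {

    // Three 6-byte records; the third one starts at byte offset 12.
    static final byte[] DATA =
            "k1,v1\nk2,v2\nk3,v3\n".getBytes(StandardCharsets.US_ASCII);

    /** Stand-in for a loader: it is given 'offset' but never moves the stream itself. */
    static String bindAndReadRecord(InputStream in, long offset) throws IOException {
        StringBuilder record = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            record.append((char) b);
        }
        return record.toString();
    }

    /** Advances the stream by exactly 'offset' bytes, or fails at end of stream. */
    static void seek(InputStream in, long offset) throws IOException {
        long remaining = offset;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped <= 0) {
                // skip() may legitimately return 0, so probe with a single read.
                if (in.read() == -1) {
                    throw new EOFException("offset is past end of stream");
                }
                skipped = 1;
            }
            remaining -= skipped;
        }
    }

    /** Opens the data, optionally seeks, then hands the stream to the loader stand-in. */
    static String readRecord(long offset, boolean seekFirst) {
        try (InputStream in = new ByteArrayInputStream(DATA)) {
            if (seekFirst) {
                seek(in, offset);
            }
            return bindAndReadRecord(in, offset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readRecord(12, false)); // no seek: reads from the start
        System.out.println(readRecord(12, true));  // seek first: reads at offset 12
    }
}
```

Without the explicit seek, the record at the start of the data is returned regardless of the offset passed in, which is the behaviour described above.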





[jira] Commented: (PIG-934) Merge join implementation currently does not seek to right point on the right side input based on the offset provided by the index

2009-08-26 Thread Pradeep Kamath (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748070#action_12748070
 ] 

Pradeep Kamath commented on PIG-934:


To get an idea of how this seeking works for regular loads in map tasks, I
looked at PigSlice.java. There, the seek happens in the init() code before
bindTo():
{code}
public void init(DataStorage base) throws IOException {
    ..
    fsis = base.asElement(base.getActiveContainer(), file).sopen();
    fsis.seek(start, FLAGS.SEEK_CUR);

    end = start + getLength();

    if (file.endsWith(".bz") || file.endsWith(".bz2")) {
        is = new CBZip2InputStream(fsis, 9);
    } else if (file.endsWith(".gz")) {
        is = new GZIPInputStream(fsis);
        // We can't tell how much of the underlying stream GZIPInputStream
        // has actually consumed
        end = Long.MAX_VALUE;
    } else {
        is = fsis;
    }
    loader.bindTo(file.toString(), new BufferedPositionedInputStream(is,
            start), start, end);
}
{code}

I think we need a FileLocalizer.sOpenSingleFile() method that returns a
SeekableInputStream, which we can then use in setUp() in POLoad.
Something along the lines of:
{code}
static public InputStream open(String fileSpec, PigContext pigContext) throws IOException {
    fileSpec = checkDefaultPrefix(pigContext.getExecType(), fileSpec);
    if (!fileSpec.startsWith(LOCAL_PREFIX)) {
        init(pigContext);
        ElementDescriptor elem =
                pigContext.getDfs().asElement(fullPath(fileSpec, pigContext));
        return elem.sopen();
    }
    else {
        fileSpec = fileSpec.substring(LOCAL_PREFIX.length());
        // buffering because we only want buffered streams to be passed to load functions.
        /*return new BufferedInputStream(new FileInputStream(fileSpec));*/
        init(pigContext);
        ElementDescriptor elem =
                pigContext.getLfs().asElement(fullPath(fileSpec, pigContext));
        return elem.sopen();
    }
}

{code}
The above code would only work with single files and not directories, which
should be OK for merge join. We should probably set this up with a new
constructor in POLoad that also indicates that a single file is being processed.






[jira] Updated: (PIG-935) Skewed join throws an exception when used with map keys

2009-08-26 Thread Sriranjan Manjunath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriranjan Manjunath updated PIG-935:


Status: Patch Available  (was: Open)

The attached patch solves this issue.

 Skewed join throws an exception when used with map keys
 ---

 Key: PIG-935
 URL: https://issues.apache.org/jira/browse/PIG-935
 Project: Pig
  Issue Type: Bug
Reporter: Sriranjan Manjunath
 Attachments: skjoinmapbug.patch


 Skewed join throws a runtime exception for the following query:
 A = load 'map.txt' as (e);
 B = load 'map.txt' as (f);
 C = join A by (chararray)e#'a', B by (chararray)f#'a' using skewed;
 explain C;
 Exception:
 Caused by: java.lang.ClassCastException:
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast
 cannot be cast to
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.getSortCols(MRCompiler.java:1492)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.getSamplingJob(MRCompiler.java:1894)
 ... 27 more
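
The failure mode in the trace can be reproduced in a few lines of plain Java. The sketch below is only an illustration and is NOT the attached patch: `Expr`, `Project` and `Cast` are hypothetical stand-ins for the physical expression operators, the map lookup from the query is left out, and unwrapping casts with instanceof is just one possible way to avoid the unchecked cast.

```java
// Hypothetical sketch; these types only mimic the shape of the problem.
final class CastUnwrapSketch {

    interface Expr {}

    /** Stand-in for a plain column projection. */
    static final class Project implements Expr {
        final int column;
        Project(int column) { this.column = column; }
    }

    /** Stand-in for a cast wrapped around another expression. */
    static final class Cast implements Expr {
        final Expr input;
        final String targetType;
        Cast(Expr input, String targetType) {
            this.input = input;
            this.targetType = targetType;
        }
    }

    /** Assumes the join key is always a projection; fails for a Cast. */
    static int sortColumnUnchecked(Expr keyRoot) {
        return ((Project) keyRoot).column; // ClassCastException when keyRoot is a Cast
    }

    /** Looks through any casts before reading the projected column. */
    static int sortColumn(Expr keyRoot) {
        Expr e = keyRoot;
        while (e instanceof Cast) {
            e = ((Cast) e).input;
        }
        if (e instanceof Project) {
            return ((Project) e).column;
        }
        throw new IllegalArgumentException(
                "join key is not a plain projection: " + e.getClass().getSimpleName());
    }

    public static void main(String[] args) {
        Expr key = new Cast(new Project(0), "chararray");
        System.out.println(sortColumn(key));
        try {
            sortColumnUnchecked(key);
        } catch (ClassCastException ex) {
            System.out.println("unchecked cast failed");
        }
    }
}
```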




Hudson build is back to normal: Pig-Patch-minerva.apache.org #178

2009-08-26 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/178/




[jira] Commented: (PIG-922) Logical optimizer: push up project

2009-08-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748131#action_12748131
 ] 

Hadoop QA commented on PIG-922:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12417760/PIG-922-p1_4.patch
  against trunk revision 806668.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/178/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/178/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/178/console

This message is automatically generated.




Build failed in Hudson: Pig-Patch-minerva.apache.org #179

2009-08-26 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/179/

--
[...truncated 112853 lines...]
 [exec] [junit] 09/08/26 23:20:26 INFO dfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:36021 is added to 
blk_2072952556046770198_1011 size 6
 [exec] [junit] 09/08/26 23:20:26 INFO dfs.DataNode: PacketResponder 2 
for block blk_2072952556046770198_1011 terminating
 [exec] [junit] 09/08/26 23:20:26 INFO 
executionengine.HExecutionEngine: Connecting to hadoop file system at: 
hdfs://localhost:33549
 [exec] [junit] 09/08/26 23:20:26 INFO 
executionengine.HExecutionEngine: Connecting to map-reduce job tracker at: 
localhost:59257
 [exec] [junit] 09/08/26 23:20:26 INFO 
mapReduceLayer.MultiQueryOptimizer: MR plan size before optimization: 1
 [exec] [junit] 09/08/26 23:20:26 INFO 
mapReduceLayer.MultiQueryOptimizer: MR plan size after optimization: 1
 [exec] [junit] 09/08/26 23:20:26 INFO dfs.StateChange: BLOCK* ask 
127.0.0.1:36021 to delete  blk_1407819332858915959_1006 
blk_-3865270409021269559_1005 blk_-5210738776223878005_1004
 [exec] [junit] 09/08/26 23:20:26 INFO dfs.StateChange: BLOCK* ask 
127.0.0.1:43725 to delete  blk_1407819332858915959_1006 
blk_-5210738776223878005_1004
 [exec] [junit] 09/08/26 23:20:26 WARN dfs.DataNode: Unexpected error 
trying to delete block blk_-5210738776223878005_1004. BlockInfo not found in 
volumeMap.
 [exec] [junit] 09/08/26 23:20:26 INFO dfs.DataNode: Deleting block 
blk_1407819332858915959_1006 file dfs/data/data7/current/blk_1407819332858915959
 [exec] [junit] 09/08/26 23:20:26 WARN dfs.DataNode: 
java.io.IOException: Error in deleting blocks.
 [exec] [junit] at 
org.apache.hadoop.dfs.FSDataset.invalidate(FSDataset.java:1146)
 [exec] [junit] at 
org.apache.hadoop.dfs.DataNode.processCommand(DataNode.java:793)
 [exec] [junit] at 
org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:663)
 [exec] [junit] at 
org.apache.hadoop.dfs.DataNode.run(DataNode.java:2888)
 [exec] [junit] at java.lang.Thread.run(Thread.java:619)
 [exec] [junit] 
 [exec] [junit] 09/08/26 23:20:27 INFO 
mapReduceLayer.JobControlCompiler: Setting up single store job
 [exec] [junit] 09/08/26 23:20:27 WARN mapred.JobClient: Use 
GenericOptionsParser for parsing the arguments. Applications should implement 
Tool for the same.
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.StateChange: BLOCK* 
NameSystem.allocateBlock: 
/tmp/hadoop-hudson/mapred/system/job_200908262319_0002/job.jar. 
blk_347023940785646545_1012
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: Receiving block 
blk_347023940785646545_1012 src: /127.0.0.1:55940 dest: /127.0.0.1:43941
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: Receiving block 
blk_347023940785646545_1012 src: /127.0.0.1:49064 dest: /127.0.0.1:36021
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: Receiving block 
blk_347023940785646545_1012 src: /127.0.0.1:42096 dest: /127.0.0.1:43725
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: Received block 
blk_347023940785646545_1012 of size 1498963 from /127.0.0.1
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: PacketResponder 0 
for block blk_347023940785646545_1012 terminating
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:43725 is added to 
blk_347023940785646545_1012 size 1498963
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: Received block 
blk_347023940785646545_1012 of size 1498963 from /127.0.0.1
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: PacketResponder 1 
for block blk_347023940785646545_1012 terminating
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:36021 is added to 
blk_347023940785646545_1012 size 1498963
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: Received block 
blk_347023940785646545_1012 of size 1498963 from /127.0.0.1
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:43941 is added to 
blk_347023940785646545_1012 size 1498963
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.DataNode: PacketResponder 2 
for block blk_347023940785646545_1012 terminating
 [exec] [junit] 09/08/26 23:20:27 INFO fs.FSNamesystem: Increasing 
replication for file 
/tmp/hadoop-hudson/mapred/system/job_200908262319_0002/job.jar. New replication 
is 2
 [exec] [junit] 09/08/26 23:20:27 INFO fs.FSNamesystem: Reducing 
replication for file 
/tmp/hadoop-hudson/mapred/system/job_200908262319_0002/job.jar. New replication 
is 2
 [exec] [junit] 09/08/26 23:20:27 INFO dfs.StateChange: BLOCK* 

[jira] Commented: (PIG-935) Skewed join throws an exception when used with map keys

2009-08-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748201#action_12748201
 ] 

Hadoop QA commented on PIG-935:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12417766/skjoinmapbug.patch
  against trunk revision 806668.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/179/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/179/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/179/console

This message is automatically generated.
