Re: [VOTE] Release Pig 0.14.0 (candidate 0)

2014-11-14 Thread Thejas Nair
+1

Verified keys
Checked LICENSE, README, RELEASE_NOTES, CHANGES files, rat report.
Built the source
Tried running queries both using local mode and cluster

Two minor issues that don’t need to block this RC:
1. I think we should update README to indicate the choice of execution engine.
2. pig --help does not show “tez” as a valid option for the “-x” argument.

I will create a jira to track these issues.

On Wed, Nov 12, 2014 at 8:46 PM, Daniel Dai da...@hortonworks.com wrote:
 Hi,

 I have created a candidate build for Pig 0.14.0.

 Keys used to sign the release are available at
 http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup.

 Please download, test, and try it out:
 http://people.apache.org/~daijy/pig-0.14.0-candidate-0/

 Release notes and the rat report are available at the same location.

 Should we release this? Vote closes on next Monday EOD, Nov 17th 2014.

 Thanks,
 Daniel

 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender immediately
 and delete it from your system. Thank You.



Re: [VOTE] Drop support for Hadoop 0.20 from Pig 0.14

2014-09-22 Thread Thejas Nair
+1

On Thu, Sep 18, 2014 at 5:50 PM, Mona Chitnis mona.chit...@yahoo.in wrote:

 +1 (non-binding)
  Mona Chitnis
 Yahoo!

  On Thursday, September 18, 2014 8:48 AM, Ashutosh Chauhan 
 hashut...@apache.org wrote:


  +1

 On Wed, Sep 17, 2014 at 7:02 PM, Daniel Dai da...@hortonworks.com wrote:

 +1

 On Wed, Sep 17, 2014 at 11:12 AM, Prashant Kommireddi
 prash1...@gmail.com wrote:
  +1
 
  On Wed, Sep 17, 2014 at 8:44 AM, Cheolsoo Park piaozhe...@gmail.com
 wrote:
 
  +1
 
  On Wed, Sep 17, 2014 at 7:09 AM, Xuefu Zhang xzh...@cloudera.com
 wrote:
 
   +1
  
   On Wed, Sep 17, 2014 at 7:04 AM, Julien Le Dem jul...@ledem.net
 wrote:
  
+1
   
Julien
   
 -Original Message-
 From: Rohini Palaniswamy [mailto:rohini.adi...@gmail.com]
 Sent: Wednesday, September 17, 2014 12:38 PM
 To: dev@pig.apache.org
 Subject: [VOTE] Drop support for Hadoop 0.20 from Pig 0.14

 Hi,
 Hadoop has matured far beyond Hadoop 0.20 and has had two major releases
 since then, and there has been no development on branch-0.20
 (http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20/) for 3
 years now. It is high time we drop support for Hadoop 0.20 and only support
 the Hadoop 1.x and 2.x lines going forward. This will reduce the maintenance
 effort and also enable us to write more efficient code and cut down on
 reflection.

 Vote closes on Tuesday, Sep 23 2014.

 Thanks,
 Rohini
   
  
 



Re: Review Request 24789: New logical optimizer rule: ConstantCalculator

2014-08-26 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24789/#review51603
---

Ship it!


Ship It!

- Thejas Nair


On Aug. 26, 2014, 10:35 p.m., Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/24789/
 ---
 
 (Updated Aug. 26, 2014, 10:35 p.m.)
 
 
 Review request for pig.
 
 
 Repository: pig
 
 
 Description
 ---
 
 See PIG-4128
 
 
 Diffs
 -
 
   trunk/src/org/apache/pig/EvalFunc.java 1618727 
   trunk/src/org/apache/pig/Main.java 1618727 
   trunk/src/org/apache/pig/builtin/ABS.java 1618727 
   trunk/src/org/apache/pig/builtin/ARITY.java 1618727 
   trunk/src/org/apache/pig/builtin/AddDuration.java 1618727 
   trunk/src/org/apache/pig/builtin/Assert.java 1618727 
   trunk/src/org/apache/pig/builtin/BagSize.java 1618727 
   trunk/src/org/apache/pig/builtin/BagToString.java 1618727 
   trunk/src/org/apache/pig/builtin/BagToTuple.java 1618727 
   trunk/src/org/apache/pig/builtin/Base.java 1618727 
   trunk/src/org/apache/pig/builtin/BigDecimalAbs.java 1618727 
   trunk/src/org/apache/pig/builtin/BigIntegerAbs.java 1618727 
   trunk/src/org/apache/pig/builtin/CONCAT.java 1618727 
   trunk/src/org/apache/pig/builtin/ConstantSize.java 1618727 
   trunk/src/org/apache/pig/builtin/CubeDimensions.java 1618727 
   trunk/src/org/apache/pig/builtin/CurrentTime.java 1618727 
   trunk/src/org/apache/pig/builtin/DIFF.java 1618727 
   trunk/src/org/apache/pig/builtin/DaysBetween.java 1618727 
   trunk/src/org/apache/pig/builtin/DoubleRound.java 1618727 
   trunk/src/org/apache/pig/builtin/DoubleRoundTo.java 1618727 
   trunk/src/org/apache/pig/builtin/ENDSWITH.java 1618727 
   trunk/src/org/apache/pig/builtin/EqualsIgnoreCase.java 1618727 
   trunk/src/org/apache/pig/builtin/FloatAbs.java 1618727 
   trunk/src/org/apache/pig/builtin/FloatRound.java 1618727 
   trunk/src/org/apache/pig/builtin/FloatRoundTo.java 1618727 
   trunk/src/org/apache/pig/builtin/GetDay.java 1618727 
   trunk/src/org/apache/pig/builtin/GetHour.java 1618727 
   trunk/src/org/apache/pig/builtin/GetMilliSecond.java 1618727 
   trunk/src/org/apache/pig/builtin/GetMinute.java 1618727 
   trunk/src/org/apache/pig/builtin/GetMonth.java 1618727 
   trunk/src/org/apache/pig/builtin/GetSecond.java 1618727 
   trunk/src/org/apache/pig/builtin/GetWeek.java 1618727 
   trunk/src/org/apache/pig/builtin/GetWeekYear.java 1618727 
   trunk/src/org/apache/pig/builtin/GetYear.java 1618727 
   trunk/src/org/apache/pig/builtin/HoursBetween.java 1618727 
   trunk/src/org/apache/pig/builtin/INDEXOF.java 1618727 
   trunk/src/org/apache/pig/builtin/INVERSEMAP.java 1618727 
   trunk/src/org/apache/pig/builtin/IntAbs.java 1618727 
   trunk/src/org/apache/pig/builtin/IsEmpty.java 1618727 
   trunk/src/org/apache/pig/builtin/KEYSET.java 1618727 
   trunk/src/org/apache/pig/builtin/LAST_INDEX_OF.java 1618727 
   trunk/src/org/apache/pig/builtin/LCFIRST.java 1618727 
   trunk/src/org/apache/pig/builtin/LOWER.java 1618727 
   trunk/src/org/apache/pig/builtin/LTRIM.java 1618727 
   trunk/src/org/apache/pig/builtin/LongAbs.java 1618727 
   trunk/src/org/apache/pig/builtin/MapSize.java 1618727 
   trunk/src/org/apache/pig/builtin/MilliSecondsBetween.java 1618727 
   trunk/src/org/apache/pig/builtin/MinutesBetween.java 1618727 
   trunk/src/org/apache/pig/builtin/MonthsBetween.java 1618727 
   trunk/src/org/apache/pig/builtin/PluckTuple.java 1618727 
   trunk/src/org/apache/pig/builtin/REGEX_EXTRACT.java 1618727 
   trunk/src/org/apache/pig/builtin/REGEX_EXTRACT_ALL.java 1618727 
   trunk/src/org/apache/pig/builtin/REPLACE.java 1618727 
   trunk/src/org/apache/pig/builtin/ROUND.java 1618727 
   trunk/src/org/apache/pig/builtin/ROUND_TO.java 1618727 
   trunk/src/org/apache/pig/builtin/RTRIM.java 1618727 
   trunk/src/org/apache/pig/builtin/RollupDimensions.java 1618727 
   trunk/src/org/apache/pig/builtin/SIZE.java 1618727 
   trunk/src/org/apache/pig/builtin/SPRINTF.java 1618727 
   trunk/src/org/apache/pig/builtin/STARTSWITH.java 1618727 
   trunk/src/org/apache/pig/builtin/STRSPLIT.java 1618727 
   trunk/src/org/apache/pig/builtin/SUBSTRING.java 1618727 
   trunk/src/org/apache/pig/builtin/SUBTRACT.java 1618727 
   trunk/src/org/apache/pig/builtin/SecondsBetween.java 1618727 
   trunk/src/org/apache/pig/builtin/StringConcat.java 1618727 
   trunk/src/org/apache/pig/builtin/StringSize.java 1618727 
   trunk/src/org/apache/pig/builtin/SubtractDuration.java 1618727 
   trunk/src/org/apache/pig/builtin/TOBAG.java 1618727 
   trunk/src/org/apache/pig/builtin/TOKENIZE.java 1618727 
   trunk/src/org/apache/pig/builtin/TOMAP.java 1618727 
   trunk/src/org/apache/pig/builtin/TOTUPLE.java 1618727 
   trunk/src/org/apache/pig/builtin/TRIM.java 1618727

Re: Review Request 24789: New logical optimizer rule: ConstantCalculator

2014-08-25 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24789/#review51430
---



trunk/src/org/apache/pig/builtin/CurrentTime.java
https://reviews.apache.org/r/24789/#comment89748

If the optimization is disabled, don't we want to fall back to the old
behavior of using pig.job.submitted?



trunk/src/org/apache/pig/newplan/logical/rules/ConstantCalculator.java
https://reviews.apache.org/r/24789/#comment89769

There is no processedOperators.add happening. Is this variable needed?



trunk/src/org/apache/pig/newplan/logical/rules/ConstantCalculator.java
https://reviews.apache.org/r/24789/#comment89755

Does it make sense to do this setPlan in the moveTree call itself?


- Thejas Nair


On Aug. 19, 2014, 5:41 p.m., Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/24789/
 ---
 
 (Updated Aug. 19, 2014, 5:41 p.m.)
 
 
 Review request for pig.
 
 
 Repository: pig
 
 
 Description
 ---
 
 See PIG-4128
 
 
 Diffs
 -
 
   trunk/src/org/apache/pig/EvalFunc.java 1618727 
   trunk/src/org/apache/pig/Main.java 1618727 
   trunk/src/org/apache/pig/builtin/ABS.java 1618727 
   trunk/src/org/apache/pig/builtin/ARITY.java 1618727 
   trunk/src/org/apache/pig/builtin/AddDuration.java 1618727 
   trunk/src/org/apache/pig/builtin/Assert.java 1618727 
   trunk/src/org/apache/pig/builtin/BagSize.java 1618727 
   trunk/src/org/apache/pig/builtin/BagToString.java 1618727 
   trunk/src/org/apache/pig/builtin/BagToTuple.java 1618727 
   trunk/src/org/apache/pig/builtin/Base.java 1618727 
   trunk/src/org/apache/pig/builtin/BigDecimalAbs.java 1618727 
   trunk/src/org/apache/pig/builtin/BigIntegerAbs.java 1618727 
   trunk/src/org/apache/pig/builtin/CONCAT.java 1618727 
   trunk/src/org/apache/pig/builtin/ConstantSize.java 1618727 
   trunk/src/org/apache/pig/builtin/CubeDimensions.java 1618727 
   trunk/src/org/apache/pig/builtin/CurrentTime.java 1618727 
   trunk/src/org/apache/pig/builtin/DIFF.java 1618727 
   trunk/src/org/apache/pig/builtin/DaysBetween.java 1618727 
   trunk/src/org/apache/pig/builtin/DoubleRound.java 1618727 
   trunk/src/org/apache/pig/builtin/DoubleRoundTo.java 1618727 
   trunk/src/org/apache/pig/builtin/ENDSWITH.java 1618727 
   trunk/src/org/apache/pig/builtin/EqualsIgnoreCase.java 1618727 
   trunk/src/org/apache/pig/builtin/FloatAbs.java 1618727 
   trunk/src/org/apache/pig/builtin/FloatRound.java 1618727 
   trunk/src/org/apache/pig/builtin/FloatRoundTo.java 1618727 
   trunk/src/org/apache/pig/builtin/GetDay.java 1618727 
   trunk/src/org/apache/pig/builtin/GetHour.java 1618727 
   trunk/src/org/apache/pig/builtin/GetMilliSecond.java 1618727 
   trunk/src/org/apache/pig/builtin/GetMinute.java 1618727 
   trunk/src/org/apache/pig/builtin/GetMonth.java 1618727 
   trunk/src/org/apache/pig/builtin/GetSecond.java 1618727 
   trunk/src/org/apache/pig/builtin/GetWeek.java 1618727 
   trunk/src/org/apache/pig/builtin/GetWeekYear.java 1618727 
   trunk/src/org/apache/pig/builtin/GetYear.java 1618727 
   trunk/src/org/apache/pig/builtin/HoursBetween.java 1618727 
   trunk/src/org/apache/pig/builtin/INDEXOF.java 1618727 
   trunk/src/org/apache/pig/builtin/INVERSEMAP.java 1618727 
   trunk/src/org/apache/pig/builtin/IntAbs.java 1618727 
   trunk/src/org/apache/pig/builtin/IsEmpty.java 1618727 
   trunk/src/org/apache/pig/builtin/KEYSET.java 1618727 
   trunk/src/org/apache/pig/builtin/LAST_INDEX_OF.java 1618727 
   trunk/src/org/apache/pig/builtin/LCFIRST.java 1618727 
   trunk/src/org/apache/pig/builtin/LOWER.java 1618727 
   trunk/src/org/apache/pig/builtin/LTRIM.java 1618727 
   trunk/src/org/apache/pig/builtin/LongAbs.java 1618727 
   trunk/src/org/apache/pig/builtin/MapSize.java 1618727 
   trunk/src/org/apache/pig/builtin/MilliSecondsBetween.java 1618727 
   trunk/src/org/apache/pig/builtin/MinutesBetween.java 1618727 
   trunk/src/org/apache/pig/builtin/MonthsBetween.java 1618727 
   trunk/src/org/apache/pig/builtin/PluckTuple.java 1618727 
   trunk/src/org/apache/pig/builtin/REGEX_EXTRACT.java 1618727 
   trunk/src/org/apache/pig/builtin/REGEX_EXTRACT_ALL.java 1618727 
   trunk/src/org/apache/pig/builtin/REPLACE.java 1618727 
   trunk/src/org/apache/pig/builtin/ROUND.java 1618727 
   trunk/src/org/apache/pig/builtin/ROUND_TO.java 1618727 
   trunk/src/org/apache/pig/builtin/RTRIM.java 1618727 
   trunk/src/org/apache/pig/builtin/RollupDimensions.java 1618727 
   trunk/src/org/apache/pig/builtin/SIZE.java 1618727 
   trunk/src/org/apache/pig/builtin/SPRINTF.java 1618727 
   trunk/src/org/apache/pig/builtin/STARTSWITH.java 1618727 
   trunk/src/org/apache/pig/builtin/STRSPLIT.java 1618727 
   trunk/src/org/apache/pig/builtin/SUBSTRING.java 1618727 
   trunk/src/org/apache

Re: [ANNOUNCE] Apache Pig 0.12.1 released

2014-04-15 Thread Thejas Nair
Thanks Prashant!


On Tue, Apr 15, 2014 at 10:58 AM, Cheolsoo Park piaozhe...@gmail.com wrote:
 Thank you Prashant for your hard work!


 On Mon, Apr 14, 2014 at 5:37 PM, Daniel Dai da...@hortonworks.com wrote:

 Thanks Prashant!

 On Mon, Apr 14, 2014 at 5:30 PM, Prashant Kommireddi
 prkommire...@apache.org wrote:
  The Pig team is happy to announce the Pig 0.12.1 release.
 
  Apache Pig provides a high-level data-flow language and execution
 framework
  for parallel computation on Hadoop clusters.
 
  More details about Pig can be found at http://pig.apache.org/.
 
  This is a maintenance release of Pig 0.12 and contains several bug fixes
  and improvements. The details of the release can be found at
  http://pig.apache.org/releases.html.
 
  You can download the release here
  http://www.apache.org/dyn/closer.cgi/pig
 
  The released maven artifacts have been made available on
  repository.apache.org
 
  We would like to thank all contributors that made this release possible.
 
  Thanks,
  Prashant Kommireddi



Re: [VOTE] Release Pig 0.12.1 (Candidate 0)

2014-04-11 Thread Thejas Nair
Here is my late +1.
Checked the md5 and asc keys.
Checked release notes, CHANGES.txt.
Built from source tar, tried some local queries. Checked output of
version command (pig -version).
The output of version command in binary is accurate.

However, in the case of the source tar, when I build using just ant (without
-Dversion..), the version shows up as
Apache Pig version 0.12.2-SNAPSHOT. I don't think this issue warrants a new RC.
I think we should update the release instructions to change the
version in build.xml to the appropriate release version before tagging svn
(and create the tar using this tagged version), and then increment the
version number in the next commit. If people agree, I can update the
instructions in the wiki.
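
As a rough illustration of the ordering proposed above, here is a minimal
sketch in Python. The helper names next_snapshot_version and release_steps
are hypothetical, and the sketch assumes versions are plain dotted integers
such as 0.12.1; it is not taken from build.xml or the wiki instructions.

```python
from typing import List


def next_snapshot_version(release_version: str) -> str:
    """Return the development version that follows a release.

    Assumption: the version is a plain dotted-integer string such as
    "0.12.1". The function name is illustrative only.
    """
    parts = [int(p) for p in release_version.split(".")]
    parts[-1] += 1  # bump the last (maintenance) component
    return ".".join(str(p) for p in parts) + "-SNAPSHOT"


def release_steps(release_version: str) -> List[str]:
    """List the proposed order of version changes around a release tag."""
    return [
        f"set version in build.xml to {release_version}",
        f"tag svn and create the source tar from the tag ({release_version})",
        f"set version in build.xml to {next_snapshot_version(release_version)}",
    ]


if __name__ == "__main__":
    for step in release_steps("0.12.1"):
        print(step)
```

With this ordering, a plain build of the tagged source would report the
release version, and only later commits would carry the -SNAPSHOT suffix.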


On Fri, Apr 11, 2014 at 1:54 AM, Prashant Kommireddi
prash1...@gmail.com wrote:
 The release vote passes with 4 +1s (4 binding votes), and no -1s.

 +1s (binding)
 Dmitriy Ryaboy
 Daniel Dai
 Cheolsoo Park
 Alan Gates

 +1s (non-binding)
 None

 -1s
 None

 Thanks Daniel for pointing out the missing pig-0.12.1.tar.gz.asc file. I
 have added it to the RC.

 I will proceed with the release process.

 Thanks,
 Prashant


 On Thu, Apr 10, 2014 at 10:54 AM, Alan Gates ga...@hortonworks.com wrote:

 +1

 Reviewed LICENSE, NOTICE, RELEASE_NOTES, and README.  Built, built
 piggybank and ran tests, ran a local smoke test.

 Alan.

 On Apr 7, 2014, at 1:22 PM, Prashant Kommireddi prkommire...@apache.org
 wrote:

  I have created a candidate build for Pig 0.12.1. This is a maintenance
  release to Pig 0.12.0 with a few critical bug fixes.
 
  Keys used to sign the release are available at
  http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup.
 
  Please download, test, and try it out:
 
  http://people.apache.org/~prkommireddi/pig-0.12.1-candidate-0/
 
 
  Release notes and the rat report are available from the same location.
 
 
  List of issues fixed in this release
 
 
 http://svn.apache.org/viewvc/pig/branches/branch-0.12/CHANGES.txt?view=markup
 
  Should we release this? Vote closes EOD this Thursday, April 10th.
 
  -Prashant




Re: [VOTE] Release Pig 0.12.1 (Candidate 0)

2014-04-11 Thread Thejas Nair
I have updated the wiki for this.
I have added a post-release section where the version number gets updated
to the next version.
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=26120105&selectedPageVersions=27&selectedPageVersions=26


On Fri, Apr 11, 2014 at 11:57 AM, Daniel Dai da...@hortonworks.com wrote:
 This is in our release document and we have followed it in prior releases.
 It does not seem to be an Apache convention. I don't know the motivation
 for this, but if it is confusing enough, it might be better to change it
 in the next release.

 On Fri, Apr 11, 2014 at 11:17 AM, Prashant Kommireddi
 prash1...@gmail.com wrote:
 Thanks Thejas. This actually came up during 0.12.0 RC as well and this is
 Daniel's reply. I do agree with you on having pig.version in build.xml
 reflect the current build rather than the next one. But I'm not aware of
 what the Apache convention is, or what other projects are doing. You guys
 know better :)

 *Hi, Mark,*

 *Thanks for reviewing. -SNAPSHOT is intentional according to
 https://cwiki.apache.org/confluence/display/PIG/HowToRelease. When
 users build the release, the version will be {next version}-SNAPSHOT.*

 *Thanks,*
 *Daniel*


 On Fri, Apr 11, 2014 at 10:45 AM, Thejas Nair the...@hortonworks.com wrote:

 Here is my late +1.
 Checked the md5 and asc keys.
 Checked release notes, CHANGES.txt.
 Built from source tar, tried some local queries. Checked output of
 version command (pig -version).
 The output of version command in binary is accurate.

 However, in the case of the source tar, when I build using just ant (without
 -Dversion..), the version shows up as
 Apache Pig version 0.12.2-SNAPSHOT. I don't think this issue warrants a
 new RC.
 I think we should update the release instructions to change the
 version in build.xml to the appropriate release version before tagging svn
 (and create the tar using this tagged version), and then increment the
 version number in the next commit. If people agree, I can update the
 instructions in the wiki.


 On Fri, Apr 11, 2014 at 1:54 AM, Prashant Kommireddi
 prash1...@gmail.com wrote:
  The release vote passes with 4 +1s (4 binding votes), and no -1s.
 
  +1s (binding)
  Dmitriy Ryaboy
  Daniel Dai
  Cheolsoo Park
  Alan Gates
 
  +1s (non-binding)
  None
 
  -1s
  None
 
  Thanks Daniel for pointing out the missing pig-0.12.1.tar.gz.asc file. I
  have added it to the RC.
 
  I will proceed with the release process.
 
  Thanks,
  Prashant
 
 
  On Thu, Apr 10, 2014 at 10:54 AM, Alan Gates ga...@hortonworks.com
 wrote:
 
  +1
 
  Reviewed LICENSE, NOTICE, RELEASE_NOTES, and README.  Built, built
  piggybank and ran tests, ran a local smoke test.
 
  Alan.
 
  On Apr 7, 2014, at 1:22 PM, Prashant Kommireddi 
 prkommire...@apache.org
  wrote:
 
   I have created a candidate build for Pig 0.12.1. This is a maintenance
   release to Pig 0.12.0 with a few critical bug fixes.
  
   Keys used to sign the release are available at
   http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup.
  
   Please download, test, and try it out:
  
   http://people.apache.org/~prkommireddi/pig-0.12.1-candidate-0/
  
  
   Release notes and the rat report are available from the same location.
  
  
   List of issues fixed in this release
  
  
 
 http://svn.apache.org/viewvc/pig/branches/branch-0.12/CHANGES.txt?view=markup
  
   Should we release this? Vote closes EOD this Thursday, April 10th.
  
   -Prashant
 
 

Re: [ANNOUNCE] Congratulations to our new PMC members Rohini Palaniswamy and Cheolsoo Park

2013-09-13 Thread Thejas Nair
Congrats Rohini and Cheolsoo!

On Thu, Sep 12, 2013 at 11:24 AM, Bill Graham billgra...@gmail.com wrote:
 Congrats guys! Well deserved indeed.


 On Wed, Sep 11, 2013 at 10:58 PM, Jarek Jarcec Cecho jar...@apache.org wrote:

 Congratulations Rohini and Cheolsoo, awesome work!

 Jarcec

 On Wed, Sep 11, 2013 at 04:24:21PM -0700, Julien Le Dem wrote:
  Please welcome Rohini Palaniswamy and Cheolsoo Park as our latest Pig
 PMC members.
 
  Congrats Rohini and Cheolsoo !




 --
 *Note that I'm no longer using my Yahoo! email address. Please email me at
 billgra...@gmail.com going forward.*



Re: Welcome new Pig Committer - Koji Noguchi

2013-09-13 Thread Thejas Nair
Congrats Koji! Very well deserved!


On Wed, Sep 11, 2013 at 9:49 AM, Daniel Dai da...@hortonworks.com wrote:
 Congratulations! It is well deserved.




 On Wed, Sep 11, 2013 at 6:33 AM, Miguel Angel Martin junquera 
 mianmarjun.mailingl...@gmail.com wrote:

 Congratulations K 


 Miguel Angel Martín Junquera
 Analyst Engineer.
 miguelangel.mar...@brainsins.com



 2013/9/11 kun yan yankunhad...@gmail.com

  Congratulations Koji!
 
 
  2013/9/11 Koji Noguchi knogu...@yahoo-inc.com
 
   Thanks everyone!
  
   Koji
  
  
   On Sep 11, 2013, at 2:18 AM, Bill Graham wrote:
  
Congrats Koji!
   
   
On Tue, Sep 10, 2013 at 10:29 PM, Cheolsoo Park piaozhe...@gmail.com wrote:

Congratulations Koji!


On Wed, Sep 11, 2013 at 7:32 AM, Prashant Kommireddi prash1...@gmail.com wrote:

Congrats Koji!


On Tue, Sep 10, 2013 at 10:01 AM, Xuefu Zhang xzh...@cloudera.com wrote:

Congratulations, Koji. Looking forward to more of your contributions.

--Xuefu


On Tue, Sep 10, 2013 at 8:58 AM, Olga Natkovich onatkov...@yahoo.com wrote:

It is my pleasure to announce that Koji Noguchi became the newest addition
to the Pig Committers!

Koji has been actively contributing to Pig for over a year now and has
been a part of the larger Hadoop community (including as a Hadoop Committer)
for many years now.

Please join me in congratulating Koji!

Olga
   
   
   
   
   
   
   
  
  
 
 
  --
 
  In the Hadoop world, I am just a novice, explore the entire Hadoop
  ecosystem, I hope one day I can contribute their own code
 
  YanBit
  yankunhad...@gmail.com
 


 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender immediately
 and delete it from your system. Thank You.

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


Re: [VOTE] Release Pig 0.10.1 (candidate 3)

2013-01-03 Thread Thejas Nair

+1
Verified md5 checksums of src and binary tar.gz.
Built the src tar.gz and ran queries against a Hadoop 1.1 cluster, and ran
fs and sh commands.
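
For anyone repeating the checksum step, here is a minimal sketch in Python
of one way to compare a downloaded tarball against a published MD5 digest.
The function names md5_of_file and checksum_matches are hypothetical, and
the sketch assumes the expected digest is available as a hex string; it
does not cover the signature (asc) check.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: str, expected_hex: str) -> bool:
    """Return True if the file's MD5 equals the expected hex digest.

    Surrounding whitespace and letter case in expected_hex are ignored.
    """
    return md5_of_file(path) == expected_hex.strip().lower()
```

A caller would pass the path to the downloaded tar.gz and the digest
published alongside it, and treat a False result as a reason to download
the artifact again before voting.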


-Thejas


On 1/3/13 12:11 PM, Rohini Palaniswamy wrote:

+1. Downloaded the tar binary, checked signature, ran unit tests, piggybank
unit tests, checked docs/release notes, ran a simple script locally and
against a cluster.


On Mon, Dec 31, 2012 at 8:41 AM, Alan Gates ga...@hortonworks.com wrote:


+1, yet again :).  Checked the key signature and checksum on the source
package.   Built and ran commit unit tests on src, ran a test job in local
mode.  Downloaded the tar binary and ran a job in local and cluster mode.

Alan.

On Dec 28, 2012, at 11:50 PM, Daniel Dai wrote:


Hi,

I have created a candidate build for Pig 0.10.1. This is a maintenance
release of Pig 0.10.

Keys used to sign the release are available at
http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup

Please download, test, and try it out:

http://people.apache.org/~daijy/pig-0.10.1-candidate-3/

Should we release this? Vote closes on EOD next Friday, Jan 4th.

Thanks,
Daniel









Re: Review Request: Review for PIG-1314 - add datetime type in pig

2012-08-16 Thread Thejas Nair
 
  
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/SizeUtil.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/TypeAwareTuple.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/NullableDateTimeWritable.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/logicalLayer/schema/Schema.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/logicalLayer/schema/SchemaUtil.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/CastUtils.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/NumValCarrier.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/StorageUtil.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/visitor/TypeCheckingExpVisitor.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/AliasMasker.g 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/AstPrinter.g 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/AstValidator.g 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanGenerator.g 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/QueryLexer.g 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/QueryParser.g 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/pen/AugmentBaseDataVisitor.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/scripting/jruby/JrubyScriptEngine.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/scripting/jruby/RubySchema.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/udfs/java/org/apache/pig/test/udf/storefunc/PigPerformanceLoader.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/data/TestSchemaTuple.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestAdd.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestBuiltin.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestConversions.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestDivide.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestEqualTo.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestGTOrEqual.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestGreaterThan.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestLTOrEqual.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestLessThan.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestMod.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestMultiply.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestNotEqualTo.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestNull.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestOrderBy.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPOBinCond.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPOCast.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPackage.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPigTupleRawComparator.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestResourceSchema.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestSchema.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestStore.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestSubtract.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestTextDataParser.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestTypeCheckingValidatorNewLP.java 1373741
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/utils/GenRandomData.java 1373741

Diff: https://reviews.apache.org/r/5414/diff/


Testing
---


Thanks,

Thejas Nair



Re: Review Request: Review for PIG-1314 - add datetime type in pig

2012-08-14 Thread Thejas Nair
://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/NullableDateTimeWritable.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/logicalLayer/schema/Schema.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/logicalLayer/schema/SchemaUtil.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/CastUtils.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/NumValCarrier.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/StorageUtil.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/visitor/TypeCheckingExpVisitor.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/AliasMasker.g 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/AstPrinter.g 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/AstValidator.g 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanGenerator.g 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/QueryLexer.g 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/QueryParser.g 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/pen/AugmentBaseDataVisitor.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/scripting/jruby/JrubyScriptEngine.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/scripting/jruby/RubySchema.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/udfs/java/org/apache/pig/test/udf/storefunc/PigPerformanceLoader.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/data/TestSchemaTuple.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestAdd.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestBuiltin.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestConversions.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestDivide.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestEqualTo.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestGTOrEqual.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestGreaterThan.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestLTOrEqual.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestLessThan.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestMod.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestMultiply.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestNotEqualTo.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestNull.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestOrderBy.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPOBinCond.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPOCast.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPigTupleRawComparator.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestResourceSchema.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestSchema.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestStore.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestSubtract.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestTextDataParser.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestTypeCheckingValidatorNewLP.java 1371785
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/utils/GenRandomData.java 1371785

Diff: https://reviews.apache.org/r/5414/diff/


Testing
---


Thanks,

Thejas Nair



Re: Breaking down big unit tests

2012-07-19 Thread Thejas Nair
We certainly need to look at ways to reduce the runtime of the 'unit' 
tests. Some of them should be migrated to the e2e tests.


But what you want, in order to re-test easily, seems to be a way to
run a specific test case within a Test*.java file. I wonder if JUnit
lets you do that.


-Thejas


On 7/19/12 2:11 PM, Jie Li wrote:

Hi all,

Apparently some unit test classes are so fat that retesting them is a pain.
While reducing the full testing time is a long-term goal, shall we just
break down those big units into smaller pieces? Here are the running times
of the top 20 big units:

3,432.68 org.apache.pig.test.TestEvalPipeline2
2,944.075 org.apache.pig.test.TestSkewedJoin
1,819.059 org.apache.pig.test.TestMergeJoin
1,797.877 org.apache.pig.test.TestFRJoin
1,476.097 org.apache.pig.test.TestEvalPipeline
1,261.661 org.apache.pig.test.TestFRJoin2
1,164.076 org.apache.pig.test.TestAccumulator
801.747 org.apache.pig.test.TestBZip
799.689 org.apache.pig.test.TestJoin
792.808 org.apache.pig.test.TestPigRunner
750.614 org.apache.pig.test.TestStreaming
743.728 org.apache.pig.test.TestNativeMapReduce
739.31 org.apache.pig.test.TestLimitVariable
674.208 org.apache.pig.test.TestScriptLanguageJavaScript
664.857 org.apache.pig.test.TestJoinSmoke
653.671 org.apache.pig.test.TestCounters
621.06 org.apache.pig.test.TestBestFitCast
541.43 org.apache.pig.test.TestAlgebraicEval
539.939 org.apache.pig.test.TestGrunt

While the full tests take about 10 hours to finish, these top 20 classes
account for almost half the time. The idea is to cut them each to 10-minute
pieces. Any comment?
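
To make the proposal concrete, here is a rough sketch that totals the running times listed above and counts how many 10-minute pieces the split would produce. It assumes the figures are in seconds (the list does not say), and the class and method names are made up for illustration.

```java
final class TestTimeBreakdownSketch {
    // Running times copied from the list above; assumed to be seconds.
    static final double[] SECONDS = {
        3432.68, 2944.075, 1819.059, 1797.877, 1476.097, 1261.661, 1164.076,
        801.747, 799.689, 792.808, 750.614, 743.728, 739.31, 674.208,
        664.857, 653.671, 621.06, 541.43, 539.939
    };

    // Sum of all listed running times.
    static double totalSeconds() {
        double total = 0;
        for (double s : SECONDS) {
            total += s;
        }
        return total;
    }

    // Number of pieces needed so that no piece exceeds pieceSeconds,
    // assuming a class can be cut into evenly sized parts.
    static int piecesNeeded(double seconds, double pieceSeconds) {
        return (int) Math.ceil(seconds / pieceSeconds);
    }

    // Total number of pieces across all listed classes.
    static int totalPieces(double pieceSeconds) {
        int pieces = 0;
        for (double s : SECONDS) {
            pieces += piecesNeeded(s, pieceSeconds);
        }
        return pieces;
    }

    public static void main(String[] args) {
        System.out.printf("listed classes: %d%n", SECONDS.length);
        System.out.printf("total hours: %.2f%n", totalSeconds() / 3600.0);
        System.out.printf("10-minute pieces: %d%n", totalPieces(600.0));
    }
}
```

In practice the split would follow test-method boundaries rather than even cuts, so the piece count is only a lower bound.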

Jie






Re: Review Request: Review for PIG-1314 - add datetime type in pig

2012-07-11 Thread Thejas Nair
 
in the toDate udf .



http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/BinInterSedes.java
https://reviews.apache.org/r/5414/#comment19278

Can we just compare the longs? That way we can avoid the object creation.
Creating objects reduces the performance advantage of using a raw comparator.
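
A minimal sketch of what comparing the longs directly could look like. It assumes the datetime is serialized as a big-endian 8-byte long, which may not match the patch's actual layout; the class and method names are hypothetical and not part of Pig.

```java
final class RawLongCompareSketch {
    // Decode a big-endian 8-byte long straight from the buffer.
    // Only primitives are used, so nothing is allocated per comparison.
    static long readLong(byte[] buf, int offset) {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            value = (value << 8) | (buf[offset + i] & 0xFFL);
        }
        return value;
    }

    // Compare two serialized values in place, the way a raw comparator would,
    // without materializing a date object for either side.
    static int compareRaw(byte[] b1, int s1, byte[] b2, int s2) {
        return Long.compare(readLong(b1, s1), readLong(b2, s2));
    }

    // Demo helper only: big-endian encoding of a long.
    static byte[] encode(long v) {
        byte[] out = new byte[8];
        for (int i = 7; i >= 0; i--) {
            out[i] = (byte) v;
            v >>>= 8;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] earlier = encode(1341942060000L);
        byte[] later = encode(1341945660000L);
        System.out.println(compareRaw(earlier, 0, later, 0) < 0);
    }
}
```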



http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/DataType.java
https://reviews.apache.org/r/5414/#comment19279

As we allow long to be cast to float/double, I think it will be more
consistent to allow that for datetime as well.



http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/DataType.java
https://reviews.apache.org/r/5414/#comment19280

we need to deal with timezone in the date string



http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/SizeUtil.java
https://reviews.apache.org/r/5414/#comment19281

how did you arrive at this number ? 



http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/pen/AugmentBaseDataVisitor.java
https://reviews.apache.org/r/5414/#comment19282

A check for DATETIME should not be added here.



http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/pen/AugmentBaseDataVisitor.java
https://reviews.apache.org/r/5414/#comment19283

A check for DATETIME should not be added here.


- Thejas Nair


On July 10, 2012, 5:41 p.m., Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/5414/
 ---
 
 (Updated July 10, 2012, 5:41 p.m.)
 
 
 Review request for pig.
 
 
 Description
 ---
 
 Review for PIG-1314
 
 
 This addresses bug PIG-1314.
 https://issues.apache.org/jira/browse/PIG-1314
 
 
 Diffs
 -
 
   http://svn.apache.org/repos/asf/pig/trunk/conf/pig.properties 1359212
   http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/DBStorage.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/HiveColumnarLoader.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/SequenceFileLoader.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/LoadCaster.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigServer.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigWarning.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/StoreCaster.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/DateTimeWritable.java PRE-CREATION
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/HDataType.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigDateTimeRawComparator.java PRE-CREATION
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/ConstantExpression.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/EqualToExpr.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/ExpressionOperator.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GTOrEqualToExpr.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GreaterThanExpr.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LTOrEqualToExpr.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LessThanExpr.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/NotEqualToExpr.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POBinCond.java 1359212
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java 1359212

Re: Review Request: Review for PIG-1314 - add datetime type in pig

2012-07-10 Thread Thejas Nair
/AddDuration.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/BinStorage.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/CurrentTime.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/DaysBetween.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/DiffDate.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/GetDay.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/GetHour.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/GetMinute.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/GetMonth.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/GetSecond.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/GetYear.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/HoursBetween.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/MinutesBetween.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/MonthsBetween.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/SecondsBetween.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/SubtractDuration.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/TextLoader.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/ToDate.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/ToString.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/ToUnixTime.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/Utf8StorageConverter.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/YearsBetween.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/BinInterSedes.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/DataReaderWriter.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/DataType.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/DefaultTuple.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/SizeUtil.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/TypeAwareTuple.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/NullableDateTimeWritable.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/logicalLayer/schema/SchemaUtil.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/CastUtils.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/NumValCarrier.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/StorageUtil.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/visitor/TypeCheckingExpVisitor.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/AliasMasker.g 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/AstPrinter.g 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/AstValidator.g 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanGenerator.g 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/QueryLexer.g 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/QueryParser.g 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/pen/AugmentBaseDataVisitor.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/scripting/jruby/JrubyScriptEngine.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/scripting/jruby/RubySchema.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/udfs/java/org/apache/pig/test/udf/storefunc/PigPerformanceLoader.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestConversions.java 1359212
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPOCast.java 1359212

Diff: https://reviews.apache.org/r/5414/diff/


Testing
---


Thanks,

Thejas Nair



Re: Are there any explanations of the implementation of illustrate?

2012-07-05 Thread Thejas Nair
An earlier implementation of illustrate used the Pig local mode execution
engine (which corresponds to the time when the paper was published).


As part of the illustrate rework in PIG-1712, Yan replaced the default Map
and Reduce context objects with an IllustratorContext. Look for
IllustratorContext and LocalMapReduceSimulator in
https://issues.apache.org/jira/secure/attachment/12459267/illustrator_2.patch

The context objects write their output and read input from memory.

We can consider using this for pig local mode as well, by replacing the 
in memory list with something that can spill to disk.
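
As a rough illustration of "something that can spill to disk", the sketch below keeps records in memory up to a limit and writes the overflow to a temporary file. It is a toy under stated assumptions: records are plain strings, the class name and memory limit are invented, and it is not how IllustratorContext or Pig's own data structures are implemented.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class SpillableListSketch {
    private final int memoryLimit;                 // max records held in memory
    private final List<String> inMemory = new ArrayList<>();
    private File spillFile;                        // created lazily on first spill
    private DataOutputStream spillOut;
    private int spilledCount;

    SpillableListSketch(int memoryLimit) {
        this.memoryLimit = memoryLimit;
    }

    // Keep the record in memory while there is room, otherwise append it to disk.
    void add(String record) {
        if (inMemory.size() < memoryLimit) {
            inMemory.add(record);
            return;
        }
        try {
            if (spillOut == null) {
                spillFile = File.createTempFile("illustrate-spill", ".bin");
                spillFile.deleteOnExit();
                spillOut = new DataOutputStream(
                        new BufferedOutputStream(new FileOutputStream(spillFile)));
            }
            spillOut.writeUTF(record);
            spilledCount++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int spilledCount() {
        return spilledCount;
    }

    // Visit records in insertion order: in-memory ones first, then the spilled ones,
    // streaming the latter from disk one at a time.
    void forEach(Consumer<String> consumer) {
        inMemory.forEach(consumer);
        if (spillOut == null) {
            return;
        }
        try {
            spillOut.flush();
            try (DataInputStream in = new DataInputStream(
                    new BufferedInputStream(new FileInputStream(spillFile)))) {
                for (int i = 0; i < spilledCount; i++) {
                    consumer.accept(in.readUTF());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SpillableListSketch records = new SpillableListSketch(2);
        for (String r : new String[] {"a", "b", "c", "d"}) {
            records.add(r);
        }
        records.forEach(System.out::println);
    }
}
```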


-Thejas


On 7/3/12 6:34 PM, Jonathan Coveney wrote:

Jie, that's perfect, thanks. This doc, specifically:
http://i.stanford.edu/~olston/publications/sigmod09.pdf is exactly the
detailed explanation I was looking for.

2012/7/3 Jie Li ji...@cs.duke.edu


Some document here: http://wiki.apache.org/pig/PigIllustrate

I agree that more tests are needed for illustrate, otherwise it can be
easily broken without notice.

Jie

On Tue, Jul 3, 2012 at 12:45 PM, Jon Coveney jcove...@gmail.com wrote:

I was curious at a level slightly higher than dig through the code how
illustrate is so fast, and how it deals with joins effectively. Are there
any resources on this (or does anyone at Hortonworks want to write a tech
oriented blog post? :)











Re: Possible bug in replicated join?

2012-06-21 Thread Thejas Nair
That certainly looks like a bug. The replicated join should not flatten
the tuple.
I didn't actually know that Pig supported doing joins on tuples (I guess
it does not allow that on maps and bags).


-Thejas


On 6/21/12 11:29 AM, Jonathan Coveney wrote:

Am posting before making a ticket just to make sure I'm not doing something
stupid or missing something obvious.


$ cat data
1
2
3
4
5

a = load 'data' as (x:int);
b = foreach a generate TOTUPLE(x);

c = load 'data' as (x:int);
d = foreach c generate TOTUPLE(x);

e = join b by $0, d by $0;
dump e;

((1),(1))
((2),(2))
((3),(3))
((4),(4))
((5),(5))

ok
but

f = join b by $0, d by $0 using 'replicated';
dump f;

(1,1)
(2,2)
(3,3)
(4,4)
(5,5)







Re: [jira] [Resolved] (PIG-2650) Convenience mock Loader and Storer to simplify unit testing of Pig scripts

2012-04-26 Thread Thejas Nair
In my opinion, we should only commit changes to released branches that 
are either critical bug fixes, or very useful minor changes which are 
not likely to affect the stability of the branch.


This change would fall into the second category.

Thanks,
Thejas


On 4/26/12 2:32 PM, Bill Graham wrote:

What's fair game to commit to the 0.10 branch? Just bug fixes, or are
new small features that didn't make it into 0.10 ok?

On Thu, Apr 26, 2012 at 2:15 PM, Daniel Dai da...@hortonworks.com wrote:


I am fine with it. Please also include the following tiny patch to fix
hadoop 23 build after the patch.

--- pig/trunk/ivy.xml (original)
+++ pig/trunk/ivy.xml Thu Apr 26 21:11:36 2012
@@ -178,7 +178,7 @@
     <dependency org="net.java.dev.javacc" name="javacc" rev="${javacc.version}"
       conf="compile->master"/>
     <dependency org="junit" name="junit" rev="${junit.version}"
-      conf="test->default"/>
+      conf="compile->master"/>
     <dependency org="com.google.code.p.arat" name="rat-lib" rev="${rats-lib.version}"
       conf="releaseaudit->default"/>
     <dependency org="org.codehaus.jackson" name="jackson-mapper-asl" rev="${jackson.version}"

Daniel

On Thu, Apr 26, 2012 at 2:07 PM, Julien Le Dem jul...@twitter.com wrote:

I'm planning to commit this in 0.10 branch as well
The patch has only new files so it will apply cleanly.
Any objection?
Julien


On Apr 26, 2012, at 1:30 PM, Julien Le Dem (JIRA) wrote:



 [ https://issues.apache.org/jira/browse/PIG-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Le Dem resolved PIG-2650.
--------------------------------

   Resolution: Fixed
Fix Version/s: 0.11

Convenience mock Loader and Storer to simplify unit testing of Pig scripts
--------------------------------------------------------------------------

        Key: PIG-2650
        URL: https://issues.apache.org/jira/browse/PIG-2650
    Project: Pig
 Issue Type: New Feature
   Reporter: Julien Le Dem
   Assignee: Julien Le Dem
    Fix For: 0.11

Attachments: PIG-2650-a.patch, PIG-2650-b.patch, PIG-2650-c.patch, PIG-2650.patch


A test would look as follows:
{code}
PigServer pigServer = new PigServer(ExecType.LOCAL);
TupleFactory tf = TupleFactory.getInstance();
Data data = Storage.resetData(pigServer.getPigContext());
// seed the mock storage location 'foo' with three single-field tuples
data.set("foo", Arrays.asList(
    tf.newTuple("a"),
    tf.newTuple("b"),
    tf.newTuple("c")
));
pigServer.registerQuery("A = LOAD 'foo' USING mock.Storage();");
// some complex script to test
pigServer.registerQuery("STORE A INTO 'bar' USING mock.Storage();");
// read back what the script stored into 'bar'
Iterator<Tuple> out = data.get("bar").iterator();
assertEquals("a", out.next().get(0));
assertEquals("b", out.next().get(0));
assertEquals("c", out.next().get(0));
{code}

















Re: HCatalog scans all partition even after mentioning date filter

2012-04-25 Thread Thejas Nair

cc'ing dev@pig as this is a pig issue.

Aniket, what you saw is not related to PIG-2339.

In your example query, the logical plan will look like this -

        Load (A)
           |
         Split
           |
     -------------
     |           |
Filter(B1)   Filter(B2) ...

Because of the split operator introduced between the filter conditions 
and load, the filter does not get pushed into the load function.


A simple way to fix this in pig would be to not share the load across 
the filter operators. Another option is to push the condition (B1 or B2 
or B3) into Load operator and retain rest of the current plan (split and 
filters following the split).


You can of course achieve the same effect by having a separate load
statement as input for each of the filters.
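
To illustrate the second option (pushing "B1 or B2 or B3" into the load while keeping the split and filters), here is a toy model in plain Java. It does not use Pig's optimizer or any Pig API; the partition keys, predicates and class name are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class SplitPushdownSketch {
    // Hypothetical partitioned table: one partition per date key.
    static final List<String> PARTITIONS = Arrays.asList(
            "2012-04-20", "2012-04-21", "2012-04-22", "2012-04-23", "2012-04-24");

    // Hypothetical filter conditions for B1 and B2.
    static final Predicate<String> COND1 = d -> d.equals("2012-04-20");
    static final Predicate<String> COND2 = d -> d.equals("2012-04-23");

    // Partitions the loader would read for a given pushed-down predicate.
    static List<String> scan(Predicate<String> pushed) {
        List<String> read = new ArrayList<>();
        for (String partition : PARTITIONS) {
            if (pushed.test(partition)) {
                read.add(partition);
            }
        }
        return read;
    }

    // The filter that still runs after the split.
    static List<String> filter(List<String> rows, Predicate<String> cond) {
        List<String> kept = new ArrayList<>();
        for (String row : rows) {
            if (cond.test(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Current plan: nothing reaches the loader, so every partition is read.
        List<String> fullScan = scan(d -> true);
        // Proposed plan: the OR of all filter conditions is pushed into the load.
        List<String> pruned = scan(COND1.or(COND2));

        System.out.println("partitions read without pushdown: " + fullScan.size());
        System.out.println("partitions read with OR pushdown: " + pruned.size());
        // The per-branch filters see the same rows either way.
        System.out.println(filter(pruned, COND1).equals(filter(fullScan, COND1)));
        System.out.println(filter(pruned, COND2).equals(filter(fullScan, COND2)));
    }
}
```

The point of keeping the split and the individual filters is that the OR only prunes what the loader reads; each branch still applies its own condition.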


I agree that we should make it possible to ask pig to throw a 
warning/error if the query is going to result in a full table scan on a 
partitioned table.


Thanks,
Thejas




On 4/24/12 7:56 PM, Aniket Mokashi wrote:

Sorry Thejas, I didn't look into the jira properly earlier.
EMR pig-0.9.1 already has that patch for PIG-2339 and hence I did not
hit that issue earlier (and I patched datanucleus). filter-union was a
workaround I was using to avoid some of the thrift timeout problems
earlier. Thrift API calls time out on the client side in 20 sec by default (I
found the config to change this later), and hence I used a = load
'table'; b1 = filter a by cond1; b2 = filter a by cond2; ... b = union b1, b2, ...;
expecting these filters to be pushed separately to the loader. But that
doesn't work in pig. (I can open a jira, but I haven't done enough
investigation at the code level). Thoughts?

Thanks,
Aniket

On Tue, Apr 24, 2012 at 7:00 PM, Thejas Nair the...@hortonworks.com wrote:

The issue was not specific to filter-union
- https://issues.apache.org/jira/browse/PIG-2339.
The fix was to run PushUpFilter before PartitionFilterOptimizer.

As this is not a hcat issue, it should not matter if you have an
older hcat version. FYI, this bug was not there in pig 0.8.x.
Was it pig 0.9.0 or 0.9.1 that you used?

Thanks,
Thejas



On 4/24/12 5:21 PM, Aniket Mokashi wrote:

Hi Thejas,

Can you point me to jira that fixes filter-union problem (in pig)? I
haven't tried hcat-0.4 yet, good to know about that issue. I will keep a
watcher.

Thanks,
Aniket

On Tue, Apr 24, 2012 at 4:51 PM, Thejas Nair the...@hortonworks.com wrote:

Hi Aniket,
Are you using pig 0.9 or 0.9.1 ?
If yes, can you try with pig 0.9.2 ?
Wondering if you are also hitting the issue that Thomas
mentioned .

Thanks,
Thejas




On 4/23/12 7:39 PM, Aniket Mokashi wrote:

Something similar I have noticed is -

A = load ...
B1 = filter A by cond1;
B2 = filter A by cond2;
B3 = filter A by cond3;

B = union B1, B2, B3; does not push projection.

Is that expected?

Ideally, we should have strict mode under hcatalog, that when turned
on will avoid executing pig queries on the full (partitioned) table.

Thanks,
Aniket

On Mon, Apr 23, 2012 at 7:32 PM, Rajesh Balamohan rajesh.balamo...@gmail.com wrote:

Hi Alan,

Thanks for the quick response.

I am using HCatalog 0.4.

With simple PIG script it works great. HCatalog beautifully scans
only the relevant information. However, full scan happens only when
we have couple of additional joins and when we change the INNER JOIN
order (we also use using skewed).

Though we have looked into the debug logs, we saw the scanning of
number of records from the JobTracker's counters itself. Without
pruning, the m/r job was pretty much scanning the entire set of rows.

I am not sure if there is a corner case, where in skewed join is
trying to override the filtering.

~Rajesh.B



On Tue, Apr

Re: HCatalog scans all partition even after mentioning date filter

2012-04-25 Thread Thejas Nair

yes, please create one.
Thanks,
Thejas

On 4/25/12 1:47 PM, Aniket Mokashi wrote:

Hi Dmitriy and Thejas,

Should I open a jira for the same?

Thanks,
Aniket


On Wed, Apr 25, 2012 at 1:45 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote:

Yeah I think we just need to get projection pushdown to work through
Split operators.

D

On Wed, Apr 25, 2012 at 12:52 PM, Thejas Nair
the...@hortonworks.com mailto:the...@hortonworks.com wrote:
  cc'ing dev@pig as this is a pig issue.
 
  Aniket, What you saw is not related to PIG-2339 .
 
  In your example query, the logical plan will look like this -
 
  Load (A)
  |
  Split
   |
  ---
  | |
  Filter(B1)   Filter(B2) ...
 
  Because of the split operator introduced between the filter
conditions and
  load, the filter does not get pushed into the load function.
 
  A simple way to fix this in pig would be to not share the load
across the
  filter operators. Another option is to push the condition (B1 or
B2 or B3)
  into Load operator and retain rest of the current plan (split and
filters
  following the split).
 
  You can ofcourse achieve the same effect by having a separate load
  statememnt as input for each of the filters.
 
  I agree that we should make it possible to ask pig to throw a
warning/error
  if the query is going to result in a full table scan on a
partitioned table.
 
  Thanks,
  Thejas
 
 
 
 
  On 4/24/12 7:56 PM, Aniket Mokashi wrote:
 
  Sorry Thejas, I didnt look into the jira properly earlier.
  EMR pig-0.9.1 already has that patch for PIG-2339 and hence I
did not
  hit that issue earlier (and I patched datanucleus). filter-union
was a
  workaround I was using to avoid some of the thrift timeout problems
  earlier. Thrift api's timeout on client side in 20sec by default (I
  found the config to change this later) and I hence used a = load
  'table'; b1= filter by cond1; b2=filter by cond2;.. b= union b1,
b2..;
  to expect to push these filters separately to the loader. But, that
  doesn't work in pig. (I can open a jira, but I havent done enough
  investigation at the code level). Thoughts?
 
  Thanks,
  Aniket
 
  On Tue, Apr 24, 2012 at 7:00 PM, Thejas Nair
the...@hortonworks.com mailto:the...@hortonworks.com
  mailto:the...@hortonworks.com mailto:the...@hortonworks.com
wrote:
 
 The issue was not specific to filter-union
 - https://issues.apache.org/__jira/browse/PIG-2339
  https://issues.apache.org/jira/browse/PIG-2339.
 The fix was to do filter PushUpFilter before
PartitionFilterOptimizer .
 
 As this is not a hcat issue, it should not matter if you have an
 older hcat version .  fyi, this bug was not there in pig 0.8.x .
 Was it pig 0.9.0 or 0.9.1 that you used ?
 
 Thanks,
 Thejas
 
 
 
 On 4/24/12 5:21 PM, Aniket Mokashi wrote:
 
 Hi Thejas,
 
 Can you point me to jira that fixes filter-union problem
(in pig)?
  I
 haven't tried hcat-0.4 yet, good to know about that issue. I
 will keep a
 watcher.
 
 Thanks,
 Aniket
 
 On Tue, Apr 24, 2012 at 4:51 PM, Thejas Nair the...@hortonworks.com wrote:
 
 Hi Aniket,
 Are you using pig 0.9 or 0.9.1 ?
 If yes, can you try with pig 0.9.2 ?
 Wondering if you are also hitting the issue that Thomas
 mentioned .
 
 Thanks,
 Thejas
 
 
 
 
 On 4/23/12 7:39 PM, Aniket Mokashi wrote:
 
 Something similar I have noticed is -
 
 A = load ...
 B1 = filter A by cond1;
 B2 = filter A by cond2;
 B3 = filter A by cond3;
 
 B = union B1, B2, B3; does not push projection.
 
 Is that expected?
 
 Ideally, we should have a strict mode under hcatalog that, when turned
 on, will avoid executing pig queries on the full (partitioned) table.
 
 Thanks,
 Aniket
 
 On Mon, Apr 23, 2012 at 7:32 PM, Rajesh Balamohan
  rajesh.balamo...@gmail.com

Re: [VOTE] Release Pig 0.10.0 (candidate 0)

2012-04-24 Thread Thejas Nair

+1 .
Checked checksum and signatures of all 3 packages. Ran simple queries in 
MR and local modes using tar package on unsecure cluster, and rpm 
package on secure cluster.


Thanks,
Thejas

On 4/20/12 12:39 AM, Daniel Dai wrote:

Hi,

I have created a candidate build for Pig 0.10.0.

Keys used to sign the release are available at
http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup.

Please download, test, and try it out:

http://people.apache.org/~daijy/pig-0.10.0-candidate-0/

Should we release this? Vote closes on next Tuesday, Apr 24th.

Daniel




piggybank on github (was Re: Apache Pig hackday @ Twitter (SF))

2012-04-19 Thread Thejas Nair



On 4/18/12 3:24 PM, Russell Jurney wrote:

I'm in. I'm going to work on getting piggybank on github, including for Jython 
and JRuby UDFs.


I think the major work involved there is to figure out how to lower the 
barrier to contribute and have independent release cycles for the udfs 
while also having a way to ensure quality, i.e., figuring out the policies 
for it is the harder part. Just copying the udfs to github will not help.


CPAN seems to have figured that out. We need to see if we can adopt a 
policy like that.


Thanks,
Thejas



Re: Apache Pig hackday @ Twitter (SF)

2012-04-18 Thread Thejas Nair

Count me in
-Thejas


On 4/18/12 2:18 PM, Dmitriy Ryaboy wrote:

Hi folks,
The Analytics Infra team at Twitter will be hosting a Pig hackday on May 11.

On the agenda:
- get newcomers set up with the apache ticket process
- review and commit a bunch of stuff that's not been getting love
- hack on exciting new features
- fix boring old problems
- the Dmitriy critiques everyone's APIs hour
- the Jonathan and Julien make fun of Dmitriy for being a hater hour
- whatever else y'all want to do.

Conveniently, the Twitter office is across the street from Yerba Buena
Gardens in the middle of SF downtown, a 15 minute walk from Cal Train
and Bart.
After hacking, we can do bowling or something. Or drinking.

Unfortunately we have very limited space, so let me know early if you
would like to come and hack!

Looking forward to hacking some Pig.

-Dmitriy




Fw: GSoC 2012 mentor signup

2012-03-20 Thread Thejas Nair
FYI, for those who expressed interest in mentoring students for GSoC, see the 
email below for instructions to register.

Here is the apache mentoring guide - 
http://community.apache.org/guide-to-being-a-mentor.html




- Forwarded Message -
From: Ulrich Stärk u...@apache.org
To: p...@apache.org; code-awa...@apache.org 
Sent: Tuesday, March 20, 2012 1:28 AM
Subject: GSoC 2012 mentor signup
 
[PMCs, please see the PMC section below!]

Potential GSoC 2012 mentors,

It is time now to sign up to be a mentor for your GSoC 2012 project(s) if you
haven't already done so. To do so, follow these 4 steps:

1. Sign up at Google Melange [1] and note your link_id.

2. Add your link_id to [2] if it is not already in there. If you were using a
different email address for registration with Google Melange, make sure that
your alternate email address is listed at [3]. You can manage email aliases
through [4].

3. Request to be a mentor for the ASF within Google Melange.

4. Send an email to code-awa...@apache.org, cc'ing the PMC(s) for which you
want to mentor projects, stating that you want to be a mentor for PMC(s) x, y
and z and asking for silent acknowledgement from the PMC.

IMPORTANT: We won't process mentor requests in melange if you have not copied
your PMC in your mentor request to code-awa...@apache.org.

--

PMCs,

Potential mentors will be asking you to ACK their mentor requests. If you feel
that the person asking to be a mentor is not fit to mentor projects for your
PMC for whatever reasons, it is your duty to NACK their request by replying to
their email and copying code-awa...@apache.org. If you don't have any
objections, either stay silent or, better yet, ACK their request.

Also, please forward this email to would-be mentors not on your PMC.

For the GSoC admins,

Uli

[1] http://www.google-melange.com/gsoc/homepage/google/gsoc2012
[2] https://svn.apache.org/repos/private/committers/GsocLinkId.txt
[3] https://id.apache.org/info/MailAlias.txt
[4] https://id.apache.org

Re: Where do we want to put non-java source files?

2012-03-15 Thread Thejas Nair

Sounds good to me.
My thoughts on the costs of this change -
- svn will still retain the history of the moved files. So that is not a 
problem.

- build.xml would need some minor changes
- some extra steps will be required to apply the patches generated 
against old directory structure.


Thanks,
Thejas


On 3/15/12 5:54 PM, Bill Graham wrote:

+1 for src/main/ruby and src/main/java.


On Thu, Mar 15, 2012 at 5:22 PM, Jonathan Coveney jcove...@gmail.com wrote:


So with the jruby addition (which I'm putting a cherry on top of as we
speak!), there's going to be some source files in ruby. Given that we don't
currently have (afaik) any code in languages other than java, there isn't a
clear place to put this. The files are such that they can be packaged in
pig.jar and referenced via that (hooray for jruby), but we need a home for
them.

The ideal would be src/main/ruby/, and move all the java to src/main/java/,
but this seems like a pretty traumatic change at this point to accommodate
one file...even if we add some python and more ruby files, it doesn't seem
worth killing old patches.

We could also do src-ruby in the base dir and just go from there?

Thoughts?
Jon









Re: How Logical Plan Generator works?

2012-01-30 Thread Thejas Nair
See initial sections in 
http://infolab.stanford.edu/~olston/publications/vldb09.pdf for overview 
of logical plan.


LogicalPlanGenerator.g is the place where the logical plan is created from 
the parse tree. You would need to look at antlr basics to understand that.


(almost?) all pig relational operations correspond to a subclass of 
LogicalRelationalOperator in org.apache.pig.newplan.logical.relational 
package. Expressions within a relation are subclasses of 
LogicalExpressionOperator.



This document talks about motivations behind the logical plan redesign 
and about some special operations like LOInnerLoad, and special handling 
for foreach operator.


http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite

-Thejas


On 1/29/12 8:41 PM, Prasanth J wrote:

Hello Everyone

I am a newbie to pig. I was going through
https://cwiki.apache.org/PIG/guide-for-new-contributors.html, specifically
the grammar files to start off with.

I could not understand how LogicalPlanGenerator.g works by looking into the
grammar file. Also there isn't much documentation available which explains
how logical plans are generated for different pig operators. Is there any
reference from which I can learn more about the internals (especially the
logical plan generation part)?

Thanks
Prasanth





Re: [VOTE] Release Pig 0.9.2 (candidate 1)

2012-01-20 Thread Thejas Nair

+1
Checked the md5 checksums, keys of all 3 packages. Ran some simple 
queries using the rpm package on a secure and unsecure cluster. Checked 
the -version command.


-Thejas

On 1/18/12 11:21 AM, Daniel Dai wrote:

For your information, I took a shortcut last night to refresh candidate
1 to include 2 Hadoop 23 fixes (PIG-2347-4, PIG-2480). If you downloaded the
candidate yesterday, you may need to download it again.

Thanks,
Daniel

On Tue, Jan 17, 2012 at 5:16 PM, Daniel Dai da...@hortonworks.com wrote:


Hi,

I have created a candidate build for Pig 0.9.2. This is the second
maintenance release of Pig 0.9.

The rat report showed no issues in Java files outside of build directory.

Keys used to sign the release are available at
http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup.

Please download, test, and try it out:
http://people.apache.org/%7Edaijy/pig-0.9.2-candidate-0
http://people.apache.org/~daijy/pig-0.9.2-candidate-1/

Should we release this? Vote closes on this Friday EOD, Jan 20th.

Thanks,
Daniel






problems with @hortonworks.com email and apache mailing lists?

2012-01-09 Thread Thejas Nair
This is the 2nd apache user group that has reported that emails to my 
@hortonworks.com address are bouncing.


Is anybody else seeing this ? Any way to fix it ?
I tried searching for a solution for this, but didn't find any.

-Thejas



 Original Message 
Subject: warning from u...@pig.apache.org
Date: 9 Jan 2012 16:30:57 -
From: user-h...@pig.apache.org
To: the...@hortonworks.com

Hi! This is the ezmlm program. I'm managing the
u...@pig.apache.org mailing list.

I'm working for my owner, who can be reached
at user-ow...@pig.apache.org.


Messages to you from the user mailing list seem to
have been bouncing. I've attached a copy of the first bounce
message I received.

If this message bounces too, I will send you a probe. If the probe bounces,
I will remove your address from the user mailing list,
without further notice.


I've kept a list of which messages from the user mailing list have
bounced from your address.

Copies of these messages may be in the archive.
To retrieve a set of messages 123-145 (a maximum of 100 per request),
send a short message to:
   user-get.123_...@pig.apache.org

To receive a subject and author list for the last 100 or so messages,
send a short message to:
   user-in...@pig.apache.org

Here are the message numbers:

   7950

--- Enclosed is a copy of the bounce message I received.

Return-Path: 
Received: (qmail 9035 invoked for bounce); 29 Dec 2011 00:09:21 -
Date: 29 Dec 2011 00:09:21 -
From: mailer-dae...@apache.org
To: user-return-79...@pig.apache.org
Subject: failure notice

Hi. This is the qmail-send program at apache.org.
I'm afraid I wasn't able to deliver your message to the following addresses.
This is a permanent error; I've given up. Sorry it didn't work out.

the...@hortonworks.com:
74.125.53.26 failed after I sent the message.
Remote host said: 550 5.7.1 Unauthenticated email is not accepted from 
this domain. d6si14020775pbk.191




Re: Next Pig release proposal

2011-10-25 Thread Thejas Nair
 changes.

Nov release will be 1.0.0, Feb release will be 1.1.0. There will be 1.0.1,
1.1.1 etc. for bug fixes.

I personally prefer scheme 2; increasing the major version too
frequently might be confusing to users. How do other folks feel?

Daniel


On Sat, Oct 22, 2011 at 2:31 AM, Gianmarco De Francisci Morales
g...@apache.org wrote:

Hi,

just my 2 cents.

I think the issue here is not 1.0 vs 0.10, but what's the versioning scheme
we want to use for Pig.
Up to now it has been just an increasing number after a '0.' prefix, changed
when the community felt it was time. I think this works well for a small
project, but it is somewhat fuzzy.

I like the idea of having major.minor.patch versions like many other
projects. It's a very clear and almost standard way of versioning a piece of
software. It has clear rules on when to change each of the numbers, and lets
the user get an idea of backward compatibility at a glance.

So, to conclude, I am in favor of going 1.0 (or 1.0.0) as long as we decide
a clear versioning policy (whichever it is).
So that the 1.0 milestone would mark the beginning of our new policy.

Cheers,
--
Gianmarco



On Fri, Oct 21, 2011 at 23:10, milind.bhandar...@emc.com wrote:

If one were to rewrite input and output formats to use the webhdfs://
APIs, this would not be an issue, right ?

- milind


On 10/21/11 1:50 PM, Santhosh Srinivasan s...@yahoo-inc.com wrote:

If I was not clear in my earlier email, I apologize for the lack of
clarity. I am no longer in favour of waiting for Hadoop API stability
across Hadoop versions. It's a pipe dream.

When we had PigInputFormat and PigOutputFormat, your reasoning would be
spot on. I am concerned about the following. Our tight integration with
Hadoop due to the use of Input and Output format might lead to a break in
backward compatibility. I am not sure if the comparison with that of Java
is valid. Probably a majority of the users don't use JNI. It's very hard
to use Pig without writing custom load and store functions. The default
load and store don't suffice for a majority of use cases that I have
observed.

I am trying to get all factors that might influence this decision. From
the few emails that have been exchanged since yesterday, we have the
following factors:

1. Hadoop 0.20.205 (support for Append)
2. Hadoop 0.22
3. Hadoop 0.23
4. Maturity of the new parser
5. Stability of the new logical plan
6. Other components in the eco system.
   - Avro (1.5.4, 1.4.1, ...)
   - Cassandra (1.0.0, 0.8.7, ...)
   - Chukwa (0.4.0, 0.3.0, ...)
   - Hama (0.3.0, 0.2.0, ...)
   - Hbase (0.90.4, 0.90.3, 0.90.2, 0.90.1, ...)
   - Hive (Releases - 0.7.1, 0.7.0, 0.6.0, ...)
   - Zookeeper (3.3.3, 3.3.2, 3.2.2, 3.1.2, ...)

Santhosh


-Original Message-
From: Thejas Nair [mailto:the...@hortonworks.com]
Sent: Friday, October 21, 2011 11:22 AM
To: dev@pig.apache.org
Subject: Re: Next Pig release proposal

Santosh,
I thought you meant API stability for hadoop across major versions, but I
guess you are referring to stability within 0.23 versions. But the argument
applies to that as well: if 0.23.1 is not compatible with 0.23.0, we need
to call the release for 0.23.1 'pig 1.x for 0.23.1 api'.

We just need to communicate to the users that the
InputFormat/OutputFormat APIs (and anything else we expose from
hadoop) depend on the hadoop version they are using.

I think it is just like different JNI libraries that you would write for
different OSs. But the java version remains the same across OSs.


-Thejas


On 10/21/11 10:59 AM, Santhosh Srinivasan wrote:


Thejas,

I guess you did not read my email completely. You are referring to the
premise without examining the conclusion. I am repasting my entire email
to avoid confusion (I hate truncated references). If you could respond
again, it will bring us onto the same page.

email

Ref: http://tinyurl.com/4ng8upa (last discussion on 1.0)

How far have we progressed from our last discussion in March? There was
no consensus on the 1.0 release. Opinions ranged from having more
releases to bake in the maturity of the new parser and logical plan
changes to compatibility with Hadoop API (was compared to Social
Security - a very hot topic these days).

My concerns were around Hadoop API stability. I have heard that the APIs
will not be stable for at least 1 year. This is taking me away from
the Hadoop API stability factor (They passed healthcare in that
duration. Really!) Do we want compatibility with 0.23 as a gating factor
- not sure if this is anywhere close to getting done in the near future.
Will we support append (0.20.205)?

Btw, Hbase has been doing 0.90.1, 0.90.2, etc. So we can take a look at
this option too.


Santhosh



-Original Message-
From: Olga Natkovich [mailto:ol...@yahoo-inc.com]
Sent: Thursday

Re: Next Pig release proposal

2011-10-24 Thread Thejas Nair

On 10/24/11 12:43 PM, Dmitriy Ryaboy wrote:

We are finding a fair number of issues trying to move from Pig 0.8.1 to 0.9,
and I don't think those issues are fixed in 10, either.. not sure that this
stabilization process has happened yet.

D



What kind of issues are these ? Are they related to major changes in 0.8 
(logical plan) or 0.9 (antlr parser, or semantic cleanup (in terms of 
backward compat) ) ?


-Thejas


Re: Next Pig release proposal

2011-10-24 Thread Thejas Nair

Dmitriy,
I think what you are saying is something similar to alpha/beta releases. 
(maybe beta1, beta2 .. is better).
So the first release could be 1.0.0_beta1. This scheme will be easier for 
users to understand.
But I am not sure what the criteria for promoting a release from betaX 
to general release should be.



Thanks,
Thejas


On 10/24/11 5:38 PM, Dmitriy Ryaboy wrote:

To be a little more concrete about what I am saying here -- I don't think we
should put a 1.0 label on any *.0 release. 0.8.1 is pretty solid; 0.9.0
has some holes, 0.9.1 is better. If we put 1.0 on what is currently being
thought of as 0.10, it will have some stability / usability issues (things
tend to show up after we make a release and people in the wild start trying
it), and those issues will make a poor impression on those who expect 1.0 to
be shiny and polished after so much time. I'm in favor of waiting a couple
of dot releases, promoting a stabilized release into 1.0, and going from
there. So, pictorially:

-- trunk ------------ 0.11-dev -------- 0.12-dev --| 1.2-dev!
      \                  \
       \                  \--- 0.11.0 | 1.1.0!
        \
         \--- 0.10.0 --- 0.10.1 --- 0.10.2 | 1.0.0 !!

On Mon, Oct 24, 2011 at 12:43 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote:


I am good with Scheme 2.

We are finding a fair number of issues trying to move from Pig 0.8.1 to
0.9, and I don't think those issues are fixed in 10, either.. not sure that
this stabilization process has happened yet.

D


On Mon, Oct 24, 2011 at 11:59 AM, Daniel Dai da...@hortonworks.com wrote:


Yes, we need a versioning scheme. There are two versioning schemes I can
think of:

Scheme 1:
major.patch
major will be the feature-rich release every 3 months
patch will be the bug fix release when necessary

Nov release will be 1.0, Feb release will be 2.0. There will be 1.1, 2.1
etc. for bug fixes.

Scheme 2:
major.minor.patch
Most of our 3-month releases will be counted as minor releases unless
there are major user-facing/disruptive changes.

Nov release will be 1.0.0, Feb release will be 1.1.0. There will be 1.0.1,
1.1.1 etc. for bug fixes.
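
The two schemes above can be restated as a minimal Python sketch. The function
names and the `disruptive` parameter are hypothetical, introduced only for
illustration; nothing here is part of Pig or its release tooling. It only
generates the version label of each quarterly feature release under each scheme.

```python
def scheme1_versions(feature_releases):
    """Scheme 1 (major.patch): every quarterly feature release bumps major."""
    return [f"{major}.0" for major in range(1, feature_releases + 1)]


def scheme2_versions(feature_releases, disruptive=()):
    """Scheme 2 (major.minor.patch): quarterly releases bump minor,
    unless the release index is listed in `disruptive` (hypothetical flag
    for major user-facing changes), in which case major is bumped instead.
    """
    major, minor = 1, 0
    versions = [f"{major}.{minor}.0"]
    for index in range(1, feature_releases):
        if index in disruptive:
            major, minor = major + 1, 0
        else:
            minor += 1
        versions.append(f"{major}.{minor}.0")
    return versions


# Nov and Feb releases under each scheme, as described in the email.
print(scheme1_versions(2))
print(scheme2_versions(2))
```

Under these assumptions the first call yields 1.0 and 2.0, and the second
yields 1.0.0 and 1.1.0, matching the Nov/Feb examples given for each scheme.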

I personally prefer scheme 2; increasing the major version too frequently
might be confusing to users. How do other folks feel?

Daniel


On Sat, Oct 22, 2011 at 2:31 AM, Gianmarco De Francisci Morales
g...@apache.org  wrote:


Hi,

just my 2 cents.

I think the issue here is not 1.0 vs 0.10, but what's the versioning scheme
we want to use for Pig.
Up to now it has been just an increasing number after a '0.' prefix, changed
when the community felt it was time. I think this works well for a small
project, but it is somewhat fuzzy.

I like the idea of having major.minor.patch versions like many other
projects. It's a very clear and almost standard way of versioning a piece of
software. It has clear rules on when to change each of the numbers, and lets
the user get an idea of backward compatibility at a glance.

So, to conclude, I am in favor of going 1.0 (or 1.0.0) as long as we decide
a clear versioning policy (whichever it is).
So that the 1.0 milestone would mark the beginning of our new policy.

Cheers,
--
Gianmarco



On Fri, Oct 21, 2011 at 23:10, milind.bhandar...@emc.com wrote:


If one were to rewrite input and output formats to use the webhdfs://
APIs, this would not be an issue, right ?

- milind


On 10/21/11 1:50 PM, Santhosh Srinivasan s...@yahoo-inc.com wrote:


If I was not clear in my earlier email, I apologize for the lack of
clarity. I am no longer in favour of waiting for Hadoop API stability
across Hadoop versions. It's a pipe dream.

When we had PigInputFormat and PigOutputFormat, your reasoning would be
spot on. I am concerned about the following. Our tight integration with
Hadoop due to the use of Input and Output format might lead to a break in
backward compatibility. I am not sure if the comparison with that of Java
is valid. Probably a majority of the users don't use JNI. It's very hard
to use Pig without writing custom load and store functions. The default
load and store don't suffice for a majority of use cases that I have
observed.

I am trying to get all factors that might influence this decision. From
the few emails that have been exchanged since yesterday, we have the
following factors:

1. Hadoop 0.20.205 (support for Append)
2. Hadoop 0.22
3. Hadoop 0.23
4. Maturity of the new parser
5. Stability of the new logical plan
6. Other components in the eco system.
   - Avro (1.5.4, 1.4.1, ...)
   - Cassandra (1.0.0, 0.8.7, ...)
   - Chukwa (0.4.0, 0.3.0, ...)
   - Hama (0.3.0, 0.2.0, ...)
   - Hbase (0.90.4, 0.90.3, 0.90.2, 0.90.1, ...)
   - Hive (Releases - 0.7.1, 0.7.0, 0.6.0, ...)
   - Zookeeper (3.3.3, 3.3.2, 3.2.2, 3.1.2, ...)

Santhosh


-Original Message-
From: Thejas Nair [mailto:the...@hortonworks.com]
Sent: Friday, October 21, 2011 11:22 AM
To: dev@pig.apache.org
Subject: Re: Next

LogicalExpressionSimplifier rules

2011-10-10 Thread Thejas Nair

Sending this email for getting wider attention.
I propose disabling the LogicalExpressionSimplifier optimizer rule, because 
the complexity of that rule and the number of bugs that seem to come from 
it are not justified by the expected performance gains - 
https://issues.apache.org/jira/browse/PIG-2316?focusedCommentId=13124489&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13124489


In general, I think any new code that is significantly complex (ie hard 
to maintain, and likely source of bugs) should be added to pig only if 
there are enough gains to justify it.


-Thejas




Re: Review Request: Using COR function in Piggybank results in ERROR 2018: Internal error. Unable to introduce the combiner for optimization

2011-09-20 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1929/#review1974
---



trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java
https://reviews.apache.org/r/1929/#comment4462

I think a comment will be useful - 
// The algebraic udf can have more than one input. Add the udf only once



trunk/src/org/apache/pig/builtin/COR.java
https://reviews.apache.org/r/1929/#comment4463

The size of the tuple would need to be size*(size-1).
Details -
the inner loop is executed - (n-1) + (n-2) + .. (n - (n-1)) = n(n-1)/2 . 
Each time the inner loop is executed two columns are being added. So 2 * 
n(n-1)/2 = n(n-1)
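
The count above can be checked with a short Python sketch. It assumes only
what the comment states: the inner loop runs once per unordered pair of
columns and appends two values each time. The function name is hypothetical
and the loop is a stand-in, not the actual code in COR.java.

```python
def values_appended(n):
    """Count values appended when each unordered pair (i, j) with i < j
    contributes two values, mirroring the loop shape described above."""
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            count += 2  # two columns added per inner-loop execution
    return count


# The closed form n(n-1) should agree with the loop count for every n.
for n in range(1, 8):
    assert values_appended(n) == n * (n - 1)
```

For example, with 3 input columns there are 3 pairs and therefore 6 values,
which is size*(size-1) for size = 3.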




trunk/src/org/apache/pig/builtin/COR.java
https://reviews.apache.org/r/1929/#comment4464

I don't understand why the values are being added to a tuple as columns. 
That does not look right.



- Thejas


On 2011-09-16 18:11:08, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/1929/
 ---
 
 (Updated 2011-09-16 18:11:08)
 
 
 Review request for pig and Thejas Nair.
 
 
 Summary
 ---
 
 See PIG-2286
 
 
 This addresses bug PIG-2286.
 https://issues.apache.org/jira/browse/PIG-2286
 
 
 Diffs
 -
 
   trunk/src/org/apache/pig/builtin/COR.java 1171325 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 1171325 
   trunk/test/e2e/pig/tests/nightly.conf 1171325 
 
 Diff: https://reviews.apache.org/r/1929/diff
 
 
 Testing
 ---
 
 Unit-test:
 all pass
 
 Piggybank-test:
 TestDBStorage fail for other reason, unrelated to patch
 
 Test-patch:
  [exec] +1 overall.  
  [exec] 
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec] 
  [exec] +1 tests included.  The patch appears to include 3 new or 
 modified tests.
  [exec] 
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec] 
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec] 
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 warnings.
  [exec] 
  [exec] +1 release audit.  The applied patch does not increase the 
 total number of release audit warnings.
 
 
 Thanks,
 
 Daniel
 




Re: Review Request: PIG-2228: support partial aggregation in map task

2011-09-15 Thread Thejas Nair


 On 2011-09-13 09:15:46, Dmitriy Ryaboy wrote:
  trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLocalRearrange.java,
   line 296
  https://reviews.apache.org/r/1817/diff/1/?file=40193#file40193line296
 
  Not sure about the value of this comment :)

cleaning that


- Thejas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1817/#review1868
---


On 2011-09-15 17:27:08, Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/1817/
 ---
 
 (Updated 2011-09-15 17:27:08)
 
 
 Review request for pig, Daniel Dai and Dmitriy Ryaboy.
 
 
 Summary
 ---
 
 See PIG-2228
 
 
 This addresses bug PIG-2228.
 https://issues.apache.org/jira/browse/PIG-2228
 
 
 Diffs
 -
 
   trunk/conf/pig.properties 1170885 
   trunk/src/org/apache/pig/Algebraic.java 1170885 
   trunk/src/org/apache/pig/Main.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/EndOfAllInputSetter.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLocalRearrange.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java PRE-CREATION 
   trunk/src/org/apache/pig/data/DefaultTuple.java 1170885 
   trunk/src/org/apache/pig/data/InternalCachedBag.java 1170885 
   trunk/src/org/apache/pig/data/InternalDistinctBag.java 1170885 
   trunk/src/org/apache/pig/data/InternalSortedBag.java 1170885 
   trunk/src/org/apache/pig/data/SelfSpillBag.java PRE-CREATION 
   trunk/src/org/apache/pig/data/SizeUtil.java PRE-CREATION 
   trunk/src/org/apache/pig/data/SortedSpillBag.java 1170885 
   trunk/src/org/apache/pig/tools/pigstats/ScriptState.java 1170885 
   trunk/test/e2e/pig/tests/nightly.conf 1170885 
   trunk/test/org/apache/pig/test/TestDataBag.java 1170885 
   trunk/test/org/apache/pig/test/TestPOPartialAgg.java PRE-CREATION 
   trunk/test/org/apache/pig/test/TestPOPartialAggPlan.java PRE-CREATION 
   trunk/test/org/apache/pig/test/Util.java 1170885 
   trunk/test/org/apache/pig/test/utils/GenPhyOp.java 1170885 
 
 Diff: https://reviews.apache.org/r/1817/diff
 
 
 Testing
 ---
 
 test-patch 
  [exec] -1 overall.
  [exec]
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec]
  [exec] +1 tests included.  The patch appears to include 21 new or 
 modified tests.
  [exec]
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec]
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec]
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 warnings.
  [exec]
  [exec] -1 release audit.  The applied patch generated 461 release 
 audit warnings (more than the trunk's current 455 warnings).
 release audit failures are because of jdiff changes
 
 All  unit tests pass, new e2e tests added .
 
 
 Thanks,
 
 Thejas
 




Re: Review Request: PIG-2228: support partial aggregation in map task

2011-09-15 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1817/#review1916
---



trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java
https://reviews.apache.org/r/1817/#comment4397

removed the extra ; in the patch checked in.



- Thejas


On 2011-09-15 17:27:08, Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/1817/
 ---
 
 (Updated 2011-09-15 17:27:08)
 
 
 Review request for pig, Daniel Dai and Dmitriy Ryaboy.
 
 
 Summary
 ---
 
 See PIG-2228
 
 
 This addresses bug PIG-2228.
 https://issues.apache.org/jira/browse/PIG-2228
 
 
 Diffs
 -
 
   trunk/conf/pig.properties 1170885 
   trunk/src/org/apache/pig/Algebraic.java 1170885 
   trunk/src/org/apache/pig/Main.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/EndOfAllInputSetter.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLocalRearrange.java 1170885 
   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java PRE-CREATION 
   trunk/src/org/apache/pig/data/DefaultTuple.java 1170885 
   trunk/src/org/apache/pig/data/InternalCachedBag.java 1170885 
   trunk/src/org/apache/pig/data/InternalDistinctBag.java 1170885 
   trunk/src/org/apache/pig/data/InternalSortedBag.java 1170885 
   trunk/src/org/apache/pig/data/SelfSpillBag.java PRE-CREATION 
   trunk/src/org/apache/pig/data/SizeUtil.java PRE-CREATION 
   trunk/src/org/apache/pig/data/SortedSpillBag.java 1170885 
   trunk/src/org/apache/pig/tools/pigstats/ScriptState.java 1170885 
   trunk/test/e2e/pig/tests/nightly.conf 1170885 
   trunk/test/org/apache/pig/test/TestDataBag.java 1170885 
   trunk/test/org/apache/pig/test/TestPOPartialAgg.java PRE-CREATION 
   trunk/test/org/apache/pig/test/TestPOPartialAggPlan.java PRE-CREATION 
   trunk/test/org/apache/pig/test/Util.java 1170885 
   trunk/test/org/apache/pig/test/utils/GenPhyOp.java 1170885 
 
 Diff: https://reviews.apache.org/r/1817/diff
 
 
 Testing
 ---
 
 test-patch 
  [exec] -1 overall.
  [exec]
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec]
  [exec] +1 tests included.  The patch appears to include 21 new or 
 modified tests.
  [exec]
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec]
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec]
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 warnings.
  [exec]
  [exec] -1 release audit.  The applied patch generated 461 release 
 audit warnings (more than the trunk's current 455 warnings).
 release audit failures are because of jdiff changes
 
 All  unit tests pass, new e2e tests added .
 
 
 Thanks,
 
 Thejas
 




going to request yourkit license for committers

2011-08-29 Thread Thejas Nair

FYI-

YourKit is a very useful Java profiling tool, and they give free licenses 
for use by open source projects.

I am planning to request a license for use by pig committers.

But they need a reference from the web pages of the project to their 
website - http://www.yourkit.com/purchase/index.jsp . I believe a link 
from a credits page should be sufficient.
As the project would need to thank them, I am sharing my plan before 
contacting them.


Thanks,
Thejas


Re: Review Request: Limit produce wrong number of records after foreach flatten

2011-08-24 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1627/#review1621
---

Ship it!


+1

- Thejas


On 2011-08-23 17:08:10, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/1627/
 ---
 
 (Updated 2011-08-23 17:08:10)
 
 
 Review request for pig and Thejas Nair.
 
 
 Summary
 ---
 
 See PIG-2231
 
 
 This addresses bug PIG-2231.
 https://issues.apache.org/jira/browse/PIG-2231
 
 
 Diffs
 -
 
   
   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 1160494 
   trunk/test/org/apache/pig/test/TestEvalPipeline2.java 1160494 
 
 Diff: https://reviews.apache.org/r/1627/diff
 
 
 Testing
 ---
 
 test-patch:
  [exec] +1 overall.  
  [exec] 
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec] 
  [exec] +1 tests included.  The patch appears to include 3 new or 
 modified tests.
  [exec] 
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec] 
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec] 
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 warnings.
  [exec] 
  [exec] +1 release audit.  The applied patch does not increase the 
 total number of release audit warnings.
 
 Unit test:
 all pass
 
 
 Thanks,
 
 Daniel
 




Re: Review Request: NullPointerException while Accessing Empty Bag in FOREACH { FILTER }

2011-08-19 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1600/#review1571
---

Ship it!


+1

- Thejas


On 2011-08-19 20:36:09, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/1600/
 ---
 
 (Updated 2011-08-19 20:36:09)
 
 
 Review request for pig and Thejas Nair.
 
 
 Summary
 ---
 
 See PIG-2185
 
 
 This addresses bug PIG-2185.
 https://issues.apache.org/jira/browse/PIG-2185
 
 
 Diffs
 -
 
   
   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java 1159742 
   trunk/test/org/apache/pig/test/TestEvalPipeline2.java 1159742 
 
 Diff: https://reviews.apache.org/r/1600/diff
 
 
 Testing
 ---
 
 test-patch pass:
 [exec] +1 overall.
 [exec]
 [exec] +1 @author. The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included. The patch appears to include 3 new or modified 
 tests.
 [exec]
 [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
 [exec]
 [exec] +1 javac. The applied patch does not increase the total number of 
 javac compiler warnings.
 [exec]
 [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
 [exec]
 [exec] +1 release audit. The applied patch does not increase the total number 
 of release audit warnings.
 
 Unit tests pass.
 
 
 Thanks,
 
 Daniel
 




Re: Failing tests after parser change?

2011-08-11 Thread Thejas Nair

Dmitriy,
You don't realize how lucky you are! ;)
I have been trying hard to reproduce this problem, so that I can check 
whether the patch in PIG-2055 actually fixes the issue. I ran build + 
(small) test in a loop 2000+ times, and this hasn't happened yet.


If this is happening (almost) consistently, can you try the patch in 
PIG-2055 and see if that helps?


Thanks,
Thejas



On 8/11/11 9:44 AM, Alan Gates wrote:

This looks like the intermittent Antlr bug we're seeing 
(https://issues.apache.org/jira/browse/PIG-2055).  We're testing other versions 
of Antlr to try to fix this, but until we find one that addresses the issue the 
only solution is to do ant clean, and then rebuild and see if it goes away.  We 
have also noticed it happens more often when built on Mac than on Linux, if you 
happen to have a Linux box you could build on.

Alan.

On Aug 10, 2011, at 11:24 PM, Dmitriy Ryaboy wrote:


HBaseStorage is failing, and it's not something we did to HBaseStorage...
Looks like the parser.

Any takers?

Testcase: testStoreToHBase_2_with_projection took 0.34 sec
Caused an ERROR
Error during parsing.line 1, column 84   mismatched input '(' expecting SEMI_COLON
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing.line 1, column 84   mismatched input '(' expecting SEMI_COLON
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1597)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1540)
at org.apache.pig.PigServer.registerQuery(PigServer.java:540)
at org.apache.pig.PigServer.registerQuery(PigServer.java:553)
at org.apache.pig.test.TestHBaseStorage.scanTable1(TestHBaseStorage.java:771)
at org.apache.pig.test.TestHBaseStorage.scanTable1(TestHBaseStorage.java:767)
at org.apache.pig.test.TestHBaseStorage.testStoreToHBase_2_with_projection(TestHBaseStorage.java:706)
Caused by: Failed to parse:line 1, column 84   mismatched input '(' expecting SEMI_COLON
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:222)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:164)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1589)






Please welcome pig's newest committer -  Gianmarco De Francisci Morales

2011-08-05 Thread Thejas Nair

Dear pig community,
Please welcome Gianmarco as the newest committer to the Apache Pig project!
He has been contributing to Pig for more than a year. His contributions 
include the use of a binary comparator in secondary sort, support for 
default output in the split operator, use of scalar expressions in 
limit/sample, and several other bug fixes. He has also been helping users 
out on the mailing lists.


Congratulations Gianmarco!

- Thejas


Re: [VOTE] Release Pig 0.9.0 (candidate 1)

2011-07-26 Thread Thejas Nair

+1
Ran queries in local mode on Mac, ran test-commit, and verified the md5 checksum.
-Thejas

On 7/22/11 4:24 PM, Alan Gates wrote:

+1.

Ran the test-commit, tutorial, and quick sanity test against a real cluster on 
Linux, ran a quick sanity test in local mode on Mac.  Checked signature key and 
md5.

Alan.

On Jul 22, 2011, at 2:12 PM, Olga Natkovich wrote:


I have created the second candidate build for the Pig 0.9.0 release. This 
release introduces control structures, changes the query parser, and performs 
semantic cleanup.



The rat report showed no issues in Java files outside of build directory.



Keys used to sign the release are available at 
http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup.



Please try it out: http://people.apache.org/~olga/pig-0.9.0-candidate-1/



Should we release this? Vote closes on Wednesday, July 27.



Olga









Re: Review Request: Project UDF output inside a non-foreach statement fail on 0.8

2011-07-18 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/767/#review1105
---

Ship it!


+1

- thejas


On 2011-05-19 22:26:01, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/767/
 ---
 
 (Updated 2011-05-19 22:26:01)
 
 
 Review request for pig and thejas.
 
 
 Summary
 ---
 
 See PIG-2077
 
 
 This addresses bug PIG-2077.
 https://issues.apache.org/jira/browse/PIG-2077
 
 
 Diffs
 -
 
   
   branches/branch-0.8/src/org/apache/pig/newplan/logical/LogicalExpPlanMigrationVistor.java 1104455 
   branches/branch-0.8/test/org/apache/pig/test/TestEvalPipeline2.java 1104455 
 
 Diff: https://reviews.apache.org/r/767/diff
 
 
 Testing
 ---
 
 Test patch:
  [exec] +1 overall.  
  [exec] 
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec] 
  [exec] +1 tests included.  The patch appears to include 3 new or 
 modified tests.
  [exec] 
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec] 
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec] 
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 warnings.
  [exec] 
  [exec] +1 release audit.  The applied patch does not increase the 
 total number of release audit warnings.
 
 Unit test:
 all pass
 
 End to end test:
 all pass
 
 
 Thanks,
 
 Daniel
 




Re: Cubing in Pig

2011-07-14 Thread Thejas Nair
+1 to what Gianmarco said about the place to do it.  See sample_clause 
in LogicalPlanGenerator.g.


I tried the expanded query (2 dimensions) with 0.8. It results in only 2 
MR jobs: the 1st MR job does all of the computation, and the 2nd MR job 
just concatenates the outputs into one file. See 
http://pastebin.com/aarBELC2. I got an exception in 0.9 for the same 
query; I have created a jira (PIG-2164) to address that.


The CubeDimensions udf would be a nice way to get around a combiner 
issue, but the combiner issue (if any) should actually get fixed.


In the example, you are putting all records into the same file. That 
would lead to a problem, because it will not be possible to distinguish 
the records for (group by (a,b)) that have a null value for b from the 
records for (group by (a,null)). If all inputs go into the same file, it 
would need a marker column to indicate the input each record belongs to.
I think that in most cases people would read the results of different 
group-by combinations separately, so it makes sense to have different 
output files (e.g., 8 files if there are 3 dimensions). That is, a split 
on the marker column might have to be introduced.
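
To make the ambiguity concrete, here is a minimal Python sketch (an illustration only, not Pig code and not anything from the patch or the pastebin). The marker labels and the helper names cube_counts and split_by_marker are hypothetical. It assumes a 2-dimension cube with a count aggregate:

```python
from collections import Counter, defaultdict

def cube_counts(rows):
    """Count rows per (marker, a, b) key for a 2-dimension cube.

    The marker (a hypothetical label) records which group-by produced the key.
    """
    counts = Counter()
    for a, b in rows:
        counts[("ab", a, b)] += 1          # group by (a, b)
        counts[("a_only", a, None)] += 1   # group by (a, null)
        counts[("b_only", None, b)] += 1   # group by (null, b)
        counts[("all", None, None)] += 1   # group by (null, null)
    return counts

def split_by_marker(counts):
    """Split the combined result into one mapping per group-by."""
    outputs = defaultdict(dict)
    for (marker, a, b), cnt in counts.items():
        outputs[marker][(a, b)] = cnt
    return dict(outputs)

rows = [(1, None), (1, 2)]  # the first row has a genuine null in b
outputs = split_by_marker(cube_counts(rows))
print(outputs["ab"][(1, None)])      # 1: rows where b really is null
print(outputs["a_only"][(1, None)])  # 2: all rows with a == 1
```

Without the marker, both results would be stored under the same key (1, null), and a reader could not tell the two counts apart.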




Thanks,
Thejas




On 7/13/11 6:05 PM, Dmitriy Ryaboy wrote:

Arnab has a really interesting presentation at the post-hadoop-summit
Pig meeting about how Cubing could work in Map-Reduce, and suggested a
straightforward path to integrating into Pig. Arnab, do you have the
presentation posted somewhere?

In any case, I started mucking around a little with this, trying to
hack in the naive solution.

So far, one interesting result, followed by a question:

I manually cubed by writing a bunch of group-bys, like so (using pig 8) :

ab = foreach (group rel by (a, b)) generate flatten(group) as (a, b),
COUNT_STAR(rel) as cnt;
a_only = foreach (group rel by (a, null)) generate flatten(group) as
(a, b), COUNT_STAR(rel) as cnt;
b_only = foreach (group rel by (null, b)) generate flatten(group) as
(a, b), COUNT_STAR(rel) as cnt;
total = foreach (group rel by (null, null)) generate flatten(group) as
(a, b), COUNT_STAR(rel) as cnt;
cube = union ab, a_only, b_only, total;
store cube 

Except, for extra fun, I did this with 3 dimensions and therefore 8
groupings. This generated 4 MR jobs, the first of which moved all the
data across the wire despite the fact that COUNT_STAR is algebraic. On
my test dataset, the work took 18 minutes.

I then wrote a UDF that given a tuple, created all the cube dimensions
of the tuple -- so CubeDimensions(a, b) returns { (a, b), (a, null),
(null, b), (null, null) }, and this works on any number of dimensions.
The naive cube then simply becomes this:

cubed = foreach rel generate flatten(CubeDimensions(a, b));
cube = foreach (group cubed by $0) generate flatten(group) as (a, b),
COUNT_STAR(cubed);

On the same dataset, this generated only 1 MR job, and ran in 3
minutes because we were able to take advantage of the combiners!
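
As a rough illustration of what such a UDF computes, here is a minimal Python sketch. It is not the actual UDF, which is not shown in this thread. The function names and the all_marker parameter are assumptions made for the example:

```python
from collections import Counter
from itertools import product

def cube_dimensions(values, all_marker=None):
    """Return every combination in which each value is either kept
    or replaced by all_marker (2**n tuples for n dimensions)."""
    choices = [(v, all_marker) for v in values]
    return [tuple(combo) for combo in product(*choices)]

def naive_cube_count(rows, all_marker=None):
    """Expand each row into its cube keys, then count per key.

    This mirrors the flatten + group + COUNT_STAR pattern: the expansion
    is per-row work, and the count is a plain algebraic aggregate.
    """
    counts = Counter()
    for row in rows:
        for key in cube_dimensions(row, all_marker):
            counts[key] += 1
    return counts

print(cube_dimensions(("a", "b")))
# [('a', 'b'), ('a', None), (None, 'b'), (None, None)]
```

Because the expansion happens before the grouping, a single group-by with a partial count per key is enough. That is presumably why the combiner can be used here.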

Assuming algebraic aggregations, this is actually pretty good given
how little work it involves.

I looked at adding a new operator that would be (for now) syntactic
sugar around this pattern -- basically, CUBE rel by (a, b, c) would
insert the operators equivalent to the code above.

I can muddle my way through the grammar. What's the appropriate place
to put the translation logic? Logical to physical compiler? Optimizer?
The LogicalPlanBuilder?

D




Re: Cubing in Pig

2011-07-14 Thread Thejas Nair

On 7/14/11 3:03 PM, Dmitriy Ryaboy wrote:

In the dw world, using a single table and using null as an all marker is the 
standard thing to do.


But I imagine that in the dw world, the cube results would get stored in 
such a way that you can efficiently retrieve results of specific 
group-bys (partitions?). That would be similar to storing results of 
different group-bys operations in different output files.


On the other hand, it's possible that the results of most cube operations 
are small enough that you could do the rest of the processing using 
an Excel spreadsheet! (So partitioning would not matter.)



In my udf I actually allow an optional string to be passed to the constructor 
to denote all if null is a valid value... I'll post the udf shortly, it's a 
prerequisite to LOCube.

If the results of all group-bys are stored together, I think some such 
feature to indicate whether it's actually a null or a '*' (the 'all' marker 
symbol used in Arnab's presentation) will be essential.




I suspect the case of splitting out the agg levels is actually more rare, and 
can easily be accomplished with a SPLIT operator.
The other nice thing about the udf is how much code it saves, especially for 
larger numbers of dimensions.


The udf code saving is important if the script is being written 
manually. But if pig is doing automatic translation (and assuming 
multiple output files is what makes sense), translating into multiple 
group-by statements might be more efficient, as it can avoid the 
filtering that would need to be done for split.
But I agree that implementing this feature using udf is going to be 
easier. Any changes to make it more efficient can be done later.




Perhaps my sample script generated 4 jobs because I had 3 dimensions?



I doubt that it is because of the number of dimensions. I think there might 
have been something else in the query that prevented the group-bys from 
being combined together.
Do you still have the original script? Can you send the script (and 
maybe the explain output)?


Thanks,
Thejas





On Jul 14, 2011, at 4:10 PM, Thejas Nair the...@hortonworks.com wrote:



Re: Pig testing proposal

2011-07-14 Thread Thejas Nair

On 7/14/11 2:39 PM, Alan Gates wrote:

I have posted a proposal for changes in Pig's testing that I would like to 
make.  https://cwiki.apache.org/confluence/display/PIG/PigTestProposal  Please 
take a look and provide feedback.

Alan.


+1 for the proposal.
-Thejas



Re: Pig testing proposal

2011-07-14 Thread Thejas Nair
I think having SQL as a way to generate benchmark has some value, and we 
should be open to having that option in e2e harness as well. But I don't 
see that as a blocker.


In some cases, I would expect that writing an alternative pig-latin 
query to generate the benchmark might not be easy, and there is also the 
danger that the alternative script has the same bug, which would result 
in buggy benchmark data.


-Thejas



On 7/14/11 3:51 PM, Thejas Nair wrote:

On 7/14/11 2:39 PM, Alan Gates wrote:

I have posted a proposal for changes in Pig's testing that I would
like to make.
https://cwiki.apache.org/confluence/display/PIG/PigTestProposal Please
take a look and provide feedback.

Alan.


+1 for the proposal.
-Thejas




Re: Review Request: POProject.getNext(DataBag) does not handle null

2011-05-19 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/763/#review687
---

Ship it!


+1

- thejas


On 2011-05-19 17:46:48, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/763/
 ---
 
 (Updated 2011-05-19 17:46:48)
 
 
 Review request for pig and thejas.
 
 
 Summary
 ---
 
 See PIG-2078
 
 
 This addresses bug PIG-2078.
 https://issues.apache.org/jira/browse/PIG-2078
 
 
 Diffs
 -
 
   
   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java 1100118 
   trunk/test/org/apache/pig/test/TestEvalPipeline2.java 1100118 
 
 Diff: https://reviews.apache.org/r/763/diff
 
 
 Testing
 ---
 
 Test-patch:
  [exec] +1 overall.  
  [exec] 
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec] 
  [exec] +1 tests included.  The patch appears to include 3 new or 
 modified tests.
  [exec] 
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec] 
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec] 
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 warnings.
  [exec] 
  [exec] +1 release audit.  The applied patch does not increase the 
 total number of release audit warnings.
 
 Unit test:
 all pass
 
 End-to-end test:
 all pass
 
 
 Thanks,
 
 Daniel
 




Re: Review Request: complex type casting should return null on casting failure

2011-04-28 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/673/#review606
---

Ship it!


+1

- thejas


On 2011-04-28 20:56:30, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/673/
 ---
 
 (Updated 2011-04-28 20:56:30)
 
 
 Review request for pig and thejas.
 
 
 Summary
 ---
 
 See PIG-1989
 
 
 This addresses bug PIG-1989.
 https://issues.apache.org/jira/browse/PIG-1989
 
 
 Diffs
 -
 
   
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java 1097304 
   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPOCast.java 1097304 
 
 Diff: https://reviews.apache.org/r/673/diff
 
 
 Testing
 ---
 
 Test-patch:
  [exec] +1 overall.  
  [exec] 
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec] 
  [exec] +1 tests included.  The patch appears to include 3 new or 
 modified tests.
  [exec] 
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec] 
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec] 
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 warnings.
  [exec] 
  [exec] +1 release audit.  The applied patch does not increase the 
 total number of release audit warnings.
 
 Unit test:
 all pass
 
 
 Thanks,
 
 Daniel
 




Re: Review Request: incorrect schema shown when project-star is used with other projections

2011-04-19 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/624/#review499
---



http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/visitor/LineageFindRelVisitor.java
https://reviews.apache.org/r/624/#comment1029

If there are multiple group-by columns, the group column will be a tuple. 
This will associate the load function only with the tuple and not with the 
uids of the columns within the tuple.
We need to associate the load function with the inner uids as well, as is 
done in mapMatchLoadFuncToUid.


- thejas


On 2011-04-19 21:20:10, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/624/
 ---
 
 (Updated 2011-04-19 21:20:10)
 
 
 Review request for pig and thejas.
 
 
 Summary
 ---
 
 See PIG-1910
 
 
 This addresses bug PIG-1910.
 https://issues.apache.org/jira/browse/PIG-1910
 
 
 Diffs
 -
 
   
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/DereferenceExpression.java 1095145 
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java 1095145 
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ProjectExpression.java 1095145 
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LOCogroup.java 1095145 
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/visitor/ColumnAliasConversionVisitor.java 1095145 
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/visitor/LineageFindRelVisitor.java 1095145 
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/visitor/UDFFinder.java PRE-CREATION 
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/QueryParserDriver.java 1095145 
   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPigServer.java 1095145 
   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPlanGeneration.java PRE-CREATION 
   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestTypeCheckingValidatorNewLP.java 1095145 
   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/Util.java 1095145 
 
 Diff: https://reviews.apache.org/r/624/diff
 
 
 Testing
 ---
 
 Test-patch:
  [exec] +1 overall.  
  [exec] 
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec] 
  [exec] +1 tests included.  The patch appears to include 12 new or 
 modified tests.
  [exec] 
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec] 
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec] 
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 warnings.
  [exec] 
  [exec] +1 release audit.  The applied patch does not increase the 
 total number of release audit warnings.
 
 Unit test:
 all pass
 
 
 Thanks,
 
 Daniel
 




Re: Review Request: Secondary sort fail when dereferencing two fields inside foreach

2011-04-19 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/621/#review500
---

Ship it!


+1

- thejas


On 2011-04-19 00:37:31, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/621/
 ---
 
 (Updated 2011-04-19 00:37:31)
 
 
 Review request for pig and thejas.
 
 
 Summary
 ---
 
 See PIG-1978
 
 
 This addresses bug PIG-1978.
 https://issues.apache.org/jira/browse/PIG-1978
 
 
 Diffs
 -
 
   
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1091982 
   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestSecondarySort.java 1091982 
 
 Diff: https://reviews.apache.org/r/621/diff
 
 
 Testing
 ---
 
 Test-patch:
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec] 
  [exec] +1 tests included.  The patch appears to include 3 new or 
 modified tests.
  [exec] 
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec] 
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec] 
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 warnings.
  [exec] 
  [exec] +1 release audit.  The applied patch does not increase the 
 total number of release audit warnings.
 
 Unit test:
 all pass
 
 
 Thanks,
 
 Daniel
 




Re: Review Request: New logical plan: Should not push up filter in front of Bincond

2011-04-04 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/544/#review382
---

Ship it!


+1

- thejas


On 2011-04-04 18:10:55, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/544/
 ---
 
 (Updated 2011-04-04 18:10:55)
 
 
 Review request for pig and thejas.
 
 
 Summary
 ---
 
 The following script produces the wrong result:
 
 data = LOAD 'data.txt' using PigStorage() as (referrer:chararray, 
 canonical_url:chararray, ip:chararray);
 best_url = FOREACH data GENERATE ((canonical_url != '' and canonical_url is 
 not null) ? canonical_url : referrer) AS url, ip;
 filtered = FILTER best_url BY url == 'badsite.com';
 dump filtered;
 
 data.txt:
 badsite.com 127.0.0.1
 goodsite.com/1?foo=true goodsite.com 127.0.0.1
 
 Expected:
 (badsite.com,127.0.0.1)
 
 We get nothing.
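
 A minimal Python sketch of why moving the filter in front of the foreach is unsafe here. It is only an illustration under stated assumptions: data.txt is taken to be tab-separated with an empty canonical_url in the first row, and the faulty plan is modeled as if the filter had been rewritten against canonical_url alone. Neither detail is stated in this report, and the function names are hypothetical.

```python
ROWS = [
    # (referrer, canonical_url, ip); empty canonical_url is an assumption
    ("badsite.com", "", "127.0.0.1"),
    ("goodsite.com/1?foo=true", "goodsite.com", "127.0.0.1"),
]

def best_url(row):
    """Mimic the bincond: prefer canonical_url, fall back to referrer."""
    referrer, canonical_url, ip = row
    url = canonical_url if canonical_url not in ("", None) else referrer
    return (url, ip)

def project_then_filter(rows):
    """Intended order: compute url first, then filter on it."""
    return [r for r in map(best_url, rows) if r[0] == "badsite.com"]

def pushed_up_filter(rows):
    """Assumed faulty order: filter on one bincond input, then project."""
    return [best_url(r) for r in rows if r[1] == "badsite.com"]

print(project_then_filter(ROWS))  # [('badsite.com', '127.0.0.1')]
print(pushed_up_filter(ROWS))     # []
```

 The url column depends on both canonical_url and referrer, so a filter on url cannot be expressed as a filter on either input column alone.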
 
 
 This addresses bug PIG-1935.
 https://issues.apache.org/jira/browse/PIG-1935
 
 
 Diffs
 -
 
   
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/BinCondExpression.java 1085215 
   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestNewPlanFilterAboveForeach.java 1085215 
 
 Diff: https://reviews.apache.org/r/544/diff
 
 
 Testing
 ---
 
 test-patch:
  [exec] +1 overall.  
  [exec] 
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec] 
  [exec] +1 tests included.  The patch appears to include 3 new or 
 modified tests.
  [exec] 
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec] 
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec] 
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 warnings.
  [exec] 
  [exec] +1 release audit.  The applied patch does not increase the 
 total number of release audit warnings.
 
 Unit test:
 all pass
 
 End-to-end test:
 all pass
 
 
 Thanks,
 
 Daniel
 




Re: Review Request: Dereference a bag within a tuple does not work

2011-04-01 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/524/#review375
---

Ship it!


+1

- thejas


On 2011-03-24 12:22:48, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/524/
 ---
 
 (Updated 2011-03-24 12:22:48)
 
 
 Review request for pig and thejas.
 
 
 Summary
 ---
 
 The following script does not work (both in new and old logical plan):
 
 a = load '1.txt' as (t : tuple(i: int, b1: bag { b_tuple : tuple ( b_str: 
 chararray) }));
 b = foreach a generate t.b1;
 dump b;
 
 1.txt:
 (1,{(one),(two)})
 
 Error from old logical plan:
 java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.pig.data.DataBag
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:482)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:237)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
 at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
 
 Error from new logical plan:
 java.lang.NullPointerException
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.consumeInputBag(POProject.java:246)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:200)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:237)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
 at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
 
 If we change b = foreach a generate t.b1; to b = foreach a generate t.i;, 
 it works fine; only referring to a bag does not work.
 
 
 This addresses bug PIG-1866.
 https://issues.apache.org/jira/browse/PIG-1866
 
 
 Diffs
 -
 
   
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 1084415 
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java 1084415 
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 1084415 
   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestEvalPipeline2.java 1084415 
   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/MRC15.gld 1084415 
 
 Diff: https://reviews.apache.org/r/524/diff
 
 
 Testing
 ---
 
 test-patch:
  [exec] +1 overall.  
  [exec] 
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec] 
  [exec] +1 tests included.  The patch appears to include 6 new or 
 modified tests.
  [exec] 
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec] 
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec] 
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 

Re: Review Request: New logical plan fails when I have complex data types from udf

2011-03-28 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/526/#review354
---

Ship it!


+1

- thejas


On 2011-03-25 11:51:15, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/526/
 ---
 
 (Updated 2011-03-25 11:51:15)
 
 
 Review request for pig and thejas.
 
 
 Summary
 ---
 
 The new logical plan fails when I have complex data types returning from my 
 eval function.
 
 Below is my script:
 
 register myudf.jar;   
 B1 = load 'myinput' as (id:chararray,ts:int,url:chararray);
 B2 = group B1 by id;
 B = foreach B2 {
  Tuples = order B1 by ts;
  generate Tuples;
 };
 C1 = foreach B generate TransformToMyDataType(Tuples,-1,0,1) as seq: { t: ( 
 previous, current, next ) };
 C2 = foreach C1 generate FLATTEN(seq);
 C3 = foreach C2 generate  current.id as id;
 dump C3;
 
 On C3 it fails with the message below:
 
 Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: 
 45 Input: 0 Column: 1)
 
 Below is the output of describe on C1:
 
 C1: {seq: {t: (previous: (id: chararray,ts: int,url: chararray),current: (id: 
 chararray,ts: int,url: chararray),next: (id: chararray,ts: int,url: 
 chararray))}}
 
 The script works if I turn off new logical plan or use Pig 0.7.
 
 
 This addresses bug PIG-1868.
 https://issues.apache.org/jira/browse/PIG-1868
 
 
 Diffs
 -
 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LogicalSchema.java
  1081999 
   
 http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestSchema.java
  1081999 
 
 Diff: https://reviews.apache.org/r/526/diff
 
 
 Testing
 ---
 
 
 Thanks,
 
 Daniel
 




Re: Review Request: Switch to new parser generator technology

2011-03-02 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/459/#review282
---



http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/MultiMap.java
https://reviews.apache.org/r/459/#comment528

Several places in the code assume that the return value is a List (some 
even cast it to List), so I changed the return type to List. To prevent 
FindBugs warnings, the casts of the return value to List have now been 
removed from other classes.
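
To illustrate the return-type point above, here is a minimal Java sketch. It 
uses a simplified, hypothetical class named ListMultiMapSketch, which is not 
Pig's MultiMap and does not reproduce its API. It only shows that when the 
getter is declared to return List, callers can index into the result without 
a cast.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified multimap for illustration; NOT Pig's MultiMap.
class ListMultiMapSketch<K, V> {
    private final Map<K, List<V>> map = new HashMap<>();

    // Append a value to the list stored under the key.
    void put(K key, V value) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Declared to return List rather than Collection, so callers that need
    // positional access do not have to cast the result.
    List<V> get(K key) {
        List<V> values = map.get(key);
        if (values == null) {
            return Collections.<V>emptyList();
        }
        return Collections.unmodifiableList(values);
    }
}

class ListMultiMapSketchDemo {
    public static void main(String[] args) {
        ListMultiMapSketch<String, Integer> m = new ListMultiMapSketch<>();
        m.put("a", 1);
        m.put("a", 2);
        // No cast needed: the static type is already List<Integer>.
        List<Integer> values = m.get("a");
        System.out.println(values.get(1));
    }
}
```

With a Collection return type, the same caller would need a cast to List, 
which is the kind of cast the comment above says was removed.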



http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/FunctionType.java
https://reviews.apache.org/r/459/#comment524

This is likely to give FindBugs warnings for unused variables. (The change 
can be part of a separate incremental patch.)



http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/ParserException.java
https://reviews.apache.org/r/459/#comment521

Typo: "Failed to parse:  ." (The change can be part of a separate 
incremental patch.)



http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestLogToPhyCompiler.java
https://reviews.apache.org/r/459/#comment527

It would be good to have these tests migrated to the new logical plan.


- thejas


On 2011-03-02 17:16:11, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/459/
 ---
 
 (Updated 2011-03-02 17:16:11)
 
 
 Review request for pig, Daniel Dai, thejas, and Xuefu Zhang.
 
 
 Summary
 ---
 
 Pig has many parser-related bugs, particularly bad error messages. After 
 reviewing JavaCC, we feel these will be difficult to address using that 
 tool. Also, the .jjt files used by JavaCC are hard to understand and 
 maintain.
 
 ANTLR is being reviewed as the most likely choice to move to, but other 
 parsers will be reviewed as well.
 
 This JIRA will act as an umbrella issue for other parser issues.
 
 
 This addresses bug PIG-1618.
 https://issues.apache.org/jira/browse/PIG-1618
 
 
 Diffs
 -
 
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/Main.java 
 1076316 
   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigServer.java 
 1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/SortInfoSetter.java
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/StandAloneParser.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/LogToPhyTranslationVisitor.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMergeJoin.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/logicalLayer/LOCogroup.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/logicalLayer/LOJoin.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/logicalLayer/ProjectFixerUpper.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/logicalLayer/optimizer/PushDownForeachFlatten.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/logicalLayer/optimizer/PushUpFilter.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/logicalLayer/schema/Schema.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/plan/OperatorPlan.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/MultiMap.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/BaseOperatorPlan.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/OperatorPlan.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/OperatorSubPlan.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/LogicalExpPlanMigrationVistor.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/LogicalPlanMigrationVistor.java
  1076316 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/Util.java
  1076316 
   
 

Re: Review Request: New logical plan: FilterLogicExpressionSimplifier fail to deal with UDF

2011-02-14 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/356/#review223
---

Ship it!


- thejas


On 2011-02-14 17:00:02, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/356/
 ---
 
 (Updated 2011-02-14 17:00:02)
 
 
 Review request for pig and thejas.
 
 
 Summary
 ---
 
 The following script fails:
 
 a = load '1.txt' as (a0, a1);
 b = filter a by (a0 is not null or a1 is not null) and IsEmpty(a0);
 explain b;
 
 Error message:
 Caused by: java.lang.ClassCastException: 
 org.apache.pig.newplan.logical.expression.UserFuncExpression cannot be cast 
 to org.apache.pig.newplan.logical.expression.BinaryExpression
 at 
 org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.handleBinary(LogicalExpressionSimplifier.java:561)
 at 
 org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.handleAnd(LogicalExpressionSimplifier.java:429)
 at 
 org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.inferRelationship(LogicalExpressionSimplifier.java:397)
 at 
 org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.handleDNFOr(LogicalExpressionSimplifier.java:281)
 at 
 org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.checkDNFLeaves(LogicalExpressionSimplifier.java:192)
 at 
 org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.transform(LogicalExpressionSimplifier.java:108)
 at 
 org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:110)
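
The ClassCastException above is the usual result of casting an expression 
node to a subtype without checking its runtime type first. The Java sketch 
below shows the general guard pattern only. All class names in it 
(ExprSketch, BinaryExprSketch, UdfExprSketch, LeafExprSketch, 
SimplifierSketch) are hypothetical stand-ins, not Pig's classes, and this is 
not a description of the actual change made in the patch.

```java
// Hypothetical, simplified expression types; NOT Pig's real classes.
abstract class ExprSketch {}

class BinaryExprSketch extends ExprSketch {
    final ExprSketch lhs;
    final ExprSketch rhs;

    BinaryExprSketch(ExprSketch lhs, ExprSketch rhs) {
        this.lhs = lhs;
        this.rhs = rhs;
    }
}

class UdfExprSketch extends ExprSketch {
    final String name;

    UdfExprSketch(String name) {
        this.name = name;
    }
}

class LeafExprSketch extends ExprSketch {}

class SimplifierSketch {
    // True only when the node really is a binary expression, so a caller
    // can skip nodes (such as UDF calls) that must not be cast.
    static boolean canHandleAsBinary(ExprSketch e) {
        return e instanceof BinaryExprSketch;
    }

    // Walks the tree, checking the runtime type before every cast.
    static String describe(ExprSketch e) {
        if (e instanceof BinaryExprSketch) {
            BinaryExprSketch b = (BinaryExprSketch) e;
            return "binary(" + describe(b.lhs) + ", " + describe(b.rhs) + ")";
        }
        if (e instanceof UdfExprSketch) {
            return "udf(" + ((UdfExprSketch) e).name + ")";
        }
        return "leaf";
    }
}
```

In this sketch a UDF node is reported as-is rather than being forced 
through the binary-expression path.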
 
 
 This addresses bug PIG-1820.
 https://issues.apache.org/jira/browse/PIG-1820
 
 
 Diffs
 -
 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/rules/LogicalExpressionSimplifier.java
  1062989 
   
 http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestFilterSimplification.java
  1062989 
 
 Diff: https://reviews.apache.org/r/356/diff
 
 
 Testing
 ---
 
 Test-patch:
  [exec] +1 overall.  
  [exec] 
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec] 
  [exec] +1 tests included.  The patch appears to include 3 new or 
 modified tests.
  [exec] 
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec] 
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec] 
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 warnings.
  [exec] 
  [exec] +1 release audit.  The applied patch does not increase the 
 total number of release audit warnings.
 
 Unit test:
 all pass
 
 End-to-end test:
 all pass
 
 
 Thanks,
 
 Daniel
 




Re: Review Request: Disable converting bytes loading from BinStorage

2010-12-02 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/134/#review55
---



http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/BinStorageWithCaster.java
https://reviews.apache.org/r/134/#comment37

I think BinStorageWithCaster should implement LoadCaster interface.
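
As a rough illustration of what implementing a caster interface involves, 
here is a self-contained Java sketch. BytesCasterSketch and 
Utf8BytesCasterSketch are hypothetical names defined only for this example. 
The interface is a cut-down stand-in with two methods, not Pig's LoadCaster, 
and the null-on-bad-input behavior is an assumption of the sketch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical, cut-down caster interface; NOT Pig's LoadCaster.
interface BytesCasterSketch {
    Integer bytesToInteger(byte[] b);

    String bytesToCharArray(byte[] b);
}

// Interprets raw bytes as UTF-8 text and parses typed values from it.
class Utf8BytesCasterSketch implements BytesCasterSketch {
    @Override
    public String bytesToCharArray(byte[] b) {
        return b == null ? null : new String(b, StandardCharsets.UTF_8);
    }

    @Override
    public Integer bytesToInteger(byte[] b) {
        String s = bytesToCharArray(b);
        if (s == null) {
            return null;
        }
        try {
            return Integer.valueOf(s.trim());
        } catch (NumberFormatException e) {
            // Assumption of this sketch: unparsable input maps to null.
            return null;
        }
    }
}
```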


- thejas


On 2010-12-01 13:43:29, Daniel Dai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/134/
 ---
 
 (Updated 2010-12-01 13:43:29)
 
 
 Review request for pig.
 
 
 Summary
 ---
 
 Change the behavior of converting bytes loaded from BinStorage.
 1. Converting bytes loaded from BinStorage() will now result in an error.
 2. If the user clearly understands that the data is loaded from PigStorage 
 (or another LoadFunc using Utf8StorageConverter), he/she should use 
 BinStorageWithCaster. With it, converting bytes to other types will still work.
 
 
 This addresses bug PIG-1745.
 https://issues.apache.org/jira/browse/PIG-1745
 
 
 Diffs
 -
 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/BinStorage.java
  1040653 
   
 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/BinStorageWithCaster.java
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestEvalPipeline2.java
  1040653 
 
 Diff: https://reviews.apache.org/r/134/diff
 
 
 Testing
 ---
 
 test-patch:
  [exec] +1 overall.
  [exec]
  [exec] +1 @author.  The patch does not contain any @author tags.
  [exec]
  [exec] +1 tests included.  The patch appears to include 3 new or 
 modified tests.
  [exec]
  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
 messages.
  [exec]
  [exec] +1 javac.  The applied patch does not increase the total 
 number of javac compiler warnings.
  [exec]
  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 warnings.
  [exec]
  [exec] +1 release audit.  The applied patch does not increase the 
 total number of release audit warnings.
 
 unit-test:
 all pass
 
 end-to-end test:
 all pass
 
 
 Thanks,
 
 Daniel