from:"Olga Natkovich"

[jira] [Commented] (PIG-4764) Make Pig work with Hive 2.0

2020-01-17 Thread Olga Natkovich (Jira)



[ 
https://issues.apache.org/jira/browse/PIG-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17018246#comment-17018246
 ] 

Olga Natkovich commented on PIG-4764:
-

I agree with Koji. As long as this does not break Pig with Hive 1 and as long 
as somebody commits to test the release with Hive 2, that should work.

> Make Pig work with Hive 2.0
> ---
>
> Key: PIG-4764
> URL: https://issues.apache.org/jira/browse/PIG-4764
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Reporter: Jianyong Dai
>Assignee: Jianyong Dai
>Priority: Major
> Fix For: 0.18.0
>
> Attachments: PIG-4764-0.patch, PIG-4764-1.patch, PIG-4764-2.patch, 
> PIG-4764-3.patch, PIG-4764-4.patch
>
>
> There are a lot of changes especially around ORC in Hive 2.0. We need to make 
> Pig work with it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Re: New Apache Pig Committer: Nandor Kollar

2018-09-06 Thread Olga Natkovich

 Congratulations, Nandor!!!
Olga
On Thursday, September 6, 2018, 12:59:38 PM PDT, Koji Noguchi 
 wrote:  
 
 On behalf of the Apache Pig PMC, it is my pleasure to announce that
Nandor Kollar has accepted the invitation to become an Apache Pig committer.
We appreciate all the work Nandor has done and look forward to seeing
continued involvement.

Please join me in congratulating Nandor!

Thanks,
Koji

[jira] [Commented] (PIG-5336) Drop old documents from the site

2018-06-12 Thread Olga Natkovich (JIRA)



[ 
https://issues.apache.org/jira/browse/PIG-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509857#comment-16509857
 ] 

Olga Natkovich commented on PIG-5336:
-

+1

> Drop old documents from the site
> 
>
> Key: PIG-5336
> URL: https://issues.apache.org/jira/browse/PIG-5336
> Project: Pig
>  Issue Type: Improvement
>  Components: site
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Trivial
> Attachments: pig-5336-redirect.patch
>
>
> While working on PIG-5334, I saw a bunch of old documents still being 
> uploaded to svn
> {noformat}
> knoguchi@truelisten-lm site> ls publish/docs/ | sort -V
> r0.7.0/
> r0.8.1/
> r0.9.1/
> r0.9.2/
> r0.10.0/
> r0.10.1/
> r0.11.0/
> r0.11.1/
> r0.12.0/
> r0.12.1/
> r0.13.0/
> r0.14.0/
> r0.15.0/
> r0.16.0/
> r0.17.0/
> {noformat}
> Sometimes I see our users referencing old documents due to this.
> We should retire most of them and leave the recent ones.




[jira] [Commented] (PIG-5336) Drop old documents from the site

2018-04-11 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434557#comment-16434557
 ] 

Olga Natkovich commented on PIG-5336:
-

If I remember correctly, in the past we had a policy of keeping the 3 most 
recent releases available.

> Drop old documents from the site
> 
>
> Key: PIG-5336
> URL: https://issues.apache.org/jira/browse/PIG-5336
> Project: Pig
>  Issue Type: Improvement
>  Components: site
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Trivial
>
> While working on PIG-5334, I saw a bunch of old documents still being 
> uploaded to svn
> {noformat}
> knoguchi@truelisten-lm site> ls publish/docs/ | sort -V
> r0.7.0/
> r0.8.1/
> r0.9.1/
> r0.9.2/
> r0.10.0/
> r0.10.1/
> r0.11.0/
> r0.11.1/
> r0.12.0/
> r0.12.1/
> r0.13.0/
> r0.14.0/
> r0.15.0/
> r0.16.0/
> r0.17.0/
> {noformat}
> Sometimes I see our users referencing old documents due to this.
> We should retire most of them and leave the recent ones.
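
A minimal Python sketch of the retention idea discussed in this thread, assuming the "3 most recent releases" policy recalled above is applied to the directory names in the quoted listing. The helper names (version_key, split_by_retention) are hypothetical and are not part of the Pig site tooling. The point is only that the directories have to be compared numerically, the way `sort -V` does, before picking the newest ones.

```python
import re

# Directory names copied from the `ls publish/docs/` listing quoted above.
DOC_DIRS = [
    "r0.7.0/", "r0.8.1/", "r0.9.1/", "r0.9.2/", "r0.10.0/",
    "r0.10.1/", "r0.11.0/", "r0.11.1/", "r0.12.0/", "r0.12.1/",
    "r0.13.0/", "r0.14.0/", "r0.15.0/", "r0.16.0/", "r0.17.0/",
]


def version_key(dirname):
    """Turn 'r0.10.1/' into (0, 10, 1) so versions compare numerically."""
    return tuple(int(part) for part in re.findall(r"\d+", dirname))


def split_by_retention(dirs, keep=3):
    """Return (kept, retired): the newest `keep` directories and the rest.

    `keep=3` is an assumption taken from the policy recalled in this thread.
    """
    ordered = sorted(dirs, key=version_key)
    cut = max(len(ordered) - keep, 0)
    return ordered[cut:], ordered[:cut]


kept, retired = split_by_retention(DOC_DIRS)
print("keep:  ", kept)
print("retire:", retired)
```

A plain string sort would place r0.10.0/ before r0.7.0/, which is why the key converts each component to an integer first.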




[jira] [Commented] (PIG-5334) Update our site to follow a foundation request

2018-04-11 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434440#comment-16434440
 ] 

Olga Natkovich commented on PIG-5334:
-

+1. Koji thanks for taking care of this!

> Update our site to follow a foundation request
> --
>
> Key: PIG-5334
> URL: https://issues.apache.org/jira/browse/PIG-5334
> Project: Pig
>  Issue Type: Improvement
>  Components: site
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Minor
> Attachments: Screen Shot 2018-04-09 at 4.11.08 PM.png, Screen 
> Shot-02-top-left.png, Screen Shot-03-bottom-left.png.png, Screen 
> Shot-04-left-column.png, Screen Shot-05-left-column-annoted.png, 
> pig-5334-v01.patch, pig-5334-v02-top-left.patch, 
> pig-5334-v03-bottom-left.patch, pig-5334-v04-left-column.patch, 
> pig-5334-v05-left-column.patch
>
>
> Today, there was a request from the foundation to add an Apache event logo to 
> our Apache Pig site.
>  Details at [http://apache.org/events/README.txt]
>  Basically asking us to add
>  [https://www.apache.org/events/current-event-234x60.png]
>  or
>  [https://www.apache.org/events/current-event-125x125.png]
>  to our site.
> Besides this, the email mentioned general Apache site suggestions 
> outlined at [https://www.apache.org/foundation/marks/pmcs#navigation]
>  * "License" should link to: [http://www.apache.org/licenses/]
>  * "Sponsorship" or "Donate" should link to:   
> [http://www.apache.org/foundation/sponsorship.html]
>  * "Thanks" should link to: [http://www.apache.org/foundation/thanks.html]
>  * "Security" should link either to a project-specific page detailing how 
> users may securely report potential vulnerabilities, or to the main 
> [http://www.apache.org/security/] page




[jira] [Commented] (PIG-5334) Update our site to follow a foundation request

2018-04-10 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16432921#comment-16432921
 ] 

Olga Natkovich commented on PIG-5334:
-

So are we required to put up this advertisement, or is it part of good "will"? 
Perhaps we put up the one for ApacheCon as a replacement for Hadoop and then 
revert?

I don't think it is worth our time to make changes to the Forrest setup.

> Update our site to follow a foundation request
> --
>
> Key: PIG-5334
> URL: https://issues.apache.org/jira/browse/PIG-5334
> Project: Pig
>  Issue Type: Improvement
>  Components: site
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Minor
> Attachments: Screen Shot 2018-04-09 at 4.11.08 PM.png, Screen 
> Shot-02-top-left.png, Screen Shot-03-bottom-left.png.png, pig-5334-v01.patch, 
> pig-5334-v02-top-left.patch, pig-5334-v03-bottom-left.patch
>
>
> Today, there was a request from the foundation to add an Apache event logo to 
> our Apache Pig site.
>  Details at [http://apache.org/events/README.txt]
>  Basically asking us to add
>  [https://www.apache.org/events/current-event-234x60.png]
>  or
>  [https://www.apache.org/events/current-event-125x125.png]
>  to our site.
> Besides this, the email mentioned general Apache site suggestions 
> outlined at [https://www.apache.org/foundation/marks/pmcs#navigation]
>  * "License" should link to: [http://www.apache.org/licenses/]
>  * "Sponsorship" or "Donate" should link to:   
> [http://www.apache.org/foundation/sponsorship.html]
>  * "Thanks" should link to: [http://www.apache.org/foundation/thanks.html]
>  * "Security" should link either to a project-specific page detailing how 
> users may securely report potential vulnerabilities, or to the main 
> [http://www.apache.org/security/] page




[jira] [Commented] (PIG-5334) Update our site to follow a foundation request

2018-04-10 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16432809#comment-16432809
 ] 

Olga Natkovich commented on PIG-5334:
-

Agree with Rohini. Did not realize this was not a temp change.

> Update our site to follow a foundation request
> --
>
> Key: PIG-5334
> URL: https://issues.apache.org/jira/browse/PIG-5334
> Project: Pig
>  Issue Type: Improvement
>  Components: site
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Minor
> Attachments: Screen Shot 2018-04-09 at 4.11.08 PM.png, Screen 
> Shot-02-top-left.png, Screen Shot-03-bottom-left.png.png, pig-5334-v01.patch, 
> pig-5334-v02-top-left.patch, pig-5334-v03-bottom-left.patch
>
>
> Today, there was a request from the foundation to add an Apache event logo to 
> our Apache Pig site.
>  Details at [http://apache.org/events/README.txt]
>  Basically asking us to add
>  [https://www.apache.org/events/current-event-234x60.png]
>  or
>  [https://www.apache.org/events/current-event-125x125.png]
>  to our site.
> Besides this, the email mentioned general Apache site suggestions 
> outlined at [https://www.apache.org/foundation/marks/pmcs#navigation]
>  * "License" should link to: [http://www.apache.org/licenses/]
>  * "Sponsorship" or "Donate" should link to:   
> [http://www.apache.org/foundation/sponsorship.html]
>  * "Thanks" should link to: [http://www.apache.org/foundation/thanks.html]
>  * "Security" should link either to a project-specific page detailing how 
> users may securely report potential vulnerabilities, or to the main 
> [http://www.apache.org/security/] page




[jira] [Commented] (PIG-5334) Update our site to follow a foundation request

2018-04-10 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16432687#comment-16432687
 ] 

Olga Natkovich commented on PIG-5334:
-

[~knoguchi] thanks for working on this. Your changes to incorporate license, 
etc. look good to me. For the picture, top left or top right would work. Since 
it is a temp change, I don't have a strong opinion. (Don't think it would be 
visible if we put it at the bottom.) Seems like we should put the security link 
in just to satisfy the requirement but again no strong opinion.

> Update our site to follow a foundation request
> --
>
> Key: PIG-5334
> URL: https://issues.apache.org/jira/browse/PIG-5334
> Project: Pig
>  Issue Type: Improvement
>  Components: site
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Minor
> Attachments: Screen Shot 2018-04-09 at 4.11.08 PM.png, Screen 
> Shot-02-top-left.png, Screen Shot-03-bottom-left.png.png, pig-5334-v01.patch, 
> pig-5334-v02-top-left.patch, pig-5334-v03-bottom-left.patch
>
>
> Today, there was a request from the foundation to add an Apache event logo to 
> our Apache Pig site.
>  Details at [http://apache.org/events/README.txt]
>  Basically asking us to add
>  [https://www.apache.org/events/current-event-234x60.png]
>  or
>  [https://www.apache.org/events/current-event-125x125.png]
>  to our site.
> Besides this, the email mentioned general Apache site suggestions 
> outlined at [https://www.apache.org/foundation/marks/pmcs#navigation]
>  * "License" should link to: [http://www.apache.org/licenses/]
>  * "Sponsorship" or "Donate" should link to:   
> [http://www.apache.org/foundation/sponsorship.html]
>  * "Thanks" should link to: [http://www.apache.org/foundation/thanks.html]
>  * "Security" should link either to a project-specific page detailing how 
> users may securely report potential vulnerabilities, or to the main 
> [http://www.apache.org/security/] page




[jira] [Commented] (PIG-5307) NPE in TezOperDependencyParallelismEstimator

2017-10-02 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188921#comment-16188921
 ] 

Olga Natkovich commented on PIG-5307:
-

+1

> NPE in TezOperDependencyParallelismEstimator
> 
>
> Key: PIG-5307
> URL: https://issues.apache.org/jira/browse/PIG-5307
> Project: Pig
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: PIG-5307-1.patch
>
>
> When the constant is null, an NPE is thrown. This was encountered by a 
> user who was generating the field name based on a condition that expanded to 
> NULL when the condition was not met. For example:
> {code}
> x = FILTER x BY (chararray) NULL == 'fieldvalue';
> {code}




[jira] [Commented] (PIG-5229) TestPigTest.testSpecificOrderOutput and testSpecificOrderOutputForAlias failing

2017-04-28 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989190#comment-15989190
 ] 

Olga Natkovich commented on PIG-5229:
-

+1

> TestPigTest.testSpecificOrderOutput and testSpecificOrderOutputForAlias 
> failing
> ---
>
> Key: PIG-5229
> URL: https://issues.apache.org/jira/browse/PIG-5229
> Project: Pig
>  Issue Type: Test
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Trivial
> Attachments: pig-5229-v01.patch, pig-5229-v02.patch
>
>
> Error message 
> {noformat}
> junit.framework.AssertionFailedError: expected:<([twitter,7)
> (yahoo,25)
> (facebook,15])> but was:<([yahoo,25)
> (facebook,15)
> (twitter,7])>
>   at org.apache.pig.pigunit.PigTest.assertEquals(PigTest.java:438)
>   at org.apache.pig.pigunit.PigTest.assertOutput(PigTest.java:385)
>   at org.apache.pig.pigunit.PigTest.assertOutput(PigTest.java:375)
>   at 
> org.apache.pig.test.pigunit.TestPigTest.testSpecificOrderOutput(TestPigTest.java:572)
> {noformat}
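
The assertion message above shows the same three rows in a different order. A small Python sketch makes that diagnosis explicit by comparing the rows as multisets. The helper same_rows_any_order is hypothetical: it is not PigUnit's API and not necessarily what the attached patches do.

```python
from collections import Counter

# Rows taken from the AssertionFailedError quoted above; the square
# brackets in that message are JUnit's diff markers, not part of the data.
EXPECTED = ["(twitter,7)", "(yahoo,25)", "(facebook,15)"]
ACTUAL = ["(yahoo,25)", "(facebook,15)", "(twitter,7)"]


def same_rows_any_order(expected, actual):
    """True when both outputs hold the same rows, duplicates included,
    regardless of the order in which they appear."""
    return Counter(expected) == Counter(actual)


print("identical in order:", EXPECTED == ACTUAL)
print("identical as sets: ", same_rows_any_order(EXPECTED, ACTUAL))
```

Because the failing tests are named for a specific order, the real fix presumably has to make the ordering deterministic rather than ignore it. The sketch only confirms that the failure is an ordering difference and not missing or wrong rows.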




[jira] [Commented] (PIG-2315) Make as clause work in generate

2014-07-15 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062246#comment-14062246
 ] 

Olga Natkovich commented on PIG-2315:
-

Are there plans to get this one into Pig 14?

 Make as clause work in generate
 ---

 Key: PIG-2315
 URL: https://issues.apache.org/jira/browse/PIG-2315
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
Assignee: Gianmarco De Francisci Morales
 Fix For: 0.14.0

 Attachments: PIG-2315-1.patch, PIG-2315-1.patch


 Currently, the following syntax is supported but ignored, causing confusion 
 among users:
 A1 = foreach A1 generate a as a:chararray ;
 After this statement, a just retains its previous type.




Re: [ANNOUNCE] Welcome new Pig Committer - Lorand Bendig

2014-06-24 Thread Olga Natkovich

Congrats, Lorand!


On Tuesday, June 24, 2014 9:04 AM, Mona Chitnis m...@apache.org wrote:
 


Congrats Lorand!

 
Mona Chitnis
Yahoo!



On Tuesday, June 24, 2014 7:14 AM, Aniket Mokashi aniket...@gmail.com wrote:



Congrats



On Tue, Jun 24, 2014 at 2:03 AM, Lorand Bendig lben...@gmail.com wrote:

 Thank you, all of you!

 --Lorand


 On 06/23/2014 11:41 PM, Mark Wagner wrote:

 Congrats and welcome, Lorand!

 On Sun, Jun 22, 2014 at 6:39 PM, Koji Noguchi
 knogu...@yahoo-inc.com.invalid wrote:

 Congrats!!!

 On 6/22/14, 9:08 PM, Rohini Palaniswamy rohini.adi...@gmail.com
 wrote:

  Congratulations Lorand !!!


 On Sun, Jun 22, 2014 at 2:47 PM, Xuefu Zhang xzh...@cloudera.com
 wrote:

  Many congrats, Lorand!

 --Xuefu


 On Sun, Jun 22, 2014 at 12:54 PM, Daniel Dai da...@hortonworks.com
 wrote:

  Congratulations!

 On Sun, Jun 22, 2014 at 7:00 AM, Jarek Jarcec Cecho jar...@apache.org
 wrote:

 Congratulations Lorand, well deserved!

 Jarcec

 On Sat, Jun 21, 2014 at 10:30:01PM -0700, Cheolsoo Park wrote:

 It is my pleasure to announce that Lorand Bendig became the newest addition
 to the Pig Committers! Lorand has been actively contributing to Pig for a
 year now.

 Please join me in congratulating Lorand!

 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender immediately
 and delete it from your system. Thank You.





-- 
...:::Aniket:::... Quetzalco@tl

Re: Plan to merge tez branch into trunk and branch 0.13

2014-05-15 Thread Olga Natkovich

Is anybody interested in driving Pig 13 to release? It would be good to get a 
volunteer to be the release manager for 13 before deciding to branch.


On Wednesday, May 14, 2014 9:07 AM, Cheolsoo Park piaozhe...@gmail.com wrote:
 
I'm also +1 on releasing 0.13 and then merging Tez branch.



On Tue, May 13, 2014 at 11:52 PM, Prashant Kommireddi
prash1...@gmail.com wrote:

 I'm a +1 on branching 0.13 and merging Tez branch with trunk. That would
 give us ample time to make 0.14 stable with tez merged.


 On Tue, May 13, 2014 at 9:05 PM, Daniel Dai da...@hortonworks.com wrote:

  We will make sure all existing unit tests / e2e tests pass in MR mode
  before the merge, but it is possible we might hit some issues which are
  not captured by existing tests after the merge. At this moment I can hardly
  tell how much time it will take before we can say the codebase is stable
  enough, but it is better to merge to trunk early, so more Pig developers can
  try the merged codebase and have more time to capture issues before we
  release Pig with tez (most probably Pig 0.14.0).
 
  Thanks,
  Daniel
 
  On Tue, May 13, 2014 at 3:48 PM, Prashant Kommireddi
  prash1...@gmail.com wrote:
   Hi Daniel,
  
   How long do you think might it take for the merge to stabilize?
  
   Thanks,
   Prashant
  
  
   On Tue, May 13, 2014 at 11:47 AM, Daniel Dai da...@hortonworks.com
  wrote:
  
   Hi, Pig devs,
  
   After several months of development, the Tez branch is becoming stable and
   we plan to merge it to trunk in the next few weeks.
  
   Several weeks ago, we had a discussion about branching 0.13, and if
   we are still interested in doing a release before merging tez, we should
   do it now.
  
   Thoughts?
  
   Thanks,
   Daniel
  

Re: Pig 0.13.0 release

2014-02-05 Thread Olga Natkovich

Just going by the list that Aniket provided, I don't really see enough for a 
full release. Two of the JIRAs mentioned are doc updates and one is a bug fix 
that was ported into Pig 12.



On Wednesday, February 5, 2014 3:13 PM, Aniket Mokashi aniket...@gmail.com 
wrote:
 
Hi All,

A good number of improvements and bug fixes have gone into trunk recently.
I'd like to know if we can roll out a Pig 0.13 release around mid-March?

I am aware that we are planning to merge tez branch into trunk soon.
However, making a release before tez branch is merged will be good. Any
objections?

Following are few jiras we need to wrap up before 0.13 release-
PIG-3591
PIG-3740
PIG-3745
PIG-3347
PIG-3731
Any other?

Thanks,
Aniket

Re: Welcome to the newest Pig Committer - Mark Wagner

2014-02-03 Thread Olga Natkovich

Congrats, Mark!



On Monday, February 3, 2014 1:41 PM, Mona Chitnis mona.chit...@yahoo.in wrote:
 
Congrats Mark!

 
--

Mona Chitnis
Yahoo!




On Saturday, February 1, 2014 12:01 AM, Mark Wagner wagner.mar...@gmail.com 
wrote:

Thanks everyone! I'm very excited to join all of you and I look
forward to making some good contributions in the future!

-Mark


On Fri, Jan 31, 2014 at 8:51 PM, Koji Noguchi knogu...@yahoo-inc.com wrote:
 Congrats!!!

 On Jan 31, 2014, at 8:42 PM, Cheolsoo Park piaozhe...@gmail.com wrote:

 Congrats Mark! Look forward to many more contributions!


 On Fri, Jan 31, 2014 at 5:36 PM, Jarek Jarcec Cecho jar...@apache.org wrote:

 Congratulations Mark, good job!

 Jarcec

 On Fri, Jan 31, 2014 at 05:20:26PM -0800, Julien Le Dem wrote:
 It is my pleasure to announce that Mark Wagner became the newest
 addition to the Pig Committers!
 Mark has been actively contributing to Pig and in particular to the
 Pig-on-Tez effort.
 Please, join me in congratulating Mark!

Re: Welcome to the new Pig PMC member Aniket Mokashi

2014-01-14 Thread Olga Natkovich

Congrats, Aniket!



On Tuesday, January 14, 2014 8:32 PM, Tongjie Chen tongjie.c...@gmail.com 
wrote:
 
Congrats Aniket!



On Tue, Jan 14, 2014 at 8:12 PM, Cheolsoo Park piaozhe...@gmail.com wrote:

 Congrats Aniket!


 On Tue, Jan 14, 2014 at 7:01 PM, Jarek Jarcec Cecho jar...@apache.org
 wrote:

  Congratulations Aniket, good work!
 
  Jarcec
 
  On Tue, Jan 14, 2014 at 06:52:10PM -0800, JULIEN LE DEM wrote:
   It's my pleasure to announce that Aniket Mokashi became the newest
  addition to the Pig PMC.
   Aniket has been actively contributing to Pig for years.
   Please join me in congratulating Aniket!
  
   Julien

Re: How do we determine 'stable' pig version?

2013-10-29 Thread Olga Natkovich

If by stable we mean something we released, I don't see this label as needed 
or useful at all.



On Wednesday, October 23, 2013 8:01 AM, Koji Noguchi knogu...@yahoo-inc.com 
wrote:
 
Thanks Alan, Daniel.

Taking back my request on 'stable' criteria. 

Koji


On Oct 22, 2013, at 7:18 PM, Alan Gates ga...@hortonworks.com wrote:

 I don't think we should change our use of stable.  Our usage is in line with 
 the Hadoop usage of the term in their releases.  To the best of our knowledge 
 as Apache developers it is stable.  It passes all of the tests we have.  We 
 have no criteria for deciding stability beyond this.
 
 Alan.
 
 On Oct 22, 2013, at 4:00 PM, Daniel Dai wrote:
 
 Yes, we can revisit. The question is how to determine stability. 0.11.1
 has been out for a while and should be considered stable, but it actually
 contains problems raised just recently. After we release 0.12.1, how soon
 should we declare it a stable release?
 
 Thanks,
 Daniel
 
 
 On Tue, Oct 22, 2013 at 2:25 PM, Koji Noguchi knogu...@yahoo-inc.com wrote:
 
 Thanks Daniel, Olga!  Keeping 3 versions would be nice.
 
 As for 'stable', can we revisit the definition?
 If it's *always* pointing to the latest release, I don't see the need for
 having this link(dir).
 Is it adding any value?
 
 Koji
 
 
 
 
 On Oct 22, 2013, at 1:43 PM, Daniel Dai da...@hortonworks.com wrote:
 
 That totally makes sense. Let's keep both downloads and documentation for 3
 versions.
 
 Thanks,
 Daniel
 
 
 On Tue, Oct 22, 2013 at 10:20 AM, Olga Natkovich onatkov...@yahoo.com
 wrote:
 
 Couple of suggestions:
 
 (1) I think we are trying to go for a more frequent release model and in
 that case it would make sense to keep perhaps 3 releases. Based on our
 experience at Yahoo, Pig 10 is the really stable release. We recently found
 a couple of critical bugs in 11 for which we posted patches. Also the
 community knows that we deferred fixes for a couple of key bugs in 12 till
 12.1.
 (2) Our documentation needs to be consistent with the number of releases
 we advertise as supported. Our docs currently go all the way back to Pig 9.
 Olga
 
 
 
 On Tuesday, October 22, 2013 10:13 AM, Daniel Dai da...@hortonworks.com
 wrote:
 
 Hi, Koji,
 Here are the criteria I use:
 (i) How do we determine how many releases to show on the front download
 page?
 We usually keep the two most recent releases on the front page according to
 https://cwiki.apache.org/confluence/display/PIG/HowToRelease.
 
 (ii) How do we determine which release is considered 'stable' ?
 Here stable means passing all tests and peer reviewed. It does not mean
 production stable. Actually there is no way for us to know whether a release
 is production stable until users download it, use it, and give feedback.
 That's why we will continue fixing bugs after a major release and make
 minor releases.
 
 Thanks,
 Daniel
 
 
 
 On Tue, Oct 22, 2013 at 9:45 AM, Koji Noguchi knogu...@yahoo-inc.com
 wrote:
 
 
 When I went to the pig release download page (through
 http://www.apache.org/dyn/closer.cgi/pig), I only saw 0.11.1 and 0.12
 available.
 I later learned that there is an 'archive' link(
 http://archive.apache.org/dist/pig/)  that list other versions (0.8 to
 0.10).
 
 Two questions.
 
 (i) How do we determine how many releases to show on the front download
 page?
 
 (ii) How do we determine which release is considered 'stable' ?
 
 I still consider the stable version to be 0.10.1 so I was surprised not to
 see that available on the front download page
 and even more surprised to see release 0.12 flagged as 'stable'.
 
 Koji
 
 
 
 
 

Re: How do we determine 'stable' pig version?

2013-10-22 Thread Olga Natkovich

Couple of suggestions:

(1) I think we are trying to go for a more frequent release model and in that 
case it would make sense to keep perhaps 3 releases. Based on our experience at 
Yahoo, Pig 10 is the really stable release. We recently found a couple of 
critical bugs in 11 for which we posted patches. Also the community knows that 
we deferred fixes for a couple of key bugs in 12 till 12.1.
(2) Our documentation needs to be consistent with the number of releases we 
advertise as supported. Our docs currently go all the way back to Pig 9.

Olga



On Tuesday, October 22, 2013 10:13 AM, Daniel Dai da...@hortonworks.com wrote:
 
Hi, Koji,
Here are the criteria I use:
(i) How do we determine how many releases to show on the front download
page?
We usually keep the two most recent releases on the front page according to
https://cwiki.apache.org/confluence/display/PIG/HowToRelease.

(ii) How do we determine which release is considered 'stable' ?
Here stable means passing all tests and peer reviewed. It does not mean
production stable. Actually there is no way for us to know whether a release
is production stable until users download it, use it, and give feedback.
That's why we will continue fixing bugs after a major release and make
minor releases.

Thanks,
Daniel



On Tue, Oct 22, 2013 at 9:45 AM, Koji Noguchi knogu...@yahoo-inc.com wrote:


 When I went to the pig release download page (through
 http://www.apache.org/dyn/closer.cgi/pig), I only saw 0.11.1 and 0.12
 available.
 I later learned that there is an 'archive' link(
 http://archive.apache.org/dist/pig/)  that list other versions (0.8 to
 0.10).

 Two questions.

 (i) How do we determine how many releases to show on the front download
 page?

 (ii) How do we determine which release is considered 'stable' ?

 I still consider the stable version to be 0.10.1 so I was surprised not to
 see that available on the front download page
 and even more surprised to see release 0.12 flagged as 'stable'.

 Koji






[jira] [Commented] (PIG-3480) TFile-based tmpfile compression crashes in some cases

2013-09-30 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781959#comment-13781959
 ] 

Olga Natkovich commented on PIG-3480:
-

Agree with Rohini. Changing the default just because we found a bug does not 
seem like a sound approach.

 TFile-based tmpfile compression crashes in some cases
 -

 Key: PIG-3480
 URL: https://issues.apache.org/jira/browse/PIG-3480
 Project: Pig
  Issue Type: Bug
Reporter: Dmitriy V. Ryaboy
 Fix For: 0.12.0

 Attachments: PIG-3480.patch


 When pig tmpfile compression is on, some jobs fail inside core hadoop 
 internals.
 Suspect TFile is the problem, because an experiment in replacing TFile with 
 SequenceFile succeeded.




[jira] [Commented] (PIG-3480) TFile-based tmpfile compression crashes in some cases

2013-09-24 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776745#comment-13776745
 ] 

Olga Natkovich commented on PIG-3480:
-

Could this be related to Hadoop version? 

 TFile-based tmpfile compression crashes in some cases
 -

 Key: PIG-3480
 URL: https://issues.apache.org/jira/browse/PIG-3480
 Project: Pig
  Issue Type: Bug
Reporter: Dmitriy V. Ryaboy
 Fix For: 0.12

 Attachments: PIG-3480.patch


 When pig tmpfile compression is on, some jobs fail inside core hadoop 
 internals.
 Suspect TFile is the problem, because an experiment in replacing TFile with 
 SequenceFile succeeded.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Welcome new Pig Committer - Koji Noguchi

2013-09-10 Thread Olga Natkovich

It is my pleasure to announce that Koji Noguchi became the newest addition to 
the Pig Committers!

Koji has been actively contributing to Pig for over a year now and has been a 
part of the larger Hadoop community (including as a Hadoop Committer) for many years.

Please join me in congratulating Koji!

Olga

[jira] [Commented] (PIG-3293) Casting fails after Union from two data sourcesloaders

2013-08-30 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13755073#comment-13755073
 ] 

Olga Natkovich commented on PIG-3293:
-

Would it help to document that typecasting needs to happen before any Union 
operation?

 Casting fails after Union from two data sourcesloaders
 ---

 Key: PIG-3293
 URL: https://issues.apache.org/jira/browse/PIG-3293
 Project: Pig
  Issue Type: Bug
Reporter: Koji Noguchi
Priority: Minor
 Attachments: pig-3293-test-only-v01.patch


 Script similar to 
 {noformat}
 A = load 'data1' using MyLoader() as (a:bytearray);
 B = load 'data2' as (a:bytearray);
 C = union onschema A,B;
 D = foreach C generate (chararray)a;
 Store D into './out';
 {noformat}
 fails with 
java.lang.Exception: org.apache.pig.backend.executionengine.ExecException: 
 ERROR 1075: Received a bytearray from the UDF. Cannot determine how to 
 convert the bytearray to string.
 Both MyLoader and PigStorage use the default Utf8StorageConverter.


[jira] [Commented] (PIG-3279) Support nested RANK

2013-08-26 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750279#comment-13750279
 ] 

Olga Natkovich commented on PIG-3279:
-

Hi Johnny,

Are you still planning to finish this work? If so, what is your timeline?

 Support nested RANK
 ---

 Key: PIG-3279
 URL: https://issues.apache.org/jira/browse/PIG-3279
 Project: Pig
  Issue Type: Improvement
Reporter: Gianmarco De Francisci Morales
Assignee: Johnny Zhang
 Attachments: PIG-3279-1.patch.txt, PIG-3279-2.patch.txt, 
 PIG-3279-3.patch.txt





[jira] [Commented] (PIG-3419) Pluggable Execution Engine

2013-08-23 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749106#comment-13749106
 ] 

Olga Natkovich commented on PIG-3419:
-

I think the reason we wanted it on the Tez branch is that it might evolve with 
the Tez implementation, so we would merge the updated code back when Tez is 
ready. Since there are no plans for any additional backend, is there a need to 
apply this to trunk sooner rather than later?

 Pluggable Execution Engine 
 ---

 Key: PIG-3419
 URL: https://issues.apache.org/jira/browse/PIG-3419
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.12
Reporter: Achal Soni
Assignee: Achal Soni
Priority: Minor
 Attachments: execengine.patch, mapreduce_execengine.patch, 
 stats_scriptstate.patch, test_failures.txt, test_suite.patch, 
 updated-8-22-2013-exec-engine.patch


 In an effort to adapt Pig to work using Apache Tez 
 (https://issues.apache.org/jira/browse/TEZ), I made some changes to allow for 
 a cleaner ExecutionEngine abstraction than existed before. The changes are 
 not that major as Pig was already relatively abstracted out between the 
 frontend and backend. The changes in the attached commit are essentially the 
 barebones changes -- I tried to not change the structure of Pig's different 
 components too much. I think it will be interesting to see in the future how 
 we can refactor more areas of Pig to really honor this abstraction between 
 the frontend and backend. 
 Some of the changes were reinstating an ExecutionEngine interface to tie 
 together the front end and backend, making the changes in Pig to delegate 
 to the EE when necessary, and creating an MRExecutionEngine that implements 
 this interface. Other work included changing ExecType to cycle through the 
 ExecutionEngines on the classpath and select the appropriate one (this is 
 done using Java ServiceLoader, exactly how MapReduce does for choosing the 
 framework to use between local and distributed mode). Also I tried to make 
 ScriptState, JobStats, and PigStats as abstract as possible in its current 
 state. I think in the future some work will need to be done here to perhaps 
 re-evaluate the usage of ScriptState and the responsibilities of the 
 different statistics classes. I haven't touched the PPNL, but I think more 
 abstraction is needed here, perhaps in a separate patch. 


[jira] [Assigned] (PIG-3351) Datetime objects cannot be stored using BinStorage or JasonLoader/JsonStorage

2013-06-05 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich reassigned PIG-3351:
---

Assignee: pat chan

 Datetime objects cannot be stored using BinStorage or JasonLoader/JsonStorage
 -

 Key: PIG-3351
 URL: https://issues.apache.org/jira/browse/PIG-3351
 Project: Pig
  Issue Type: Bug
  Components: data
Affects Versions: 0.11.1
Reporter: pat chan
Assignee: pat chan
Priority: Minor
 Fix For: 0.10.1, 0.11.1

 Attachments: PIG-3351.patch


 There's a bug in BinStorage that prevents datetime objects from being loaded.
 JsonLoader and JsonStorage do not support datetime objects.


Re: pig 0.11 candidate 2 feedback: Several problems

2013-02-20 Thread Olga Natkovich

I agree that supporting as much as we can is a good goal. The issue is who is 
going to be testing against all these versions? We found the issues under 
discussion because of a customer report, not because we consistently test 
against all versions. Perhaps when we decide which versions to support for the next 
release, we also need to agree on who is going to be testing and maintaining 
compatibility with a particular version. 

For instance, since Hadoop 23 compatibility is important for us at Yahoo, we have 
been maintaining compatibility with this version for 0.9 and 0.10 and will do the 
same for 0.11 and going forward. I think we would need others to step in and 
claim the versions of interest to them.

Olga



 From: Kai Londenberg kai.londenb...@googlemail.com
To: dev@pig.apache.org 
Sent: Wednesday, February 20, 2013 1:51 AM
Subject: Re: pig 0.11 candidate 2 feedback: Several problems
 
Hi,

I strongly agree with Jonathan here. If there are good reasons why you
can't support an older version of Hadoop any more, that's one thing.
But having to change 2 lines of code doesn't really qualify as such in
my point of view ;)

At least for me, pig support for 0.20.2 is essential - without it, I
can't use it. If it doesn't support it, I'll have to branch pig and
hack it myself, or stop using it.

I guess, there are a lot of people still running 0.20.2 Clusters. If
you really have lots of data stored on HDFS and a continuously busy
cluster, an upgrade is nothing you do just because.


2013/2/20 Jonathan Coveney jcove...@gmail.com:
 I agree that we shouldn't have to support old versions forever. That said,
 I also don't think we should be too blase about supporting older versions
 where it is not odious to do so. We have a lot of competition in the
 language space and the broader the versions we can support, the better
 (assuming it isn't too odious to do so). In this case, I don't think it
 should be too hard to change ObjectSerializer so that the commons-codec
 code used is compatible with both versions...we could just in-line some of
 the Base64 code, and comment accordingly.

 That said, we also should be clear about what versions we support, but 6-12
 months seems short. The upgrade cycles on Hadoop are really, really long.


 2013/2/20 Prashant Kommireddi prash1...@gmail.com

 Agreed, that makes sense. Perhaps we support an older Hadoop version for 1
 or 2 Pig releases before moving to a newer/stable version?

 Having said that, should we use 0.11 period to communicate the same to the
 community and start moving on 0.12 onwards? I know we are way past 6-12
 months (1-2 release) time frame with 0.20.2, but we also need to make sure
 users are aware and plan accordingly.

 I'd also be interested to hear how other projects (Hive, Oozie) are
 handling this.

 -Prashant

 On Tue, Feb 19, 2013 at 3:22 PM, Olga Natkovich onatkov...@yahoo.com
 wrote:

  It seems that for each Pig release we need to agree and clearly state
  which Hadoop versions it will support. I guess the main question is how
 we
  decide on this. Perhaps we should say that Pig no longer supports older
  Hadoop versions once the newer one is out for at least 6-12 month to make
  sure it is stable. I don't think we can support old versions
 indefinitely.
  It is in everybody's interest to keep moving forward.
 
  Olga
 
 
  
   From: Prashant Kommireddi prash1...@gmail.com
  To: dev@pig.apache.org
  Sent: Tuesday, February 19, 2013 10:57 AM
  Subject: Re: pig 0.11 candidate 2 feedback: Several problems
 
  What do you guys feel about the JIRA to do with 0.20.2 compatibility
  (PIG-3194)? I am interested in discussing the strategy around backward
  compatibility as this is something that would haunt us each time we move
 to
  the next hadoop version. For eg, we might be in a similar situation while
  moving to Hadoop 2.0, when some of the stuff might break for 1.0.
 
  I feel it would be good to get this JIRA fix in for 0.11, as 0.20.2 users
  might be caught unaware. Of course, I must admit there is selfish
 interest
  here and it's probably easier for us to have a workaround on Pig rather
  than upgrade hadoop in all our production DCs.
 
  -Prashant
 
 
  On Tue, Feb 19, 2013 at 9:54 AM, Russell Jurney 
 russell.jur...@gmail.com
  wrote:
 
   I think someone should step up and fix the easy ones, if possible.
  
  
   On Tue, Feb 19, 2013 at 9:51 AM, Bill Graham billgra...@gmail.com
  wrote:
  
Thanks Kai for reporting these.
   
What do people think about the severity of these issues w.r.t. Pig
 11?
  I
see a few possible options:
   
1. We include some or all of these patches in a new Pig 11 rc. We'd
  want
   to
make sure that they don't destabilize the current branch. This
 approach
makes sense if we think Pig 11 wouldn't be a good release without one
  or
more of these included.
   
2. We continue with the Pig 11 release without these, but then
 include

Re: pig 0.11 candidate 2 feedback: Several problems

2013-02-19 Thread Olga Natkovich

It seems that for each Pig release we need to agree and clearly state which 
Hadoop versions it will support. I guess the main question is how we decide on 
this. Perhaps we should say that Pig no longer supports older Hadoop versions 
once the newer one has been out for at least 6-12 months to make sure it is stable. I 
don't think we can support old versions indefinitely. It is in everybody's 
interest to keep moving forward.

Olga



 From: Prashant Kommireddi prash1...@gmail.com
To: dev@pig.apache.org 
Sent: Tuesday, February 19, 2013 10:57 AM
Subject: Re: pig 0.11 candidate 2 feedback: Several problems
 
What do you guys feel about the JIRA to do with 0.20.2 compatibility
(PIG-3194)? I am interested in discussing the strategy around backward
compatibility as this is something that would haunt us each time we move to
the next Hadoop version. For example, we might be in a similar situation while
moving to Hadoop 2.0, when some of the stuff might break for 1.0.

I feel it would be good to get this JIRA fix in for 0.11, as 0.20.2 users
might be caught unaware. Of course, I must admit there is selfish interest
here and it's probably easier for us to have a workaround on Pig rather
than upgrade hadoop in all our production DCs.

-Prashant


On Tue, Feb 19, 2013 at 9:54 AM, Russell Jurney russell.jur...@gmail.com wrote:

 I think someone should step up and fix the easy ones, if possible.


 On Tue, Feb 19, 2013 at 9:51 AM, Bill Graham billgra...@gmail.com wrote:

  Thanks Kai for reporting these.
 
  What do people think about the severity of these issues w.r.t. Pig 11? I
  see a few possible options:
 
  1. We include some or all of these patches in a new Pig 11 rc. We'd want
 to
  make sure that they don't destabilize the current branch. This approach
  makes sense if we think Pig 11 wouldn't be a good release without one or
  more of these included.
 
  2. We continue with the Pig 11 release without these, but then include
 one
  or more in a 0.11.1 release.
 
  3. We continue with the Pig 11 release without these, but then include
 them
  in a 0.12 release.
 
  Jon has a patch for the MAP issue
  (PIG-3144, https://issues.apache.org/jira/browse/PIG-3144)
  ready, which seems like the most pressing of the three to me.
 
  thanks,
  Bill
 
  On Mon, Feb 18, 2013 at 2:27 AM, Kai Londenberg 
  kai.londenb...@googlemail.com wrote:
 
   Hi,
  
   I just subscribed to the dev mailing list in order to give you some
   feedback on pig 0.11 candidate 2.
  
   The following three issues are currently present in 0.11 candidate 2:
  
   https://issues.apache.org/jira/browse/PIG-3144 - 'Erroneous map entry
   alias resolution leading to Duplicate schema alias errors'
   https://issues.apache.org/jira/browse/PIG-3194 - Changes to
   ObjectSerializer.java break compatibility with Hadoop 0.20.2
   https://issues.apache.org/jira/browse/PIG-3195 - Race Condition in
   PhysicalOperator leads to ExecException Error while trying to get
   next result in POStream
  
   The last two of these are easily solvable (see the tickets for
   details on that). The first one is a bit trickier I think, but at
   least there is a workaround for it (pass Map fields through an UDF)
  
   In my personal opinion, each of these problems is pretty severe, but
   opinions about the importance of the MAP Datatype and STREAM Operator,
   as well as Hadoop 0.20.2 compatibility might differ.
  
   so far ..
  
   Kai Londenberg
  
 
 
 
  --
  *Note that I'm no longer using my Yahoo! email address. Please email me
 at
  billgra...@gmail.com going forward.*
 



 --
 Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
 datasyndrome.com

[jira] [Commented] (PIG-2353) RANK function like in SQL

2013-01-07 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13546114#comment-13546114
 ] 

Olga Natkovich commented on PIG-2353:
-

I believe we agreed that documentation changes would be included and reviewed as part 
of the patch. Since that was not done here, we need a separate patch 
for the docs.

 RANK function like in SQL
 -

 Key: PIG-2353
 URL: https://issues.apache.org/jira/browse/PIG-2353
 Project: Pig
  Issue Type: New Feature
Reporter: Gianmarco De Francisci Morales
Assignee: Allan Avendaño
  Labels: gsoc2012, mentor
 Fix For: 0.11

 Attachments: PIG-2353-2, PIG-2353-3.txt, PIG-2353-4.txt, 
 PIG-2353-5.txt, PIG2353.patch


 Implement a function that given a (sorted) bag adds to each tuple a unique, 
 increasing identifier without gaps, like what RANK does for SQL.
 This is a candidate project for Google summer of code 2012. More information 
 about the program can be found at 
 https://cwiki.apache.org/confluence/display/PIG/GSoc2012
 Functionality implemented so far, is available at 
 https://reviews.apache.org/r/5523/diff/#index_header
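
 As a rough illustration of the behaviour requested above (not Pig's actual
 implementation, which lives in the attached patches), the Python sketch below
 sorts a bag of tuples and prefixes each tuple with a 1-based identifier that
 increases by one per tuple, so identifiers are unique and have no gaps. The
 helper name rank_bag and the sample data are hypothetical.

```python
def rank_bag(bag, key=None):
    """Return the tuples of `bag` in sorted order, each prefixed with a
    unique, gap-free, 1-based identifier (hypothetical helper)."""
    # sorted() is stable, so tuples with equal keys keep their input order.
    ordered = sorted(bag, key=key)
    # enumerate(..., start=1) yields 1, 2, 3, ... with no gaps.
    return [(i,) + tuple(t) for i, t in enumerate(ordered, start=1)]


# Sample bag of (name, score) tuples, ranked by ascending score.
bag = [("carol", 70), ("alice", 90), ("bob", 70)]
ranked = rank_bag(bag, key=lambda t: t[1])
```

 Note that in this sketch ties still receive distinct identifiers, which
 matches the "unique" wording of the description rather than SQL's tie-sharing
 RANK.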


[jira] [Commented] (PIG-2764) Add a biginteger and bigdecimal type to pig

2012-12-18 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13535085#comment-13535085
 ] 

Olga Natkovich commented on PIG-2764:
-

I think having support for BigInteger would be very helpful. We have had 
requests for it within Yahoo. 

 Add a biginteger and bigdecimal type to pig
 ---

 Key: PIG-2764
 URL: https://issues.apache.org/jira/browse/PIG-2764
 Project: Pig
  Issue Type: Improvement
Reporter: Jonathan Coveney
Assignee: Jonathan Coveney
 Attachments: fixedpoint.patch, PIG-2764-0.patch, PIG-2764-1.patch


 I think it would be useful for applications where precision is more important 
 than speed to have the option of using java's bigdecimal and biginteger types 
 natively.
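
 To make the precision-versus-speed trade-off concrete, here is a small
 sketch. It uses Python's decimal module and arbitrary-precision int as
 stand-ins for Java's BigDecimal and BigInteger (an assumption made purely
 for illustration; it says nothing about how Pig would expose these types).

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum is only approximately 0.3.
float_sum = 0.1 + 0.2
float_is_exact = (float_sum == 0.3)

# Decimal arithmetic on string-constructed values is exact here,
# much like java.math.BigDecimal built from strings.
exact_sum = Decimal("0.1") + Decimal("0.2")
decimal_is_exact = (exact_sum == Decimal("0.3"))

# Python ints are arbitrary precision, analogous to java.math.BigInteger:
# this value does not fit in a signed or unsigned 64-bit integer.
big = 2 ** 64 + 1
```

 The exact types are slower than native doubles and longs, which is the cost
 the description is willing to pay when precision matters more.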


[jira] [Commented] (PIG-2764) Add a biginteger and bigdecimal type to pig

2012-12-18 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13535274#comment-13535274
 ] 

Olga Natkovich commented on PIG-2764:
-

I agree with using the standard type.

 Add a biginteger and bigdecimal type to pig
 ---

 Key: PIG-2764
 URL: https://issues.apache.org/jira/browse/PIG-2764
 Project: Pig
  Issue Type: Improvement
Reporter: Jonathan Coveney
Assignee: Jonathan Coveney
 Attachments: fixedpoint.patch, PIG-2764-0.patch, PIG-2764-1.patch


 I think it would be useful for applications where precision is more important 
 than speed to have the option of using java's bigdecimal and biginteger types 
 natively.


Re: Our release process

2012-12-17 Thread Olga Natkovich

Hi Jonathan,

I thought I answered your email last week but I just noticed that the answer 
did not come through.

We tell users that it is coming in the next release. Now that Pig is quite 
mature and stable, we don't see much of this. Having more frequent releases 
definitely helps in this respect.

Olga






From: Jonathan Coveney jcove...@gmail.com
To: dev@pig.apache.org dev@pig.apache.org; Olga Natkovich 
onatkov...@yahoo.com 
Sent: Thursday, December 13, 2012 1:14 PM
Subject: Re: Our release process

Olga,

A related but separate question: what do y'all do when there is a feature
that is finished, but for an upcoming release? ie a feature in trunk, but
not in 0.11 (which, let us assume, is stable).

Jon


2012/12/13 Olga Natkovich onatkov...@yahoo.com

 Hi Julien,

 I think for us at Yahoo to be able to run our releases directly from the
 branch we would need the guarantees that I proposed in my initial email and
 something that we agreed to last year. The only changes that go in are

 - Failures without reasonable workarounds
 - Silent failures.

 My main concern with the proposal is that I do not believe that our
 current testing infra is robust/inclusive enough to catch errors. That's
 why I am hesitant to widen the scope.

 I am fine with whatever outcome the majority of people agree on. I
 am just saying that Yahoo will likely need a private branch if our rules
 are too relaxed.

 Olga



 - Original Message -
 From: Julien Le Dem jul...@twitter.com
 To: dev@pig.apache.org dev@pig.apache.org; Olga Natkovich 
 onatkov...@yahoo.com
 Cc:
 Sent: Wednesday, December 12, 2012 4:54 PM
 Subject: Re: Our release process

 Agreed. The priority of a change is subjective as well.
 My definition for inclusion on the release branch:
 - Only bug fixes.
 - Only if they have fairly understood repercussions (up to the committers
 who +/-1 as usual).
 - If we thought it would not break things but still does (CI or externally
 reported failure) we revert it.
 What do you want to add/change? Please reformulate those rules the way you
 like and let's see how we can converge.
 (Also, let's keep it short for clarity)

 Julien

 On Wed, Dec 12, 2012 at 11:08 AM, Olga Natkovich onatkov...@yahoo.com
 wrote:

  Hi Julien,
 
  I understand what you are trying to do and I can see that being able to
  make more fixes post release has value for some use cases. My concern is
  that things that do not destabilize the branch is fairly subjective and
  also not always easy to ascertain beyond trivial changes. The only way I
  know to keep a code stable is to limit the updates. Also we need to
 clearly
  state what the constrains are for a post release commits so that every
 user
  can decide whether it works for them.
 
  Olga
 
 
  
  From: Julien Le Dem jul...@twitter.com
  To: dev@pig.apache.org dev@pig.apache.org
  Sent: Wednesday, December 12, 2012 10:26 AM
  Subject: Re: Our release process
 
  I think we all agree here, let's not jump to conclusions.
  Everything in this branch I am talking about is in Apache Pig. Everything
  we do in Pig is contributed.
  We have a branch for 0.11 where we keep merging the official 0.11 branch
  plus a few patches (and it will stay small) that are only in Apache
 TRUNK.
  The goal here is to help keeping the release branch stable by not adding
  patches that are only useful to us.
  Having this branch allows us to fix anything quickly and redeploy to
  production. It is also what allows us to use the pig 0.11 branch in
  production before it is even released.
  This definitely benefits the community and helps making 0.11 stable.
  This is a very reasonable way to keep using a recent version of Pig in
  production.
 
  Olga: My goal is to decrease the scope of what is going in the release
  branch and to make sure we add only bug fixes that are not making it
  unstable. I also think having a short definition of this helps which is
 why
  I have been chiming in.
  Let us know how you want to decrease the scope. I'm just trying to
 simplify
  here.
 
  Julien
 
 
 
  On Tue, Dec 11, 2012 at 8:54 AM, Prashant Kommireddi 
 prash1...@gmail.com
  wrote:
 
   Share the same concern as Russell here. Not great for the project for
   everyone to go private branch approach.
  
   On Tue, Dec 11, 2012 at 8:33 AM, Russell Jurney 
  russell.jur...@gmail.com
   wrote:
  
Wait. Ack. Do we want everyone to do this? This sounds like
   fragmentation.
:(
   
Russell Jurney twitter.com/rjurney
   
   
On Dec 10, 2012, at 3:24 PM, Olga Natkovich onatkov...@yahoo.com
   wrote:
   
 If everybody is using a private branch then

 (1) We are not serving a significant part of our community
 (2) There is no motivation to contribute those patches to branches
   (only
to trunk).

 Yahoo has been trying hard to work of the Apache branches but if we
increase the scope of what is going

[jira] [Commented] (PIG-2764) Add a biginteger and bigdecimal type to pig

2012-12-14 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13532746#comment-13532746
 ] 

Olga Natkovich commented on PIG-2764:
-

Is anybody working on this or planning to in the near future?

 Add a biginteger and bigdecimal type to pig
 ---

 Key: PIG-2764
 URL: https://issues.apache.org/jira/browse/PIG-2764
 Project: Pig
  Issue Type: Improvement
Reporter: Jonathan Coveney
Assignee: Jonathan Coveney
 Attachments: fixedpoint.patch, PIG-2764-0.patch, PIG-2764-1.patch


 I think it would be useful for applications where precision is more important 
 than speed to have the option of using java's bigdecimal and biginteger types 
 natively.


Re: Our release process

2012-12-13 Thread Olga Natkovich

Hi Julien,

I think for us at Yahoo to be able to run our releases directly from the branch, 
we would need the guarantees that I proposed in my initial email and that we 
agreed to last year. The only changes that go in are fixes for:

- Failures without reasonable workarounds
- Silent failures.

My main concern with the proposal is that I do not believe that our current 
testing infra is robust/inclusive enough to catch errors. That's why I 
am hesitant to widen the scope.

I am fine with whatever outcome the majority of people agree on. I am 
just saying that Yahoo will likely need a private branch if our rules are too 
relaxed.

Olga



- Original Message -
From: Julien Le Dem jul...@twitter.com
To: dev@pig.apache.org dev@pig.apache.org; Olga Natkovich 
onatkov...@yahoo.com
Cc: 
Sent: Wednesday, December 12, 2012 4:54 PM
Subject: Re: Our release process

Agreed. The priority of a change is subjective as well.
My definition for inclusion on the release branch:
- Only bug fixes.
- Only if they have fairly understood repercussions (up to the committers
who +/-1 as usual).
- If we thought it would not break things but still does (CI or externally
reported failure) we revert it.
What do you want to add/change? Please reformulate those rules the way you
like and let's see how we can converge.
(Also, let's keep it short for clarity)

Julien

On Wed, Dec 12, 2012 at 11:08 AM, Olga Natkovich onatkov...@yahoo.com wrote:

 Hi Julien,

 I understand what you are trying to do and I can see that being able to
 make more fixes post release has value for some use cases. My concern is
 that whether a change destabilizes the branch is fairly subjective and
 also not always easy to ascertain beyond trivial changes. The only way I
 know to keep code stable is to limit the updates. Also, we need to clearly
 state what the constraints are for post-release commits so that every user
 can decide whether it works for them.

 Olga


 
 From: Julien Le Dem jul...@twitter.com
 To: dev@pig.apache.org dev@pig.apache.org
 Sent: Wednesday, December 12, 2012 10:26 AM
 Subject: Re: Our release process

 I think we all agree here, let's not jump to conclusions.
 Everything in this branch I am talking about is in Apache Pig. Everything
 we do in Pig is contributed.
 We have a branch for 0.11 where we keep merging the official 0.11 branch
 plus a few patches (and it will stay small) that are only in Apache TRUNK.
 The goal here is to help keeping the release branch stable by not adding
 patches that are only useful to us.
 Having this branch allows us to fix anything quickly and redeploy to
 production. It is also what allows us to use the pig 0.11 branch in
 production before it is even released.
 This definitely benefits the community and helps making 0.11 stable.
 This is a very reasonable way to keep using a recent version of Pig in
 production.

 Olga: My goal is to decrease the scope of what is going in the release
 branch and to make sure we add only bug fixes that are not making it
 unstable. I also think having a short definition of this helps which is why
 I have been chiming in.
 Let us know how you want to decrease the scope. I'm just trying to simplify
 here.

 Julien



 On Tue, Dec 11, 2012 at 8:54 AM, Prashant Kommireddi prash1...@gmail.com
 wrote:

  Share the same concern as Russell here. Not great for the project for
  everyone to go private branch approach.
 
  On Tue, Dec 11, 2012 at 8:33 AM, Russell Jurney 
 russell.jur...@gmail.com
  wrote:
 
   Wait. Ack. Do we want everyone to do this? This sounds like
  fragmentation.
   :(
  
   Russell Jurney twitter.com/rjurney
  
  
   On Dec 10, 2012, at 3:24 PM, Olga Natkovich onatkov...@yahoo.com
  wrote:
  
If everybody is using a private branch then
   
(1) We are not serving a significant part of our community
(2) There is no motivation to contribute those patches to branches
  (only
   to trunk).
   
Yahoo has been trying hard to work of the Apache branches but if we
   increase the scope of what is going into branches, we will go with
  private
   branch approach as well.
   
Olga
   
   

From: Julien Le Dem jul...@twitter.com
To: Olga Natkovich onatkov...@yahoo.com
Cc: dev@pig.apache.org dev@pig.apache.org; Santhosh M S 
   santhosh_mut...@yahoo.com; billgra...@gmail.com 
 billgra...@gmail.com
  
Sent: Friday, December 7, 2012 3:54 PM
Subject: Re: Our release process
   
Here's my criteria for inclusion in a release branch:
- no new feature. Only bug fixes.
- The criteria is more about stability than priority. The
 person/group
asking for it has a good reason for wanting it in the branch. If
   commiters
think the patch is reasonable and won't make the branch unstable then
  we
should check it in. If it breaks something anyway, we revert it.
   
For what it's worth we (at Twitter) maintain an internal branch

Re: Our release process

2012-12-12 Thread Olga Natkovich

Hi Julien,

I understand what you are trying to do and I can see that being able to make 
more fixes post release has value for some use cases. My concern is that 
whether a change destabilizes the branch is fairly subjective and also not 
always easy to ascertain beyond trivial changes. The only way I know to keep 
code stable is to limit the updates. Also, we need to clearly state what the 
constraints are for post-release commits so that every user can decide whether 
it works for them.

Olga



From: Julien Le Dem jul...@twitter.com
To: dev@pig.apache.org dev@pig.apache.org 
Sent: Wednesday, December 12, 2012 10:26 AM
Subject: Re: Our release process

I think we all agree here, let's not jump to conclusions.
Everything in this branch I am talking about is in Apache Pig. Everything
we do in Pig is contributed.
We have a branch for 0.11 where we keep merging the official 0.11 branch
plus a few patches (and it will stay small) that are only in Apache TRUNK.
The goal here is to help keep the release branch stable by not adding
patches that are only useful to us.
Having this branch allows us to fix anything quickly and redeploy to
production. It is also what allows us to use the pig 0.11 branch in
production before it is even released.
This definitely benefits the community and helps make 0.11 stable.
This is a very reasonable way to keep using a recent version of Pig in
production.

Olga: My goal is to decrease the scope of what is going in the release
branch and to make sure we add only bug fixes that are not making it
unstable. I also think having a short definition of this helps which is why
I have been chiming in.
Let us know how you want to decrease the scope. I'm just trying to simplify
here.

Julien



On Tue, Dec 11, 2012 at 8:54 AM, Prashant Kommireddi prash1...@gmail.com wrote:

 Share the same concern as Russell here. Not great for the project for
 everyone to go private branch approach.

 On Tue, Dec 11, 2012 at 8:33 AM, Russell Jurney russell.jur...@gmail.com
 wrote:

  Wait. Ack. Do we want everyone to do this? This sounds like
 fragmentation.
  :(
 
  Russell Jurney twitter.com/rjurney
 
 
  On Dec 10, 2012, at 3:24 PM, Olga Natkovich onatkov...@yahoo.com
 wrote:
 
   If everybody is using a private branch then
  
   (1) We are not serving a significant part of our community
   (2) There is no motivation to contribute those patches to branches
 (only
  to trunk).
  
   Yahoo has been trying hard to work off the Apache branches but if we
  increase the scope of what is going into branches, we will go with
 private
  branch approach as well.
  
   Olga
  
  
   
   From: Julien Le Dem jul...@twitter.com
   To: Olga Natkovich onatkov...@yahoo.com
   Cc: dev@pig.apache.org dev@pig.apache.org; Santhosh M S 
  santhosh_mut...@yahoo.com; billgra...@gmail.com billgra...@gmail.com
 
   Sent: Friday, December 7, 2012 3:54 PM
   Subject: Re: Our release process
  
   Here's my criteria for inclusion in a release branch:
   - no new feature. Only bug fixes.
   - The criteria is more about stability than priority. The person/group
   asking for it has a good reason for wanting it in the branch. If
   committers think the patch is reasonable and won't make the branch
   unstable then we should check it in. If it breaks something anyway, we
   revert it.
  
   For what it's worth we (at Twitter) maintain an internal branch where we
   add patches we need and I would suggest anybody that wants to be able to
   make emergency fixes to their own deployment to do the same. We do keep
   that branch as close to apache as we can but it has a few patches that are
   in trunk only and do not satisfy the no new feature criteria.
  
   What does the PMC think ?
  
   Julien
  
  
  
  
   On Tue, Dec 4, 2012 at 12:46 PM, Olga Natkovich onatkov...@yahoo.com
  wrote:
  
   I am ok with tests running nightly and reverting patches that cause
   failures. We used to have that. Does anybody know what happened? Is anybody
   volunteering to make it work again?
  
   I would like to see specific criteria for what goes into the branch be
   published (rather than case-by-case). This way each team can decide if the
   criteria are stringent enough or if they need to run a private branch.
  
   Olga
  
      --
   *From:* Santhosh M S santhosh_mut...@yahoo.com
   *To:* Julien Le Dem jul...@twitter.com; dev@pig.apache.org 
   dev@pig.apache.org
   *Cc:* billgra...@gmail.com billgra...@gmail.com
   *Sent:* Friday, November 30, 2012 11:46 PM
  
   *Subject:* Re: Our release process
  
   Hi Julien,
  
   You are making most of the points that I did on this thread (CI for e2e,
   not burdening clean e2e prior to every commit for a release branch). The
   only point on which there is no clear agreement is the definition of a bug
   that can be included in a previously released branch. I am fine with a case
   by case

Re: Our release process

2012-12-04 Thread Olga Natkovich

Subject: Re: Our release process

 I agree releasing often is ideal, but releasing major versions once a month
 would be a bit aggressive.

 +1 to Olga's initial definition of how Yahoo! determines what goes into a
 released branch. Basically is something broken without a workaround or is
 there potential silent data loss. Trying to get a more granular definition
 than that (i.e. P1, P2, severity, etc) will be painful. The reality in that
 case is that whoever is blocked by the bug will consider it a P1.

 Fixes need to be relatively low-risk though to keep stability, but this is
 also subjective. For this I'm in favor of relying on developer and reviewer
 judgement to make that call and I'm +1 to Alan's proposal of rolling back
 patches that break the e2e tests or anything else.

 I think our policy should avoid time-based consideration of how many
 quarters away we are from the next major release since that's also
 impossible to quantify. Plus, if the answer to whether we're more than
 1-2 quarters from the next release is yes, then we should be fixing that
 release problem.


 On Wed, Nov 28, 2012 at 10:22 AM, Julien Le Dem jul...@twitter.com wrote:

 I would really like to see us doing frequent releases (at least once
 per quarter if not once a month).
 I think the whole notion of priority or being a blocker is subjective.
 Releasing infrequently pressures us to push more changes than we would
 want to the release branch.
 We should focus on keeping TRUNK stable as well so that it is easier
 to release and users can do more frequent and smaller upgrades.

 There should be a small enough number of patches going in the release
 branch so that we can get agreement on whether we check them in or
 not.
 I like Alan's proposal of reverting quickly when there's a problem.
 Again, this becomes less of a problem if we release more often.

 Which leads me to my next question: what are the next steps for
 releasing pig 0.11 ?

 Julien

 On Tue, Nov 27, 2012 at 10:22 PM, Santhosh M S
 santhosh_mut...@yahoo.com wrote:
  Hi Olga,
 
  For a moment, I will move away from P1 and P2 which are related to
 priorities and use the Severity definitions.
 
  The standard bugzilla definitions for severity are:
 
  Blocker - Blocks development and/or testing work.
  Critical - Crashes, loss of data, severe memory leak.
  Major - Major loss of function.
 
  I am skipping the other levels (normal, minor and trivial) for this
  discussion.
 
  Coming back to priorities, the proposed definitions map P1 to Blocker
  and Critical. I am proposing mapping P2 to Major even when there are known
  workarounds. We are doing this since JIRA does not have severity by default
  (see: https://confluence.atlassian.com/pages/viewpage.action?pageId=192840)
 
  I am proposing that P2s be included in the released branch only when
 trunk or unreleased versions are known to be backward incompatible or if
 the release is more than a quarter (or two) away.
 
  Thanks,
  Santhosh


  
   From: Olga Natkovich onatkov...@yahoo.com
  To: dev@pig.apache.org dev@pig.apache.org; Santhosh M S 
 santhosh_mut...@yahoo.com
  Sent: Tuesday, November 27, 2012 10:41 AM
  Subject: Re: Our release process
 
  Hi Santhosh,
 
  What is your definition of P2s?
 
  Olga
 
 
  - Original Message -
  From: Santhosh M S santhosh_mut...@yahoo.com
  To: dev@pig.apache.org dev@pig.apache.org; Olga Natkovich 
 onatkov...@yahoo.com
  Cc:
  Sent: Monday, November 26, 2012 11:49 PM
  Subject: Re: Our release process
 
  Hi Olga,
 
  I agree that we cannot guarantee backward compatibility upfront. With
 that knowledge, I am proposing a small modification to your proposal.


  1. If the trunk or unreleased version is known to be backwards
 compatible then only P1 issues go into the released branch.
  2. If the trunk or unreleased version is known to be backwards
 incompatible or the release is a long ways off (two quarters?) then we
 should allow for dot releases on the branch, i.e., P1 and P2 issues.
 
  I am hoping that should provide an incentive for users to move to a
 higher release and at the same time allow developers to fix issues of
 significance without impacting stability.
 
  Thanks,
  Santhosh
 
 
  
  From: Olga Natkovich onatkov...@yahoo.com
  To: dev@pig.apache.org dev@pig.apache.org
  Sent: Monday, November 26, 2012 9:38 AM
  Subject: Re: Our release process
 
  Hi Santhosh,
 
  I understand the compatibility issue though I am not sure we can
 guarantee it for all releases upfront but agree that we should make an
 effort.
 
  On the e2e tests, part of the proposal is only to make P1 type of
 changes to the branch after the initial release so they should be rare.
 
  Olga
 


  
  From: Santhosh M S santhosh_mut...@yahoo.com
  To: Olga Natkovich onatkov...@yahoo.com; dev@pig.apache.org 
 dev@pig.apache.org
  Sent: Monday, November 26, 2012 12:00 AM

Re: Our release process

2012-11-27 Thread Olga Natkovich

Hi Santhosh,

What is your definition of P2s?

Olga

- Original Message -
From: Santhosh M S santhosh_mut...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org; Olga Natkovich 
onatkov...@yahoo.com
Cc: 
Sent: Monday, November 26, 2012 11:49 PM
Subject: Re: Our release process

Hi Olga,

I agree that we cannot guarantee backward compatibility upfront. With that 
knowledge, I am proposing a small modification to your proposal.

1. If the trunk or unreleased version is known to be backwards compatible then 
only P1 issues go into the released branch.
2. If the trunk or unreleased version is known to be backwards incompatible 
or the release is a long ways off (two quarters?) then we should allow for dot 
releases on the branch, i.e., P1 and P2 issues.
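
A minimal sketch of the two rules above, in Python. The function name, the
parameter names, and the two-quarter threshold (taken from the tentative "two
quarters?" in rule 2) are assumptions for illustration only, not an agreed
policy.

```python
def allowed_priorities(next_release_backward_compatible,
                       quarters_until_release,
                       far_off_quarters=2):
    """Return the issue priorities that may go into a released branch.

    Hypothetical helper: the threshold of two quarters is an assumption
    based on the "(two quarters?)" remark in the proposal.
    """
    far_off = quarters_until_release >= far_off_quarters
    if not next_release_backward_compatible or far_off:
        # Rule 2: allow dot releases on the branch (P1 and P2 issues).
        return {"P1", "P2"}
    # Rule 1: only P1 issues go into the released branch.
    return {"P1"}
```

For example, a backward compatible trunk with a release one quarter away would
admit only P1 fixes, while a backward incompatible trunk would admit P1 and P2
fixes regardless of timing.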

I am hoping that should provide an incentive for users to move to a higher 
release and at the same time allow developers to fix issues of significance 
without impacting stability.

Thanks,
Santhosh

From: Olga Natkovich onatkov...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org 
Sent: Monday, November 26, 2012 9:38 AM
Subject: Re: Our release process

Hi Santhosh,

I understand the compatibility issue though I am not sure we can guarantee it 
for all releases upfront but agree that we should make an effort.

On the e2e tests, part of the proposal is only to make P1 type of changes to 
the branch after the initial release so they should be rare.

Olga

From: Santhosh M S santhosh_mut...@yahoo.com
To: Olga Natkovich onatkov...@yahoo.com; dev@pig.apache.org 
dev@pig.apache.org 
Sent: Monday, November 26, 2012 12:00 AM
Subject: Re: Our release process

It takes too long to run. If the e2e tests are run every night or a reasonable 
timeframe then it will reduce the barrier for submitting patches. The context 
for this: the reluctance of folks to move to a higher version when the higher 
version is not backward compatible.

Santhosh

From: Olga Natkovich onatkov...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org; Santhosh M S 
santhosh_mut...@yahoo.com 
Sent: Sunday, November 25, 2012 5:56 PM
Subject: Re: Our release process

Hi Santhosh,

Can you clarify why running e2e tests on every check-in is a problem?

Olga

From: Santhosh M S santhosh_mut...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org 
Sent: Monday, November 19, 2012 3:48 PM
Subject: Re: Our release process

The push for an upgrade will work only if the higher release is backward 
compatible with the lower release. If not, folks will tend to use private 
branches. Having a stable branch on a large deployment is a good indicator of 
stability. However, please note that there have been instances where some 
releases were never adopted. I will be extremely careful in applying the rule of
running e2e tests for every commit to a released branch.

If we release every quarter (hopefully) and preserve backward compatibility 
then I am +1 to the proposal. If the backward compatibility is not preserved 
then I am -1 for having to run e2e for every commit to a released branch.

Santhosh

From: Jonathan Coveney jcove...@gmail.com
To: dev@pig.apache.org dev@pig.apache.org 
Sent: Tuesday, November 6, 2012 6:34 PM
Subject: Re: Our release process

I think it might be good to clarify (for me) a couple of cases:

1. we have branched a new release
2. an existing release

The way I understand things, in the case of 1, we have a backlog of patches
(not all of which are P1 bugs), and that's ok. If a new bad bug comes in
(the subject of debate here), then it goes in anyway (and in some cases,
would go into 0.9 etc).

Olga is saying that for existing releases (0.9, 0.10), we should only commit
P1 bug fixes there. This makes sense to me, as we're fixing the official
release in place.

IMHO, this would encourage people to use newer releases (as this is where
the latest and greatest stuff is, including non-critical bug fixes). Olga's
criteria is a pretty clear barrier for inclusion into these releases. With
old releases, I think the key is really that they keep doing what they have
always done. Most bugs are well understood by now, and the ones that aren't
will no doubt be P1.

I'm not decided (thus no formal +1 or whatnot), but Olga's point seems
pretty reasonable to me, especially given that trunk has pretty liberal
development. Once it gets tidied up, I can understand not wanting to jostle
it.

2012/11/5 Alan Gates ga...@hortonworks.com

 Jonathan, for clarity, are you saying you agree that we should only put
 bug fixes in branches or we should only put high priority bug fixes in
 branches?  I think we all agree on the former, but there appear to be
 different views on the latter.

 Alan.

 On Nov 5, 2012, at 4:53 PM, Jonathan Coveney wrote:

  This seems to make sense to me. People can always back-port features

Re: Our release process

2012-11-26 Thread Olga Natkovich

Hi Santhosh,

I understand the compatibility issue though I am not sure we can guarantee it 
for all releases upfront but agree that we should make an effort.

On the e2e tests, part of the proposal is only to make P1 type of changes to 
the branch after the initial release so they should be rare.

Olga



 From: Santhosh M S santhosh_mut...@yahoo.com
To: Olga Natkovich onatkov...@yahoo.com; dev@pig.apache.org 
dev@pig.apache.org 
Sent: Monday, November 26, 2012 12:00 AM
Subject: Re: Our release process
 

It takes too long to run. If the e2e tests are run every night or a reasonable 
timeframe then it will reduce the barrier for submitting patches. The context 
for this: the reluctance of folks to move to a higher version when the higher 
version is not backward compatible.

Santhosh



 From: Olga Natkovich onatkov...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org; Santhosh M S 
santhosh_mut...@yahoo.com 
Sent: Sunday, November 25, 2012 5:56 PM
Subject: Re: Our release process
 
Hi Santhosh,

Can you clarify why running e2e tests on every check-in is a problem?

Olga



From: Santhosh M S santhosh_mut...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org 
Sent: Monday, November 19, 2012 3:48 PM
Subject: Re: Our release process

The push for an upgrade will work only if the higher release is backward 
compatible with the lower release. If not, folks will tend to use private 
branches. Having a stable branch on a large deployment is a good indicator of 
stability. However, please note that there have been instances where some 
releases were never adopted. I will be extremely careful in applying the rule of
running e2e tests for every commit to a released branch.

If we release every quarter (hopefully) and preserve backward compatibility 
then I am +1 to the proposal. If the backward compatibility is not preserved 
then I am -1 for having to run e2e for every commit to a released branch.

Santhosh



From: Jonathan Coveney jcove...@gmail.com
To: dev@pig.apache.org dev@pig.apache.org 
Sent: Tuesday, November 6, 2012 6:34 PM
Subject: Re: Our release process

I think it might be good to clarify (for me) a couple of cases:

1. we have branched a new release
2. an existing release

The way I understand things, in the case of 1, we have a backlog of patches
(not all of which are P1 bugs), and that's ok. If a new bad bug comes in
(the subject of debate here), then it goes in anyway (and in some cases,
would go into 0.9 etc).

Olga is saying that for existing releases (0.9, 0.10), we should only commit
P1 bug fixes there. This makes sense to me, as we're fixing the official
release in place.

IMHO, this would encourage people to use newer releases (as this is where
the latest and greatest stuff is, including non-critical bug fixes). Olga's
criteria is a pretty clear barrier for inclusion into these releases. With
old releases, I think the key is really that they keep doing what they have
always done. Most bugs are well understood by now, and the ones that aren't
will no doubt be P1.

I'm not decided (thus no formal +1 or whatnot), but Olga's point seems
pretty reasonable to me, especially given that trunk has pretty liberal
development. Once it gets tidied up, I can understand not wanting to jostle
it.


2012/11/5 Alan Gates ga...@hortonworks.com

 Jonathan, for clarity, are you saying you agree that we should only put
 bug fixes in branches or we should only put high priority bug fixes in
 branches?  I think we all agree on the former, but there appear to be
 different views on the latter.

 Alan.

 On Nov 5, 2012, at 4:53 PM, Jonathan Coveney wrote:

  This seems to make sense to me. People can always back-port features, and
  this encourages them to use the newer ones. It also means we will be more
  rigorous about stability, which is good as it is a big plus for Pig. I
  think for older branches, stability trumps features in a big way.

 
 
  2012/11/5 Gianmarco De Francisci Morales g...@apache.org
 
  Hi,
 
  On Mon, Nov 5, 2012 at 10:48 AM, Olga Natkovich onatkov...@yahoo.com
  wrote:
  Hi Gianmarco,
 
  Thanks for your comments. Here is a little more information.
 
  At Yahoo, we consider the following issues to be P1:
 
  (1) Bugs that cause wrong results being produced silently
  (2) Bugs that cause failures with no easy workaround
 
 
  Thanks Olga, now I get what you mean.
  I don't have a strong opinion on this.
  On one hand I see why you don't want to put too many patches in the
  branches in order to keep things stable.
  On the other hand when we do a 0.10.x release with x > 0 the users would
  like to have as many bugs fixed as possible.
 
  Regarding tests. I would suggest we have different rules for trunk and
  branches:
 
  (1) For branches, I think we should run the full regression suite
  (including e2e) prior to commit. This way we can ensure branch stability

Re: Our release process

2012-11-25 Thread Olga Natkovich

Hi Santhosh,

Can you clarify why running e2e tests on every check-in is a problem?

Olga

From: Santhosh M S santhosh_mut...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org 
Sent: Monday, November 19, 2012 3:48 PM
Subject: Re: Our release process

The push for an upgrade will work only if the higher release is backward 
compatible with the lower release. If not, folks will tend to use private 
branches. Having a stable branch on a large deployment is a good indicator of 
stability. However, please note that there have been instances where some 
releases were never adopted. I will be extremely careful in applying the rule 
of running e2e tests for every commit to a released branch.

If we release every quarter (hopefully) and preserve backward compatibility 
then I am +1 to the proposal. If the backward compatibility is not preserved 
then I am -1 for having to run e2e for every commit to a released branch.

Santhosh

From: Jonathan Coveney jcove...@gmail.com
To: dev@pig.apache.org dev@pig.apache.org 
Sent: Tuesday, November 6, 2012 6:34 PM
Subject: Re: Our release process

I think it might be good to clarify (for me) a couple of cases:

1. we have branched a new release
2. an existing release

The way I understand things, in the case of 1, we have a backlog of patches
(not all of which are P1 bugs), and that's ok. If a new bad bug comes in
(the subject of debate here), then it goes in anyway (and in some cases,
would go into 0.9 etc).

Olga is saying that for existing releases (0.9, 0.10), we should only commit
P1 bug fixes there. This makes sense to me, as we're fixing the official
release in place.

IMHO, this would encourage people to use newer releases (as this is where
the latest and greatest stuff is, including non-critical bug fixes). Olga's
criteria is a pretty clear barrier for inclusion into these releases. With
old releases, I think the key is really that they keep doing what they have
always done. Most bugs are well understood by now, and the ones that aren't
will no doubt be P1.

I'm not decided (thus no formal +1 or whatnot), but Olga's point seems
pretty reasonable to me, especially given that trunk has pretty liberal
development. Once it gets tidied up, I can understand not wanting to jostle
it.

2012/11/5 Alan Gates ga...@hortonworks.com

 Jonathan, for clarity, are you saying you agree that we should only put
 bug fixes in branches or we should only put high priority bug fixes in
 branches?  I think we all agree on the former, but there appear to be
 different views on the latter.

 Alan.

 On Nov 5, 2012, at 4:53 PM, Jonathan Coveney wrote:

  This seems to make sense to me. People can always back-port features, and
  this encourages them to use the newer ones. It also means we will be more
  rigorous about stability, which is good as it is a big plus for Pig. I
  think for older branches, stability trumps features in a big way.

  2012/11/5 Gianmarco De Francisci Morales g...@apache.org

  Hi,

  On Mon, Nov 5, 2012 at 10:48 AM, Olga Natkovich onatkov...@yahoo.com
  wrote:
  Hi Gianmarco,

  Thanks for your comments. Here is a little more information.

  At Yahoo, we consider the following issues to be P1:

  (1) Bugs that cause wrong results being produced silently
  (2) Bugs that cause failures with no easy workaround

  Thanks Olga, now I get what you mean.
  I don't have a strong opinion on this.
  On one hand I see why you don't want to put too many patches in the
  branches in order to keep things stable.
  On the other hand when we do a 0.10.x release with x > 0 the users would
  like to have as many bugs fixed as possible.

  Regarding tests. I would suggest we have different rules for trunk and
  branches:

  (1) For branches, I think we should run the full regression suite
  (including e2e) prior to commit. This way we can ensure branch stability
  and, as the number of patches should be small, it will not be a burden
  (2) For trunk, we can go with test-commit only and fix things quickly
  when things break.

  I think this makes sense. +1

  Olga

  Cheers,
  --
  Gianmarco

Re: Our release process

2012-11-05 Thread Olga Natkovich

Hi Gianmarco,

Thanks for your comments. Here is a little more information.

At Yahoo, we consider the following issues to be P1:

(1) Bugs that cause wrong results being produced silently
(2) Bugs that cause failures with no easy workaround
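
As a rough illustration of the two P1 conditions above, here is a minimal
Python sketch. The `Bug` record and its field names are hypothetical and only
serve to make the rule explicit; they do not correspond to any JIRA field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bug:
    # Hypothetical fields, named only for this illustration.
    silent_wrong_results: bool  # wrong results produced without any error
    causes_failure: bool        # the job or script fails
    has_easy_workaround: bool   # users can sidestep the problem easily


def is_p1(bug):
    """Apply the two P1 conditions listed above."""
    if bug.silent_wrong_results:
        return True  # condition (1)
    # condition (2): a failure with no easy workaround
    return bug.causes_failure and not bug.has_easy_workaround
```

Under this reading, a failure that has an easy workaround would not qualify as
P1, while any silent wrong-result bug would.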

Regarding tests. I would suggest we have different rules for trunk and branches:

(1) For branches, I think we should run the full regression suite (including 
e2e) prior to commit. This way we can ensure branch stability and, as the number 
of patches should be small, it will not be a burden.
(2) For trunk, we can go with test-commit only and fix things quickly when 
things break.
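
The two testing rules above could be written down as a small lookup, sketched
here in Python. The target names and check labels are assumptions made for
illustration; only test-commit and e2e are names used in this thread.

```python
def required_checks(target):
    """Return the checks to run before committing to the given target.

    `target` is either "trunk" or "branch" (hypothetical labels).
    """
    if target == "trunk":
        # Rule (2): test-commit only; breakages get fixed quickly afterwards.
        return ("test-commit",)
    if target == "branch":
        # Rule (1): full regression suite, including e2e, prior to commit.
        return ("full-regression", "e2e")
    raise ValueError("unknown target: " + repr(target))
```

Keeping the rule in one place like this would make it easy to see that the
stricter gate applies only to release branches, where few patches are expected.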

Olga



From: Gianmarco De Francisci Morales g...@apache.org
To: dev@pig.apache.org; Olga Natkovich onatkov...@yahoo.com 
Sent: Monday, November 5, 2012 10:37 AM
Subject: Re: Our release process

Hi,

Sure we don't want to commit patches that destabilize the code base.
However, unfortunately, there is no way to know whether a patch will
destabilize the code or not. Even testing is only a heuristic. So how do we
draw the line?
We seem to agree that only bug fixing should go into branches. However it
seems that we have two different views on the policy: Olga is proposing to
have only P1 bugs fixed, while Alan is suggesting to be more lax on what
goes into the branches.
Regardless of the policy chosen, how do we define the priority of a bug? By
how many users are affected? By whether it can corrupt data? Is there a
formal definition we can agree on? Otherwise defining a policy becomes hard.

The test-commit task does not run full regression because the full test
suite takes too long to execute. And I agree that asking to run the full
test suite before committing any change slows down the (already slow)
review process.
However, I would be fine with running the full test suite for bug fixes
that need to go into branches, in order to guarantee absence of regressions.

Cheers,
--
Gianmarco



On Sun, Nov 4, 2012 at 5:17 PM, Olga Natkovich onatkov...@yahoo.com wrote:

 I can see how this would work for research projects but for real
 production this will not work. And I actually meant much more stringent
 stability. I don't think we should commit patches to either trunk or branch
 that destabilize the tree. We used to run full regression before each
 commit - is this no longer the case? By stability I meant very few things
 go into the branch. I know that pig has pretty decent tests - better
 coverage than many other projects. However, we do not have any testing at
 scale and inevitably, users end up doing testing. So any time we deploy new
 major version, it takes us at least a month to get it stable and once it is
 stabilized we want to keep it this way.

 So for us at Yahoo, the only way to work directly from the branch is to go
 by our original plan. If that is not possible, we would go with the private
 git branch.

 Olga


 
  From: Alan Gates ga...@hortonworks.com
 To: dev@pig.apache.org
 Sent: Friday, November 2, 2012 8:19 PM
 Subject: Re: Our release process

 I am all for maintaining stability of branches, and the trunk, as everyone
 benefits from it.  But I do not think this means we should limit bug fixing
 in the branches to only critical issues.  As Pig gets more users we have
 more and more people on older branches who will want fixes for bugs without
 dealing with bigger version changes.  So I am not in favor of limiting
 checkins to branches to P1 issues.

 What if we maintain stability on the branches by quickly reverting any
 patches that break the build, the unit tests, or the e2e tests?  This
 allows us to move forward with bug fix versions, it allows those who depend
 on branch stability (which I suspect is everyone in the distribution
 business plus everyone rolling their own Pig), and it should promote
 developer responsibility (no one likes having their patches reverted).

 Alan.

 On Nov 2, 2012, at 3:58 PM, Olga Natkovich wrote:

  Hi guys,
 
  Mid last year, we agreed on a release process documented in this thread:
 http://www.mail-archive.com/dev@pig.apache.org/msg04172.html.
 
  Since then, we have not really followed either of its two rules:
 
  (1) Frequent releases (every 3 months)
  (2) Branch stability (only P1 issues on the branch).
 
  So I wanted to revisit our release procedure to make sure we have one
 that we can actually follow.
 
  For us at Yahoo, branch stability is very important since we release all
 the patches directly from the branch. If we can't rely on the fact that
 only critical fixes go in, we will need to resort to git branches that will
 make the whole process very cumbersome because we now need to hand pick
 patches from the apache branch and port them onto our private branch. I
 would imagine that others using Pig in production would have similar issues.
 
  Olga
 
 
  Olga

Re: Our release process

2012-11-04 Thread Olga Natkovich

I can see how this would work for research projects but for real production 
this will not work. And I actually meant much more stringent stability. I don't 
think we should commit patches to either trunk or branch that destabilize the 
tree. We used to run full regression before each commit - is this no longer the 
case? By stability I meant very few things go into the branch. I know that pig 
has pretty decent tests - better coverage than many other projects. However, we 
do not have any testing at scale and inevitably, users end up doing testing. So 
any time we deploy new major version, it takes us at least a month to get it 
stable and once it is stabilized we want to keep it this way.

So for us at Yahoo, the only way to work directly from the branch is to go by 
our original plan. If that is not possible, we would go with the private git 
branch.

Olga



 From: Alan Gates ga...@hortonworks.com
To: dev@pig.apache.org 
Sent: Friday, November 2, 2012 8:19 PM
Subject: Re: Our release process
 
I am all for maintaining stability of branches, and the trunk, as everyone 
benefits from it.  But I do not think this means we should limit bug fixing in 
the branches to only critical issues.  As Pig gets more users we have more and 
more people on older branches who will want fixes for bugs without dealing with 
bigger version changes.  So I am not in favor of limiting checkins to branches 
to P1 issues.

What if we maintain stability on the branches by quickly reverting any patches 
that break the build, the unit tests, or the e2e tests?  This allows us to move 
forward with bug fix versions, it allows those who depend on branch stability 
(which I suspect is everyone in the distribution business plus everyone rolling 
their own Pig), and it should promote developer responsibility (no one likes 
having their patches reverted).

Alan.
  
On Nov 2, 2012, at 3:58 PM, Olga Natkovich wrote:

 Hi guys,
  
 Mid last year, we agreed on a release process documented in this thread: 
 http://www.mail-archive.com/dev@pig.apache.org/msg04172.html.
  
 Since then, we have not really followed either of its two rules:
  
 (1) Frequent releases (every 3 months)
 (2) Branch stability (only P1 issues on the branch).
  
 So I wanted to revisit our release procedure to make sure we have one that we 
 can actually follow.
  
 For us at Yahoo, branch stability is very important since we release all the 
 patches directly from the branch. If we can't rely on the fact that only 
 critical fixes go in, we will need to resort to git branches that will make 
 the whole process very cumbersome because we now need to hand pick patches 
 from the apache branch and port them onto our private branch. I would imagine 
 that others using Pig in production would have similar issues.
  
 Olga
  
  
 Olga

Re: Pig 0.11

2012-11-04 Thread Olga Natkovich

We are still at 43 open tickets. How would you guys like to proceed? 

Olga

- Original Message -
From: Olga Natkovich onatkov...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org; Olga Natkovich 
onatkov...@yahoo.com
Cc: 
Sent: Tuesday, October 30, 2012 9:07 AM
Subject: Re: Pig 0.11

We are down to 45 tickets. Thanks to everybody who helped with the cleanup. We 
only have a couple of unassigned in the area of documentation and testing. Now 
we need to go through the assigned ones and see what can be done for 0.11.

Here is a list of people with many tickets. Please review what you are planning 
to complete in the next couple of weeks and unlink the rest.

Daniel - 15+ 
John Gordon - 6
Jonathan Coveney - 6

Thanks,

Olga

From: Olga Natkovich onatkov...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org 
Sent: Friday, October 26, 2012 4:32 PM
Subject: Re: Pig 0.11

74 issues still open and more than half unassigned. I think we should narrow 
the list down next week. I am planning to start unlinking the unassigned ones 
next week so if you feel they need to be addressed, please, find an owner.

Olga

- Original Message -
From: Olga Natkovich onatkov...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org
Cc: 
Sent: Monday, October 22, 2012 10:14 AM
Subject: Re: Pig 0.11

There are still 76 unresolved JIRAs, more than half unassigned. Let's clean this 
up by the end of this week. I propose we do the following:

(1) Unlink all JIRAs for new features since we already branched so we should 
not be taking on new work. If people feel strongly that some new features still 
need to go in please bring it up.
(2) For bug fixes, if people feel strongly that some of the unassigned issues 
need to be addressed please take ownership. If you are unable to solve them but 
still feel they are important, please, bring them up.
(3) Owners of unresolved issues, please, take a look if you will have time to 
solve them in the next 2 weeks. If not, let's move them to 12. If you can't 
address them but feel they are important, please, bring it up.

Let's make sure that all JIRAs that require changes to the documentation have 
appropriate information in the release notes section so that we can quickly 
compile release documentation.

Thanks for your help!

Olga

From: Alan Gates ga...@hortonworks.com
To: dev@pig.apache.org 
Sent: Monday, October 15, 2012 11:55 AM
Subject: Re: Pig 0.11

At this point no one has taken on release documentation for 0.11.

Alan.

On Oct 15, 2012, at 11:49 AM, Olga Natkovich wrote:

 Thanks!

 Are you talking about items 15 and 16 on the How To Release.Publish  page? 

 Also, who is doing release documentation these days? I can help with that as 
 well. I would also be happy to roll the release if you guys need help with 
 that.

 Olga

 From: Dmitriy Ryaboy dvrya...@gmail.com
 To: dev@pig.apache.org dev@pig.apache.org 
 Cc: dev@pig.apache.org dev@pig.apache.org 
 Sent: Friday, October 12, 2012 5:59 PM
 Subject: Re: Pig 0.11

 Thanks Olga and welcome back! 
 I know there's some process for linking jiras to releases, but I'm not sure 
 what that is. If you could explain and maybe cover a portion of that work, 
 that'd be super helpful. And reviews, of course. 

 On Oct 12, 2012, at 2:06 PM, Olga Natkovich onatkov...@yahoo.com wrote:

 Dmitriy, I would be happy to help with the release process. Want to get back 
 into this now that I am back at work. Let me know what you would like me to 
 do.

 Olga

 From: Dmitriy Ryaboy dvrya...@gmail.com
 To: dev@pig.apache.org 
 Cc: billgra...@gmail.com 
 Sent: Thursday, October 11, 2012 2:44 PM
 Subject: Re: Pig 0.11

 Ok I will branch 0.11 tomorrow morning unless someone objects.
 From then on, committers should be careful to commit bug fixes to both
 0.11 branch and trunk; minor polish can go into the branch, but whole
 new features should not (we can discuss on the list if something is in
 the gray area).

 D

 On Thu, Oct 11, 2012 at 2:16 PM, Gianmarco De Francisci Morales
 g...@apache.org wrote:
 I added it as a dependency as it already has its own Jira.
 I hope it is OK.

 Cheers,
 --
 Gianmarco

 On Wed, Oct 10, 2012 at 11:23 PM, Bill Graham billgra...@gmail.com wrote:

 +1 for me.

 There's https://issues.apache.org/jira/browse/PIG-2756 which tracks a few
 documentation issues that should block Pig 0.11, but they can also be done
 on the trunk and merged to the branch. Gianmarco, you can add a rank
 subtask there to serve as a reminder.

 On Wed, Oct 10, 2012 at 11:03 PM, Gianmarco De Francisci Morales 
 g...@apache.org wrote:

 We are missing some documentation on RANK, but I guess we could add that
 to the branch and trunk in parallel.
 All the patches I was keeping an eye on are in.

 So +1 for me.
 --
 Gianmarco

 On Wed, Oct 10, 2012 at 5:31 PM, Jonathan Coveney

Our release process

2012-11-02 Thread Olga Natkovich

Hi guys,
 
Mid last year, we agreed on a release process documented in this thread: 
http://www.mail-archive.com/dev@pig.apache.org/msg04172.html.
 
Since then, we have not really followed either of its two rules:
 
(1) Frequent releases (every 3 months)
(2) Branch stability (only P1 issues on the branch).
 
So I wanted to revisit our release procedure to make sure we have one that we 
can actually follow.
 
For us at Yahoo, branch stability is very important since we release all the 
patches directly from the branch. If we can't rely on the fact that only 
critical fixes go in, we will need to resort to git branches, which will make the 
whole process very cumbersome because we will then need to hand-pick patches from the 
apache branch and port them onto our private branch. I would imagine that 
others using Pig in production would have similar issues.
 
Olga

[jira] [Updated] (PIG-2657) Print warning if using wrong jython version

2012-10-30 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2657:


Fix Version/s: (was: 0.10.1)
   (was: 0.11)
   0.12

Moving to 0.12 based on Rohini's recommendation. Please move it back if you feel 
it needs to make it into 0.11.

 Print warning if using wrong jython version
 ---

 Key: PIG-2657
 URL: https://issues.apache.org/jira/browse/PIG-2657
 Project: Pig
  Issue Type: Bug
Reporter: Fabian Alenius
 Fix For: 0.12

 Attachments: PIG-2657.1.patch, PIG-2657.2.patch


 Hi,
 It would be good if Pig would print a warning (or refuse to run) if you are 
 using an unsupported version of jython. I spent a couple of hours before 
 figuring out that you had to use 2.5.0. I've seen posts indicating that 
 others have run into this problem as well.
 Might write up a patch if others agree this is an issue.
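
The check suggested in this report could be as small as the following Python sketch. The function name `jython_version_warning` and the `SUPPORTED` constant are hypothetical, and the value 2.5.0 is taken only from the report above, not from Pig's documentation. How Pig itself would obtain the Jython version is not shown.

```python
# Hypothetical sketch: warn when the running Jython version is not the expected one.
# SUPPORTED is an assumption based on the report above ("you had to use 2.5.0").
SUPPORTED = (2, 5, 0)


def jython_version_warning(version, supported=SUPPORTED):
    """Return a warning string if `version` differs from `supported`, else None."""
    version = tuple(version[:3])  # keep (major, minor, micro) only
    if version == supported:
        return None
    found = ".".join(str(part) for part in version)
    wanted = ".".join(str(part) for part in supported)
    return "WARNING: Jython %s detected; this sketch assumes %s is required" % (found, wanted)


if __name__ == "__main__":
    import sys
    # sys.version_info[:3] is (major, minor, micro) on CPython and Jython alike.
    message = jython_version_warning(sys.version_info)
    if message:
        print(message)
```

Returning the message instead of printing it leaves the choice between warning and refusing to run to the caller, which matches the two options raised in the report.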

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (PIG-2521) explicit reference to namenode path with streaming results in an error

2012-10-30 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2521:


Fix Version/s: (was: 0.11)
   0.12

Moving to 0.12 since it is labelled as minor and has had no activity.

 explicit reference to namenode path with streaming results in an error
 --

 Key: PIG-2521
 URL: https://issues.apache.org/jira/browse/PIG-2521
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.2
Reporter: Araceli Henley
Priority: Minor
 Fix For: 0.12


 I set this to minor because this test works with client side tables and with 
 old style references.
 ::
 /grid/2/dev/pigqa/out/pigtest/hadoopqa/hadoopqa.1327441396/dotNext_baseline_15.pig
 ::
 THIS TEST FAILS. It uses an explicit reference to namenode1 
 (hdfs://namenode1.domain.com:8020)
 define CMD `perl PigStreamingDepend.pl` input(stdin) 
 ship('/homes/araceli/pigtest/pigtest_next/pigharness/dist/pig_harness/libexec/PigTest/PigStreamingDepend.pl',
  
 '/homes/araceli/pigtest/pigtest_next/pigharness/dist/pig_harness/libexec/PigTest/PigStreamingModule.pm');
 A = load 'hdfs://namdenode1.domain.com:8020/user/hadoopqa/pig/tests/data';
 B = stream A through `perl PigStreaming.pl`;
 C = stream B through CMD as (name, age, gpa);
 D = foreach C generate name, age;
 store D into 
 'hdfs://namenode1.domain.com:8020/user/hadoopqa/pig/out1/user/hadoopqa/pig/out/hadoopqa.1327441396/dotNext_baseline_15.out';
 fs -cp 
 hdfs://namenode1.domain.com:8020/user/hadoopqa/pig/out1/user/hadoopqa/pig/out/hadoopqa.1327441396/dotNext_baseline_15.out
  /user/hadoopqa/pig/out/hadoopqa.1327441396/dotNext_baseline_15.out
 ::
 /grid/2/dev/pigqa/out/pigtest/hadoopqa/hadoopqa.1327441396/dotNext_baseline_1.pig
 ::
 This test PASSES. It uses an explicit reference to 
 NN1(hdfs://namenode1.domain.com:8020) for load and store
 a = load 
 'hdfs://namenode1.domain.com:8020/user/hadoopqa/pig/tests/data/singlefile/studenttab10k'
  as (name, age, gpa);
 store a into 
 'hdfs://namenode1.domain.com:8020/user/hadoopqa/pig/out1/user/hadoopqa/pig/out/hadoopqa.1327441396/dotNext_baseline_1.out'
  ;
 fs -cp 
 hdfs://namenode1.domain.com:8020/user/hadoopqa/pig/out1/user/hadoopqa/pig/out/hadoopqa.1327441396/dotNext_baseline_1.out
  /user/hadoopqa/pig/out/hadoopqa.1327441396/dotNext_baseline_1.out
 THE REMAINING TESTS ARE IDENTICAL EXCEPT FOR THE FILE REFERENCE: explicit vs 
 mount point
 ::
  
 /grid/2/dev/pigqa/out/pigtest/hadoopqa/hadoopqa.1327433551/dotNext_baseline_15.pig
 ::
 This test PASSES. It's the baseline for the test; it uses old style references.
 define CMD `perl PigStreamingDepend.pl` input(stdin) 
 ship('/homes/araceli/pigtest/pigtest_next/pigharness/dist/pig_harness/libexec/PigTest/PigStreamingDepend.pl',
  
 '/homes/araceli/pigtest/pigtest_next/pigharness/dist/pig_harness/libexec/PigTest/PigStreamingModule.pm');
 A = load '/user/hadoopqa/pig/tests/data';
 B = stream A through `perl PigStreaming.pl`;
 C = stream B through CMD as (name, age, gpa);
 D = foreach C generate name, age;
 store D into 
 '/user/hadoopqa/pig/out/hadoopqa.1327433551/dotNext_baseline_15.out';
 ::
 grid/2/dev/pigqa/out/pigtest/hadoopqa/hadoopqa.1327431567/dotNext_baseline_15.pig
 ::
 This test PASSES. It uses a mount point to namenode 1( /data1 is a mount 
 point for hdfs://namenode1.domain.com:8020/user/hadoopqa/pig/tests/data).
 define CMD `perl PigStreamingDepend.pl` input(stdin) 
 ship('/homes/araceli/pigtest/pigtest_next/pigharness/dist/pig_harness/libexec/PigTest/PigStreamingDepend.pl',
  
 '/homes/araceli/pigtest/pigtest_next/pigharness/dist/pig_harness/libexec/PigTest/PigStreamingModule.pm');
 A = load '/data1';
 B = stream A through `perl PigStreaming.pl`;
 C = stream B through CMD as (name, age, gpa);
 D = foreach C generate name, age;
 store D into 
 '/out1/user/hadoopqa/pig/out/hadoopqa.1327431567/dotNext_baseline_15.out';
 fs -cp 
 /out1/user/hadoopqa/pig/out/hadoopqa.1327431567/dotNext_baseline_15.out 
 /user/hadoopqa/pig/out/hadoopqa.1327431567/dotNext_baseline_15.out

[jira] [Updated] (PIG-2830) Macros should work in Grunt

2012-10-30 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2830:


Fix Version/s: (was: 0.11)
   0.12

Moving to 0.12 since nobody is working on this

 Macros should work in Grunt
 ---

 Key: PIG-2830
 URL: https://issues.apache.org/jira/browse/PIG-2830
 Project: Pig
  Issue Type: Improvement
  Components: grunt, parser
Affects Versions: 0.10.0, 0.11, 0.10.1
Reporter: Russell Jurney
Priority: Minor
  Labels: fun, grunt, happy, macro, pants
 Fix For: 0.12


 It would be very helpful in writing Pig scripts if Grunt could load and use 
 Macros in an interactive session.

[jira] [Updated] (PIG-2687) Add relation/operator scoping to Pig

2012-10-30 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2687:


Fix Version/s: (was: 0.11)
   0.12

 Add relation/operator scoping to Pig
 

 Key: PIG-2687
 URL: https://issues.apache.org/jira/browse/PIG-2687
 Project: Pig
  Issue Type: Improvement
Reporter: Jonathan Coveney
Priority: Minor
 Fix For: 0.12


 The idea is to add a real notion of scope that can be used to manage 
 namespace. This would mean the addition of blocks to pig, probably with some 
 sort of syntax like this...
 {code}
 a = load thing as (x:int, y:int);
 b = foreach a generate x, y, x*y as z;
 {
   a = group b by z;
   b = foreach a generate COUNT(b);
   global b;
 }
 {code}
 which would replace the alias b with the nested b value in the scope. This 
 could also be used in nested foreach blocks, and macros could just become 
 blocks as well.
 I am 95% sure about how to implement this... I have a failed patch attempt, 
 and need to study a bit more about how Pig uses its logical operators.
 Any thoughts?
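
One way to picture the proposed block semantics is an alias table with an enclosing scope. The Python sketch below is only a model of the idea described above; the `Scope` class and its `export` method are hypothetical names, and nothing here reflects how Pig's logical operators are actually implemented.

```python
class Scope:
    """Alias table with an optional enclosing scope (hypothetical model)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.aliases = {}

    def assign(self, name, value):
        # Assignments inside a block stay local to that block.
        self.aliases[name] = value

    def lookup(self, name):
        # Search this scope first, then the enclosing scopes.
        scope = self
        while scope is not None:
            if name in scope.aliases:
                return scope.aliases[name]
            scope = scope.parent
        raise KeyError(name)

    def export(self, name):
        # Models the proposed `global b;`: copy the local alias to the parent.
        if self.parent is None:
            raise ValueError("no enclosing scope to export to")
        self.parent.aliases[name] = self.aliases[name]


outer = Scope()
outer.assign("a", "load thing")
outer.assign("b", "foreach a generate x, y, x*y as z")

block = Scope(parent=outer)
block.assign("a", "group b by z")                 # shadows the outer a
block.assign("b", "foreach a generate COUNT(b)")  # shadows the outer b
block.export("b")                                 # outer b is replaced

print(outer.lookup("a"))  # load thing
print(outer.lookup("b"))  # foreach a generate COUNT(b)
```

In this model the inner `a` never leaks out of the block, while `b` is replaced only because it was exported explicitly.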

[jira] [Updated] (PIG-2584) Command line arguments for Pig script

2012-10-30 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2584:


Fix Version/s: (was: 0.11)
   0.12

Moving to 0.12 since nobody is working on this

 Command line arguments for Pig script
 -

 Key: PIG-2584
 URL: https://issues.apache.org/jira/browse/PIG-2584
 Project: Pig
  Issue Type: Improvement
  Components: impl
Reporter: Daniel Dai
Priority: Minor
 Fix For: 0.12


 We did that for Jython embedded scripts. It is also useful in Pig script itself:
 command line: pig a.pig student.txt output
 a.pig:
 a = load '$1' as (a0, a1);
 store a into '$2';
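
The substitution requested above can be sketched in a few lines of Python. This is an illustration of the idea only: the helper name `substitute_positional` is hypothetical, and the sketch makes no claim about how Pig's own parameter substitution works.

```python
import re


def substitute_positional(script_text, args):
    """Replace $1, $2, ... in script_text with the corresponding entry of args."""
    def replace(match):
        index = int(match.group(1))
        if index < 1 or index > len(args):
            # Leave unknown placeholders untouched rather than guessing.
            return match.group(0)
        return args[index - 1]
    return re.sub(r"\$(\d+)", replace, script_text)


script = "a = load '$1' as (a0, a1);\nstore a into '$2';"
print(substitute_positional(script, ["student.txt", "output"]))
```

With the command line from the example (`pig a.pig student.txt output`), `$1` becomes `student.txt` and `$2` becomes `output`.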

[jira] [Updated] (PIG-19) A=load causes parse error

2012-10-30 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-19?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-19:
--

Fix Version/s: (was: 0.11)
   0.12

 A=load causes parse error
 -

 Key: PIG-19
 URL: https://issues.apache.org/jira/browse/PIG-19
 Project: Pig
  Issue Type: Bug
  Components: grunt
Reporter: Olga Natkovich
Priority: Minor
 Fix For: 0.12


 Parser expects spaces around =. This should be a minor change in 
 src/org/apache/pig/tools/grunt/GruntParser.jj
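
To illustrate the intended behaviour, the Python sketch below accepts an assignment with or without spaces around `=`. The pattern and the function name `split_assignment` are hypothetical and are not taken from GruntParser.jj; the real fix would be a change to that grammar.

```python
import re

# Hypothetical pattern, not the GruntParser.jj grammar: alias, optional
# whitespace, '=', optional whitespace, then the rest of the statement.
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_]\w*)\s*=\s*(.+?)\s*;?\s*$")


def split_assignment(statement):
    """Return (alias, expression), or None if the statement is not an assignment."""
    match = ASSIGNMENT.match(statement)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_assignment("A=load 'data';"))
print(split_assignment("A = load 'data';"))
```

Both calls yield the same alias and expression, which is the behaviour the report asks for.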

[jira] [Commented] (PIG-2522) deprecated hdfs pig commands do not work well with client side tables

2012-10-30 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486959#comment-13486959
 ] 

Olga Natkovich commented on PIG-2522:
-

Araceli, can you provide details:

- which commands do not work and what errors you were seeing?

 deprecated hdfs pig commands do not work well with client side tables
 -

 Key: PIG-2522
 URL: https://issues.apache.org/jira/browse/PIG-2522
 Project: Pig
  Issue Type: Bug
Reporter: Araceli Henley
Priority: Trivial
 Fix For: 0.11


 I'm mostly entering this Jira to make you aware that the deprecated Pig APIs 
 to access HDFS (typically through Grunt) do not work consistently with 
 federation.
 The Hadoop references supported in Grunt do work and can be used.
 It should at a minimum be noted in the documentation that the deprecated 
 APIs do not work with client side tables.

[jira] [Assigned] (PIG-2522) deprecated hdfs pig commands do not work well with client side tables

2012-10-30 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich reassigned PIG-2522:
---

Assignee: Rohini Palaniswamy

Rohini, could you check whether this is going to be an issue with federation?

I am mostly concerned about commands like cd, for which we do not have an 
equivalent in the fs command. Thanks!

 deprecated hdfs pig commands do not work well with client side tables
 -

 Key: PIG-2522
 URL: https://issues.apache.org/jira/browse/PIG-2522
 Project: Pig
  Issue Type: Bug
Reporter: Araceli Henley
Assignee: Rohini Palaniswamy
Priority: Trivial
 Fix For: 0.11


 I'm mostly entering this Jira to make you aware that the deprecated Pig APIs 
 to access HDFS (typically through Grunt) do not work consistently with 
 federation.
 The Hadoop references supported in Grunt do work and can be used.
 It should at a minimum be noted in the documentation that the deprecated 
 APIs do not work with client side tables.

[jira] [Updated] (PIG-2834) MultiStorage requires unused constructor argument

2012-10-30 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2834:


Fix Version/s: (was: 0.11)
   0.12

 MultiStorage requires unused constructor argument
 -

 Key: PIG-2834
 URL: https://issues.apache.org/jira/browse/PIG-2834
 Project: Pig
  Issue Type: Improvement
  Components: data
Affects Versions: 0.10.0, 0.11
 Environment: Linux
Reporter: Danny Antonetti
Priority: Trivial
  Labels: newbie
 Fix For: 0.12

 Attachments: MultiStorage.patch


 Each constructor in
 org.apache.pig.piggybank.storage.MultiStorage
 requires a constructor argument, 'parentPathStr', that has no meaningful usage.

Re: Pig 0.11

2012-10-30 Thread Olga Natkovich

We are down to 45 tickets. Thanks to everybody who helped with the cleanup. We 
only have a couple of unassigned ones in the area of documentation and testing. Now 
we need to go through the assigned ones and see what can be done for 0.11.
 
Here is a list of people with many tickets. Please review what you are planning 
to complete in the next couple of weeks and unlink the rest.
 
Daniel - 15+ 
John Gordon - 6
Jonathan Coveney - 6
 
Thanks,
 
Olga


From: Olga Natkovich onatkov...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org 
Sent: Friday, October 26, 2012 4:32 PM
Subject: Re: Pig 0.11

74 issues are still open and more than half are unassigned. I think we should narrow 
the list down next week. I am planning to start unlinking the unassigned ones next 
week, so if you feel they need to be addressed, please find an owner.

Olga



- Original Message -
From: Olga Natkovich onatkov...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org
Cc: 
Sent: Monday, October 22, 2012 10:14 AM
Subject: Re: Pig 0.11

There are still 76 unresolved JIRAs, more than half of them unassigned. Let's clean this 
up by the end of this week. I propose we do the following:
 
(1) Unlink all JIRAs for new features: since we have already branched, we should 
not be taking on new work. If people feel strongly that some new features still 
need to go in, please bring it up.
(2) For bug fixes, if people feel strongly that some of the unassigned issues 
need to be addressed, please take ownership. If you are unable to solve them but 
still feel they are important, please bring them up.
(3) Owners of unresolved issues, please check whether you will have time to 
solve them in the next 2 weeks. If not, let's move them to 0.12. If you can't 
address them but feel they are important, please bring it up.
 
Let's make sure that all JIRAs that require changes to the documentation have 
appropriate information in the release notes section so that we can quickly 
compile release documentation.
 
Thanks for your help!
 
Olga





From: Alan Gates ga...@hortonworks.com
To: dev@pig.apache.org 
Sent: Monday, October 15, 2012 11:55 AM
Subject: Re: Pig 0.11

At this point no one has taken on release documentation for 0.11.

Alan.

On Oct 15, 2012, at 11:49 AM, Olga Natkovich wrote:

 Thanks!
  
 Are you talking about items 15 and 16 on the How To Release.Publish  page? 
  
 Also, who is doing release documentation these days? I can help with that as 
 well. I would also be happy to roll the release if you guys need help with 
 that.
  
 Olga
 
 
 
 From: Dmitriy Ryaboy dvrya...@gmail.com
 To: dev@pig.apache.org dev@pig.apache.org 
 Cc: dev@pig.apache.org dev@pig.apache.org 
 Sent: Friday, October 12, 2012 5:59 PM
 Subject: Re: Pig 0.11
 
 Thanks Olga and welcome back! 
 I know there's some process for linking jiras to releases, but I'm not sure 
 what that is. If you could explain and maybe cover a portion of that work, 
 that'd be super helpful. And reviews, of course. 
 
 On Oct 12, 2012, at 2:06 PM, Olga Natkovich onatkov...@yahoo.com wrote:
 
 Dmitriy, I would be happy to help with the release process. Want to get back 
 into this now that I am back at work. Let me know what you would like me to 
 do.
  
 Olga
 
 
 
 
 From: Dmitriy Ryaboy dvrya...@gmail.com
 To: dev@pig.apache.org 
 Cc: billgra...@gmail.com 
 Sent: Thursday, October 11, 2012 2:44 PM
 Subject: Re: Pig 0.11
 
 Ok I will branch 0.11 tomorrow morning unless someone objects.
 From then on, committers should be careful to commit bug fixes to both
 0.11 branch and trunk; minor polish can go into the branch, but whole
 new features should not (we can discuss on the list if something is in
 the gray area).
 
 D
 
 On Thu, Oct 11, 2012 at 2:16 PM, Gianmarco De Francisci Morales
 g...@apache.org wrote:
 I added it as a dependency as it already has its own Jira.
 I hope it is OK.
 
 Cheers,
 --
 Gianmarco
 
 
 
 On Wed, Oct 10, 2012 at 11:23 PM, Bill Graham billgra...@gmail.com wrote:
 
 +1 for me.
 
 There's https://issues.apache.org/jira/browse/PIG-2756 which tracks a few
 documentation issues that should block Pig 0.11, but they can also be done
 on the trunk and merged to the branch. Gianmarco, you can add a rank
 subtask there to serve as a reminder.
 
 
 On Wed, Oct 10, 2012 at 11:03 PM, Gianmarco De Francisci Morales 
 g...@apache.org wrote:
 
 We are missing some documentation on RANK, but I guess we could add that
 to the branch and trunk in parallel.
 All the patches I was keeping an eye on are in.
 
 So +1 for me.
 --
 Gianmarco
 
 
 
 On Wed, Oct 10, 2012 at 5:31 PM, Jonathan Coveney jcove...@gmail.com
 wrote:
 
 I think all of the major patches are in, no? Now it's just bug testing?
 Just wanted to touch base on where we are at with this.
 
 
 
 
 
 --
 *Note that I'm no longer using my Yahoo! email address. Please email me at
 billgra...@gmail.com going forward.*

[jira] [Updated] (PIG-2461) Simplify schema syntax for cast

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2461:


Fix Version/s: (was: 0.11)

 Simplify schema syntax for cast
 ---

 Key: PIG-2461
 URL: https://issues.apache.org/jira/browse/PIG-2461
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.10.0
Reporter: Daniel Dai
 Fix For: 0.12


 Cast into a bag/tuple syntax is confusing:
 {code}
 b = foreach a generate (bag{tuple(int,double)})bag0;
 {code}
 It's pretty hard for users to get it right. We should make the keywords 
 bag/tuple optional.

[jira] [Updated] (PIG-2625) Allow use of JRuby for control flow

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2625:


Fix Version/s: (was: 0.11)

 Allow use of JRuby for control flow
 ---

 Key: PIG-2625
 URL: https://issues.apache.org/jira/browse/PIG-2625
 Project: Pig
  Issue Type: New Feature
Reporter: Jonathan Coveney
 Fix For: 0.12


 Much like people can use jython for iterative computation, it'd be great to 
 use JRuby for the same

[jira] [Updated] (PIG-2625) Allow use of JRuby for control flow

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2625:


Fix Version/s: 0.12

Moving to 12 since this is an improvement with no activity at this point

 Allow use of JRuby for control flow
 ---

 Key: PIG-2625
 URL: https://issues.apache.org/jira/browse/PIG-2625
 Project: Pig
  Issue Type: New Feature
Reporter: Jonathan Coveney
 Fix For: 0.12


 Much like people can use jython for iterative computation, it'd be great to 
 use JRuby for the same

[jira] [Updated] (PIG-2628) Allow in line scripting UDF definitions

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2628:


Fix Version/s: (was: 0.11)
   0.12

Moving to 0.12 since it is an improvement with no work done yet

 Allow in line scripting UDF definitions
 ---

 Key: PIG-2628
 URL: https://issues.apache.org/jira/browse/PIG-2628
 Project: Pig
  Issue Type: Improvement
Reporter: Jonathan Coveney
 Fix For: 0.12


 For small udfs in scripting languages, it may be cumbersome to force users to 
 make a script, put it on the classpath, ship it, etc. It would be great to 
 support a syntax that allows people to declare UDFs in line (essentially, to 
 define a snippet of code that will be interpreted as a scriptlet)

[jira] [Updated] (PIG-2624) Handle recursive inclusion of scripts in JRuby UDFs

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2624:


Fix Version/s: (was: 0.10.1)
   (was: 0.11)
   0.12

Moving to 0.12 since there is no work done and no person assigned to date

 Handle recursive inclusion of scripts in JRuby UDFs
 ---

 Key: PIG-2624
 URL: https://issues.apache.org/jira/browse/PIG-2624
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.10.0, 0.11
Reporter: Jonathan Coveney
  Labels: JRuby
 Fix For: 0.12


 Currently, if you have a script which requires another script, the 
 dependency won't be properly handled.

[jira] [Updated] (PIG-2631) Pig should allow self joins

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2631:


Fix Version/s: 0.12

Moving to 12 since no work has been done and the ticket is unassigned

 Pig should allow self joins
 ---

 Key: PIG-2631
 URL: https://issues.apache.org/jira/browse/PIG-2631
 Project: Pig
  Issue Type: Improvement
Reporter: Jonathan Coveney
 Fix For: 0.11, 0.12


 This doesn't even have to be optimized, and can still involve a double scan 
 of the data, but there is no reason why the following works:
 {code}
 a = load 'thing' as (x:int);
 b = join a by x, (foreach a generate *) by x;
 {code}
 but this does not:
 {code}
 a = load 'thing' as (x:int);
 b = join a by x, a by x;
 {code}

[jira] [Updated] (PIG-2631) Pig should allow self joins

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2631:


Fix Version/s: (was: 0.11)

 Pig should allow self joins
 ---

 Key: PIG-2631
 URL: https://issues.apache.org/jira/browse/PIG-2631
 Project: Pig
  Issue Type: Improvement
Reporter: Jonathan Coveney
 Fix For: 0.12


 This doesn't even have to be optimized, and can still involve a double scan 
 of the data, but there is no reason why the following works:
 {code}
 a = load 'thing' as (x:int);
 b = join a by x, (foreach a generate *) by x;
 {code}
 but this does not:
 {code}
 a = load 'thing' as (x:int);
 b = join a by x, a by x;
 {code}

[jira] [Assigned] (PIG-2641) Create toJSON function for all complex types: tuples, bags and maps

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich reassigned PIG-2641:
---

Assignee: Russell Jurney

Hi Russell,

Is this going to be done in the next couple of weeks? If not, should we move it 
to 0.12?

 Create toJSON function for all complex types: tuples, bags and maps
 ---

 Key: PIG-2641
 URL: https://issues.apache.org/jira/browse/PIG-2641
 Project: Pig
  Issue Type: New Feature
  Components: piggybank
Affects Versions: 0.11, 0.10.1
 Environment: Foggy. Damn foggy.
Reporter: Russell Jurney
Assignee: Russell Jurney
  Labels: chararray, fun, happy, input, json, output, pants, pig, 
 piggybank, string, wonderdog
 Fix For: 0.11, 0.10.1

   Original Estimate: 96h
  Remaining Estimate: 96h

 It is a travesty that there are no UDFs in Piggybank that, given an 
 arbitrary Pig datatype, return a JSON string of same. I intend to fix this 
 problem.
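
As a rough picture of the mapping such a UDF would perform, the Python sketch below models Pig's complex types with Python builtins: a tuple as a Python tuple, a bag as a list of tuples, and a map as a dict. The function name `to_json` and this modelling are assumptions for illustration; the actual UDF would be written against Pig's own data types.

```python
import json


def to_json(value):
    """Serialize a Pig-like value (modelled with Python builtins) to a JSON string."""
    # tuple -> JSON array, bag (list of tuples) -> array of arrays, map (dict) -> object.
    return json.dumps(value, sort_keys=True)


record = ("alice", 20, [("math", 3.9), ("art", 3.5)], {"city": "SF"})
print(to_json(record))
```

Tuples and bags both become JSON arrays here, so the distinction between them is lost in the output; whether that is acceptable is a design question for the real UDF.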

[jira] [Updated] (PIG-2591) Unit tests should not write to /tmp but respect java.io.tmpdir

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2591:


Fix Version/s: (was: 0.11)
   0.12

Moving to 12 since no work has been done and the ticket is unassigned

 Unit tests should not write to /tmp but respect java.io.tmpdir
 --

 Key: PIG-2591
 URL: https://issues.apache.org/jira/browse/PIG-2591
 Project: Pig
  Issue Type: Bug
  Components: tools
Reporter: Thomas Weise
 Fix For: 0.12


 Several tests use /tmp but should derive temporary file location from 
 java.io.tmpdir to avoid side effects (java.io.tmpdir is already set to a test 
 run specific location in build.xml)
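
The same principle, shown as a Python analogue rather than in the Java of Pig's test suite: derive scratch paths from the configured temporary directory instead of a literal `/tmp`. The helper name `scratch_path` is hypothetical; in Java the corresponding lookup would read the `java.io.tmpdir` system property.

```python
import os
import tempfile


def scratch_path(name):
    """Return a path for `name` under the configured temporary directory."""
    # tempfile.gettempdir() honours TMPDIR/TEMP/TMP and tempfile.tempdir,
    # so a test harness can redirect all scratch files in one place.
    return os.path.join(tempfile.gettempdir(), name)


print(scratch_path("pig_test_output.txt"))
```

Because every test asks the same function for its location, pointing the whole run at a run-specific directory needs no change to the tests themselves.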

[jira] [Commented] (PIG-1919) order-by on bag gives error only at runtime

2012-10-29 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486524#comment-13486524
 ] 

Olga Natkovich commented on PIG-1919:
-

Jonathan, should this be assigned to you? Is this going to be finished for 0.11, 
or should it be moved to 0.12?

 order-by on bag gives error only at runtime
 ---

 Key: PIG-1919
 URL: https://issues.apache.org/jira/browse/PIG-1919
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0, 0.9.0
Reporter: Thejas M Nair
 Fix For: 0.11, 0.10.1

 Attachments: PIG-1919-0.patch, PIG-1919-1.patch, PIG-1919-1.patch


 Order-by on a bag or tuple should give error at query compile time, instead 
 of giving an error at runtime.
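
The difference between a compile-time and a runtime failure can be sketched as a schema check that runs before any data is read. The Python below is a hypothetical model only: the schema is a plain dict of field name to type name, and `check_order_by` is an invented helper, not part of Pig's front end.

```python
COMPLEX_TYPES = {"bag", "tuple"}


def check_order_by(schema, key):
    """Raise at plan-validation time if `key` names a bag or tuple field."""
    field_type = schema[key]
    if field_type in COMPLEX_TYPES:
        raise TypeError("cannot ORDER BY %s: field has type %s" % (key, field_type))
    return key


schema = {"name": "chararray", "scores": "bag"}
print(check_order_by(schema, "name"))
```

Calling `check_order_by(schema, "scores")` raises immediately, before any job would be launched, which is the behaviour the report asks for.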

[jira] [Commented] (PIG-2423) document use case where co-group is better choice than join

2012-10-29 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486527#comment-13486527
 ] 

Olga Natkovich commented on PIG-2423:
-

Thejas, should this be assigned to you? Is this going to go into 0.11 or 0.12?

 document use case where co-group is better choice than join 
 

 Key: PIG-2423
 URL: https://issues.apache.org/jira/browse/PIG-2423
 Project: Pig
  Issue Type: Improvement
  Components: documentation
Reporter: Thejas M Nair
 Fix For: 0.11


 Optimization rules 2 and 3 suggested in 
 https://issues.apache.org/jira/secure/attachment/12506841/pig_tpch.ppt 
 (PIG-2397) recommend the use of co-group instead of join in certain cases. 
 These should be documented on the Pig performance page.

[jira] [Updated] (PIG-2595) BinCond only works inside parentheses

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2595:


Fix Version/s: (was: 0.11)
   0.12

Moving to 0.12 since no work has been done and the ticket is unassigned.

 BinCond only works inside parentheses
 -

 Key: PIG-2595
 URL: https://issues.apache.org/jira/browse/PIG-2595
 Project: Pig
  Issue Type: Bug
Reporter: Daniel Dai
 Fix For: 0.12


 Not sure if we have a Jira for this before. This script does not work:
 {code}
 a = load '/user/pig/tests/data/singlefile/studenttab10k' using PigStorage() 
 as (name, age:int, gpa:double, instate:chararray);
 b = foreach a generate name, instate=='true'?gpa:gpa+1;
 dump b;
 {code}
 If we put the bincond in parentheses, it works:
 {code}
 a = load '/user/pig/tests/data/singlefile/studenttab10k' using PigStorage() 
 as (name, age:int, gpa:double, instate:chararray);
 b = foreach a generate name, (instate=='true'?gpa:gpa+1);
 dump b;
 {code}
 Exception:
 ERROR 1200: file 40.pig, line 2, column 36  mismatched input '==' expecting 
 SEMI_COLON
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during 
 parsing. file 40.pig, line 2, column 36  mismatched input '==' expecting 
 SEMI_COLON
 at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1598)
 at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1541)
 at org.apache.pig.PigServer.registerQuery(PigServer.java:541)
 at 
 org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:945)
 at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:392)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:190)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
 at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
 at org.apache.pig.Main.run(Main.java:599)
 at org.apache.pig.Main.main(Main.java:153)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Caused by: Failed to parse: file 40.pig, line 2, column 36  mismatched 
 input '==' expecting SEMI_COLON
 at 
 org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:226)
 at 
 org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:168)
 at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1590)
 ... 14 more


[jira] [Commented] (PIG-2434) investigate 5% slowdown in TPC-H Q6 query in 0.10

2012-10-29 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486632#comment-13486632
 ] 

Olga Natkovich commented on PIG-2434:
-

Thejas, any plan to address this for 0.11?

 investigate 5% slowdown in TPC-H Q6 query in 0.10
 -

 Key: PIG-2434
 URL: https://issues.apache.org/jira/browse/PIG-2434
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Thejas M Nair
 Fix For: 0.11


 0.10 is slower than 0.9 by around 5% for TPC-H Q6 query as per observation in 
 https://issues.apache.org/jira/browse/PIG-2228?focusedCommentId=13171461&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13171461
  .
 This needs to be investigated.


[jira] [Assigned] (PIG-2812) Spill InternalCachedBag into only 1 file

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich reassigned PIG-2812:
---

Assignee: Haitao Yao

 Spill InternalCachedBag into only 1 file
 

 Key: PIG-2812
 URL: https://issues.apache.org/jira/browse/PIG-2812
 Project: Pig
  Issue Type: Bug
  Components: data
Reporter: Haitao Yao
Assignee: Haitao Yao
 Fix For: 0.11

 Attachments: aa.jpg, spill.patch


 I encountered a reducer's OOM because of java.io.DeleteOnExitHook. And I 
 found out that the InternalCachedBag creates a separate tmp file, and the tmp 
 files are deleted on exit. So the file delete hook caused the OOM. 
 Why not just hold the tmp file handle and spill only one tmp file?
 Too many tmp files may block the tasktracker start process, if the tmp files 
 are not cleaned on time and the tasktracker restarts at this specific time.
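
 The "hold one tmp file handle" idea can be sketched outside Pig. The Python
 below is only an illustration under stated assumptions: the class name
 SingleFileSpillBag and its max_in_memory parameter are hypothetical, and it
 is not the attached spill.patch nor the Java InternalCachedBag. It shows a
 bag that appends every spill to the same temporary file instead of creating
 a new file per spill.

```python
import pickle
import tempfile


class SingleFileSpillBag:
    """Hypothetical bag that spills to one temporary file it keeps open."""

    def __init__(self, max_in_memory=2):
        self.max_in_memory = max_in_memory
        self.memory = []          # tuples currently held in RAM
        self.spill_file = None    # created lazily, reused for every spill
        self.spilled = 0          # number of tuples written to the file

    def add(self, item):
        if len(self.memory) >= self.max_in_memory:
            self._spill()
        self.memory.append(item)

    def _spill(self):
        if self.spill_file is None:
            self.spill_file = tempfile.TemporaryFile()
        self.spill_file.seek(0, 2)  # always append to the end of the same file
        for item in self.memory:
            pickle.dump(item, self.spill_file)
        self.spilled += len(self.memory)
        self.memory = []

    def __iter__(self):
        # Read back spilled tuples first, then the ones still in memory.
        if self.spill_file is not None:
            self.spill_file.seek(0)
            for _ in range(self.spilled):
                yield pickle.load(self.spill_file)
        yield from self.memory

    def close(self):
        if self.spill_file is not None:
            self.spill_file.close()  # TemporaryFile is removed when closed
            self.spill_file = None


bag = SingleFileSpillBag(max_in_memory=2)
for i in range(5):
    bag.add((i,))
contents = list(bag)
bag.close()
print(contents)
```

 With one handle per bag, cleanup happens when the bag is closed, so nothing
 has to be registered for deletion at exit.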


[jira] [Commented] (PIG-2812) Spill InternalCachedBag into only 1 file

2012-10-29 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486634#comment-13486634
 ] 

Olga Natkovich commented on PIG-2812:
-

Alan - are you planning to review this one? Do we need to include this in 0.11?

 Spill InternalCachedBag into only 1 file
 

 Key: PIG-2812
 URL: https://issues.apache.org/jira/browse/PIG-2812
 Project: Pig
  Issue Type: Bug
  Components: data
Reporter: Haitao Yao
Assignee: Haitao Yao
 Fix For: 0.11

 Attachments: aa.jpg, spill.patch


 I encountered a reducer's OOM because of java.io.DeleteOnExitHook. And I 
 found out that the InternalCachedBag creates a separate tmp file, and the tmp 
 files are deleted on exit. So the file delete hook caused the OOM. 
 Why not just hold the tmp file handle and spill only one tmp file?
 Too many tmp files may block the tasktracker start process, if the tmp files 
 are not cleaned on time and the tasktracker restarts at this specific time.


[jira] [Updated] (PIG-2681) TestDriverPig.countStores() does not correctly count the number of stores for pig scripts using variables for the alias

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2681:


Fix Version/s: (was: 0.10.1)
   (was: 0.9.3)
   (was: 0.11)
   0.12

 TestDriverPig.countStores() does not correctly count the number of stores for 
 pig scripts using variables for the alias
 ---

 Key: PIG-2681
 URL: https://issues.apache.org/jira/browse/PIG-2681
 Project: Pig
  Issue Type: Test
  Components: e2e harness
Affects Versions: 0.9.0, 0.9.1, 0.9.2, 0.10.0
Reporter: Araceli Henley
 Fix For: 0.12

 Attachments: PIG-2681.patch


 For pig macros where the out parameter is referenced in a store statement, 
 TestDriverPig.countStores() does not correctly count the number of stores.
 For example, the store will not be counted in:
 define myMacro(in1,in2) returns A {
  A  = load '$in1' using PigStorage('$delimeter') as (intnum1000: int,id: 
 int,intnum5: int,intnum100: int,intnum: int,longnum: long,floatnum: 
 float,doublenum: double);
store $A into '$out';
 }
  countStores() matches with:
  $count += $q[$i] =~ /store\s+[a-zA-Z][a-zA-Z0-9_]*\s+into/i;
 Since the alias has a special character $ it doesn't count it and the test 
 fails.
 Need to change this to:
$count += $q[$i] =~ /store\s+(\$)?[a-zA-Z][a-zA-Z0-9_]*\s+into/i;
 I'll submit a patch shortly.
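
 The difference between the two patterns can be checked outside the harness.
 The sketch below ports both regular expressions from Perl to Python's re
 module. The port, the helper name count_stores and the three sample lines
 are illustrative assumptions, not part of the e2e harness.

```python
import re

# Pattern currently used by countStores(): the alias must start with a letter.
OLD_STORE = re.compile(r"store\s+[a-zA-Z][a-zA-Z0-9_]*\s+into", re.IGNORECASE)
# Proposed pattern: allow an optional leading '$' for macro parameters.
NEW_STORE = re.compile(r"store\s+(\$)?[a-zA-Z][a-zA-Z0-9_]*\s+into", re.IGNORECASE)


def count_stores(lines, pattern):
    """Count the lines that contain a store statement (hypothetical helper)."""
    return sum(1 for line in lines if pattern.search(line))


script = [
    "A = load '$in1' using PigStorage();",
    "store $A into '$out';",
    "store B into 'other';",
]

print(count_stores(script, OLD_STORE))  # 1: the '$A' store is missed
print(count_stores(script, NEW_STORE))  # 2: both stores are counted
```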


[jira] [Commented] (PIG-2981) add e2e tests for DateTime data type

2012-10-29 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486636#comment-13486636
 ] 

Olga Natkovich commented on PIG-2981:
-

Is anybody planning to add this or should it be moved to 0.12?

 add e2e tests for DateTime  data type
 -

 Key: PIG-2981
 URL: https://issues.apache.org/jira/browse/PIG-2981
 Project: Pig
  Issue Type: Test
Reporter: Thejas M Nair
 Fix For: 0.11


 e2e tests for DateTime datatype need to be added.


[jira] [Resolved] (PIG-2974) StreamingLocal_11 e2e test hangs

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich resolved PIG-2974.
-

Resolution: Duplicate

 StreamingLocal_11 e2e test hangs
 

 Key: PIG-2974
 URL: https://issues.apache.org/jira/browse/PIG-2974
 Project: Pig
  Issue Type: Sub-task
Affects Versions: 0.11
Reporter: Rohini Palaniswamy
 Fix For: 0.11





[jira] [Updated] (PIG-2630) Issue with setting b = a;

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2630:


Fix Version/s: (was: 0.10.1)
   (was: 0.11)
   0.12

Moving to 0.12 as no work has been done so far

 Issue with setting b = a;
 ---

 Key: PIG-2630
 URL: https://issues.apache.org/jira/browse/PIG-2630
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.10.0, 0.11
Reporter: Jonathan Coveney
 Fix For: 0.12


 The following gives an error:
 {code}
 a = load 'thing' as (x:int);
 b = a; c = join a by x, b by x;
 {code}
 Error:
 {code}
 2012-04-03 14:02:47,434 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1200: Pig script failed to parse: 
 line 14, column 4 pig script failed to validate: 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2225: Projection 
 with nothing to reference!
 {code}
 No issue with the following, however
 {code}
 a = load 'thing' as (x:int);
 b = foreach a generate *;
 c = join a by x, b by x;
 {code}
 oh and here is the log:
 {code}
 $ cat pig_1333487146863.log
 Pig Stack Trace
 ---
 ERROR 1200: Pig script failed to parse: 
 line 3, column 4 pig script failed to validate: 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2225: Projection 
 with nothing to reference!
 Failed to parse: Pig script failed to parse: 
 line 3, column 4 pig script failed to validate: 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2225: Projection 
 with nothing to reference!
   at 
 org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:182)
   at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1566)
   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1539)
   at org.apache.pig.PigServer.registerQuery(PigServer.java:541)
   at 
 org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:945)
   at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:392)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:190)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
   at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
   at org.apache.pig.Main.run(Main.java:535)
   at org.apache.pig.Main.main(Main.java:153)
 Caused by: 
 line 3, column 4 pig script failed to validate: 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2225: Projection 
 with nothing to reference!
   at 
 org.apache.pig.parser.LogicalPlanBuilder.buildJoinOp(LogicalPlanBuilder.java:363)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.join_clause(LogicalPlanGenerator.java:11441)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1491)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:791)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:509)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:384)
   at 
 org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:175)
   ... 10 more
 
 {code}


[jira] [Updated] (PIG-3005) TestLargeFile#testOrderBy is failing

2012-10-29 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-3005:


Affects Version/s: (was: 0.12)
   (was: 0.11)
Fix Version/s: (was: 0.11)

 TestLargeFile#testOrderBy is failing
 

 Key: PIG-3005
 URL: https://issues.apache.org/jira/browse/PIG-3005
 Project: Pig
  Issue Type: Sub-task
 Environment: Mac OSX 10.6.8
Reporter: Jonathan Coveney
 Fix For: 0.12


 When run locally, at least, this test is failing for me.
 Has anyone else noticed this failing?


Re: [ANNOUNCE] Welcome new Apache Pig Committers Rohini Palaniswamy

2012-10-27 Thread Olga Natkovich

Congrats, Rohini!

- Original Message -
From: Daniel Dai da...@hortonworks.com
To: dev@pig.apache.org; u...@pig.apache.org
Cc: 
Sent: Friday, October 26, 2012 4:37 PM
Subject: [ANNOUNCE] Welcome new Apache Pig Committers Rohini Palaniswamy

Here is another Pig committer announcement today. Please welcome
Rohini Palaniswamy to be a Pig committer!

Thanks,
Daniel

Re: Pig 0.11

2012-10-26 Thread Olga Natkovich

74 issues are still open and more than half are unassigned. I think we should 
narrow the list down next week. I am planning to start unlinking the unassigned 
ones next week, so if you feel they need to be addressed, please find an owner.

Olga



- Original Message -
From: Olga Natkovich onatkov...@yahoo.com
To: dev@pig.apache.org dev@pig.apache.org
Cc: 
Sent: Monday, October 22, 2012 10:14 AM
Subject: Re: Pig 0.11

There are still 76 unresolved JIRAs, more than half of them unassigned. Let's 
clean this up by the end of this week. I propose we do the following:
 
(1) Unlink all JIRAs for new features since we already branched so we should 
not be taking on new work. If people feel strongly that some new features still 
need to go in, please bring it up.
(2) For bug fixes, if people feel strongly that some of the unassigned issues 
need to be addressed, please take ownership. If you are unable to solve them but 
still feel they are important, please bring them up.
(3) Owners of unresolved issues, please check whether you will have time to 
solve them in the next 2 weeks. If not, let's move them to 0.12. If you can't 
address them but feel they are important, please bring it up.
 
Let's make sure that all JIRAs that require changes to the documentation have 
appropriate information in the release notes section so that we can quickly 
compile release documentation.
 
Thanks for your help!
 
Olga





From: Alan Gates ga...@hortonworks.com
To: dev@pig.apache.org 
Sent: Monday, October 15, 2012 11:55 AM
Subject: Re: Pig 0.11

At this point no one has taken on release documentation for 0.11.

Alan.

On Oct 15, 2012, at 11:49 AM, Olga Natkovich wrote:

 Thanks!
  
 Are you talking about items 15 and 16 on the How To Release.Publish  page? 
  
 Also, who is doing release documentation these days? I can help with that as 
 well. I would also be happy to roll the release if you guys need help with 
 that.
  
 Olga
 
 
 
 From: Dmitriy Ryaboy dvrya...@gmail.com
 To: dev@pig.apache.org dev@pig.apache.org 
 Cc: dev@pig.apache.org dev@pig.apache.org 
 Sent: Friday, October 12, 2012 5:59 PM
 Subject: Re: Pig 0.11
 
 Thanks Olga and welcome back! 
 I know there's some process for linking jiras to releases, but I'm not sure 
 what that is. If you could explain and maybe cover a portion of that work, 
 that'd be super helpful. And reviews, of course. 
 
 On Oct 12, 2012, at 2:06 PM, Olga Natkovich onatkov...@yahoo.com wrote:
 
 Dmitry, I would be happy to help with the release process. Want to get back 
 into this now that I am back at work. Let me know what you would like me to 
 do.
  
 Olga
 
 
 
 
 From: Dmitriy Ryaboy dvrya...@gmail.com
 To: dev@pig.apache.org 
 Cc: billgra...@gmail.com 
 Sent: Thursday, October 11, 2012 2:44 PM
 Subject: Re: Pig 0.11
 
 Ok I will branch 0.11 tomorrow morning unless someone objects.
 From then on, committers should be careful to commit bug fixes to both
 0.11 branch and trunk; minor polish can go into the branch, but whole
 new features should not (we can discuss on the list if something is in
 the gray area).
 
 D
 
 On Thu, Oct 11, 2012 at 2:16 PM, Gianmarco De Francisci Morales
 g...@apache.org wrote:
 I added it as a dependency as it has already its own Jira.
 I hope it is OK.
 
 Cheers,
 --
 Gianmarco
 
 
 
 On Wed, Oct 10, 2012 at 11:23 PM, Bill Graham billgra...@gmail.com wrote:
 
 +1 for me.
 
 There's https://issues.apache.org/jira/browse/PIG-2756 which tracks a few
 documentation issues that should block Pig 0.11, but they can also be done
 on the trunk and merged to the branch. Gianmarco, you can add a rank
 subtask there to serve as a reminder.
 
 
 On Wed, Oct 10, 2012 at 11:03 PM, Gianmarco De Francisci Morales 
 g...@apache.org wrote:
 
 We are missing some documentation on the RANK but I guess we could add
 that
 to the branch and trunk in parallel.
 All the patches I was keeping an eye on are in.
 
 So +1 for me.
 --
 Gianmarco
 
 
 
 On Wed, Oct 10, 2012 at 5:31 PM, Jonathan Coveney jcove...@gmail.com
 wrote:
 
 I think all of the major patches are in, no? Now it's just bug testing?
 Just wanted to touch base on where we are at with this.
 
 
 
 
 
 --
 *Note that I'm no longer using my Yahoo! email address. Please email me at
 billgra...@gmail.com going forward.*

[jira] [Commented] (PIG-2328) Add builtin UDFs for building and using bloom filters

2012-10-23 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482451#comment-13482451
 ] 

Olga Natkovich commented on PIG-2328:
-

These are in builtins, at least according to the patch, so they need to be in 
the docs. I will create a doc patch; I just was not sure if it was in a 
different place.

 Add builtin UDFs for building and using bloom filters
 -

 Key: PIG-2328
 URL: https://issues.apache.org/jira/browse/PIG-2328
 Project: Pig
  Issue Type: New Feature
  Components: internal-udfs
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.10.0, 0.11

 Attachments: PIG-bloom-2.patch, PIG-bloom-3.patch, PIG-bloom.patch


 Bloom filters are a common way to select a limited set of records before 
 moving data for a join or other heavy weight operation.  Pig should add UDFs 
 to support building and using bloom filters.
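
 As a rough illustration of the technique (not the UDFs from the attached
 patches), the Python sketch below builds a Bloom filter from the keys of a
 small relation and uses it to prefilter a large relation before a join. The
 class name BloomFilter, its num_bits and num_hashes parameters, and the
 sample data are assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


small = [(1, "a"), (2, "b")]
large = [(k, f"row{k}") for k in range(1000)]

# Build the filter from the join keys of the small side.
bloom = BloomFilter()
for key, _ in small:
    bloom.add(key)

# Drop most non-matching rows of the large side before the expensive join.
filtered = [row for row in large if bloom.might_contain(row[0])]

# The join itself still checks exact keys, so false positives are harmless.
small_by_key = dict(small)
joined = [(k, small_by_key[k], v) for k, v in filtered if k in small_by_key]
print(joined)
```

 The filter only reduces how much data reaches the join; correctness still
 comes from the exact key comparison afterwards.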


Re: Pig 0.11

2012-10-22 Thread Olga Natkovich

There are still 76 unresolved JIRAs, more than half of them unassigned. Let's 
clean this up by the end of this week. I propose we do the following:
 
(1) Unlink all JIRAs for new features since we already branched so we should 
not be taking on new work. If people feel strongly that some new features still 
need to go in, please bring it up.
(2) For bug fixes, if people feel strongly that some of the unassigned issues 
need to be addressed, please take ownership. If you are unable to solve them but 
still feel they are important, please bring them up.
(3) Owners of unresolved issues, please check whether you will have time to 
solve them in the next 2 weeks. If not, let's move them to 0.12. If you can't 
address them but feel they are important, please bring it up.
 
Let's make sure that all JIRAs that require changes to the documentation have 
appropriate information in the release notes section so that we can quickly 
compile release documentation.
 
Thanks for your help!
 
Olga






[jira] [Commented] (PIG-2353) RANK function like in SQL

2012-10-22 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481537#comment-13481537
 ] 

Olga Natkovich commented on PIG-2353:
-

Can you please add a usage example to the release notes section? Thanks!

 RANK function like in SQL
 -

 Key: PIG-2353
 URL: https://issues.apache.org/jira/browse/PIG-2353
 Project: Pig
  Issue Type: New Feature
Reporter: Gianmarco De Francisci Morales
Assignee: Allan Avendaño
  Labels: gsoc2012, mentor
 Fix For: 0.11

 Attachments: PIG-2353-2, PIG-2353-3.txt, PIG-2353-4.txt, 
 PIG-2353-5.txt, PIG2353.patch


 Implement a function that given a (sorted) bag adds to each tuple a unique, 
 increasing identifier without gaps, like what RANK does for SQL.
 This is a candidate project for Google summer of code 2012. More information 
 about the program can be found at 
 https://cwiki.apache.org/confluence/display/PIG/GSoc2012
 Functionality implemented so far, is available at 
 https://reviews.apache.org/r/5523/diff/#index_header
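
 The behaviour described above (a unique, increasing identifier without gaps
 for each tuple of a sorted bag) can be sketched in a few lines of Python.
 This is an illustration only: the function name rank and its key parameter
 are assumptions, and the sketch does not address how the Pig operator
 handles ties or distributes the work.

```python
def rank(bag, key=None):
    """Prepend a 1-based, gap-free position to each tuple of a bag.

    If key is given the bag is sorted by it first; otherwise the bag is
    assumed to be sorted already.
    """
    ordered = sorted(bag, key=key) if key is not None else list(bag)
    return [(position,) + tuple(row) for position, row in enumerate(ordered, start=1)]


rows = [("carol", 3.1), ("alice", 3.9), ("bob", 3.5)]
ranked = rank(rows, key=lambda r: -r[1])  # highest gpa first
for row in ranked:
    print(row)
```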


[jira] [Commented] (PIG-2710) Implement Naive CUBE operator

2012-10-22 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481591#comment-13481591
 ] 

Olga Natkovich commented on PIG-2710:
-

Could you please add release notes, including syntax and examples, for 
inclusion in the documentation? Thanks!

 Implement Naive CUBE operator
 -

 Key: PIG-2710
 URL: https://issues.apache.org/jira/browse/PIG-2710
 Project: Pig
  Issue Type: Sub-task
Reporter: Dmitriy V. Ryaboy
Assignee: Prasanth J
 Fix For: 0.11

 Attachments: PIG-2710.1.patch


 The Naive CUBE operator is just syntactic sugar for the CubeDimensions UDFS 
 followed by a flatten+group-by.
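
 The "expand dimensions, flatten, then group by" idea can be sketched as
 follows. This Python is an illustration under assumptions: the function
 names cube_dimensions and naive_cube are hypothetical, and None is used
 here as the "all values" marker, which may differ from what the
 CubeDimensions UDF emits.

```python
from collections import defaultdict
from itertools import product


def cube_dimensions(dims):
    """Return every combination of dims with each value kept or replaced by None."""
    return [tuple(combo) for combo in product(*[(d, None) for d in dims])]


def naive_cube(rows, num_dims):
    """Group rows by every rolled-up combination of their first num_dims fields."""
    groups = defaultdict(list)
    for row in rows:
        # Expanding and flattening: one output key per combination.
        for key in cube_dimensions(row[:num_dims]):
            groups[key].append(row)  # group-by on the expanded key
    return dict(groups)


sales = [("us", "tv", 10), ("us", "radio", 5), ("uk", "tv", 7)]
totals = {key: sum(r[2] for r in rows) for key, rows in naive_cube(sales, 2).items()}
print(totals[("us", None)])    # 15
print(totals[(None, "tv")])    # 17
print(totals[(None, None)])    # 22
```

 Each input row is emitted 2^n times for n dimensions, which is why this
 form is called naive.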


[jira] [Commented] (PIG-2600) Better Map support

2012-10-22 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481596#comment-13481596
 ] 

Olga Natkovich commented on PIG-2600:
-

Can you please add to the release notes the UDFs that were added, as well as 
their syntax and usage examples? This is for inclusion in the documentation, thanks!

 Better Map support
 --

 Key: PIG-2600
 URL: https://issues.apache.org/jira/browse/PIG-2600
 Project: Pig
  Issue Type: Improvement
Reporter: Jonathan Coveney
Assignee: Prashant Kommireddi
 Fix For: 0.11

 Attachments: PIG-2600_2.patch, PIG-2600_3.patch, PIG-2600_4.patch, 
 PIG-2600_5.patch, PIG-2600_6.patch, PIG-2600_7.patch, PIG-2600_8.patch, 
 PIG-2600_9.patch, PIG-2600.patch


 It would be nice if Pig played better with Maps. To that end, I'd like to add 
 a lot of utility around Maps.
 - TOBAG should take a Map and output {(key, value)}
 - TOMAP should take a Bag in that same form and make a map.
 - KEYSET should return the set of keys.
 - VALUESET should return the set of values.
 - VALUELIST should return the List of values (no deduping).
 - INVERSEMAP would return a Map of values => the set of keys that refer to 
 that value
 This would all be pretty easy. A more substantial piece of work would be to 
 make Pig support non-String keys (this is especially an issue since UDFs and 
 whatnot probably assume that they are all Integers). Not sure if it is worth 
 it.
 I'd love to hear other things that would be useful for people!
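
 To make the proposed semantics concrete, here is a small Python sketch of
 the utilities listed above, using a dict for a Map and a list of tuples for
 a Bag. The function names mirror the proposed UDF names but the code is an
 illustrative assumption, not the implementation in the attached patches.

```python
def to_bag(m):
    """TOBAG: map -> bag of (key, value) tuples."""
    return list(m.items())


def to_map(bag):
    """TOMAP: bag of (key, value) tuples -> map."""
    return dict(bag)


def keyset(m):
    """KEYSET: the set of keys."""
    return set(m.keys())


def valueset(m):
    """VALUESET: the set of values (duplicates removed)."""
    return set(m.values())


def valuelist(m):
    """VALUELIST: the list of values (no deduping)."""
    return list(m.values())


def inverse_map(m):
    """INVERSEMAP: value -> set of keys that map to that value."""
    inverse = {}
    for key, value in m.items():
        inverse.setdefault(value, set()).add(key)
    return inverse


m = {"a": 1, "b": 2, "c": 1}
print(to_bag(m))
print(valuelist(m))
print(inverse_map(m))
```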


[jira] [Commented] (PIG-2710) Implement Naive CUBE operator

2012-10-22 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481603#comment-13481603
 ] 

Olga Natkovich commented on PIG-2710:
-

Hi Prasanth,

The release notes look great! Do they basically cover all work we have done in 
this release for CUBE related support?

 Implement Naive CUBE operator
 -

 Key: PIG-2710
 URL: https://issues.apache.org/jira/browse/PIG-2710
 Project: Pig
  Issue Type: Sub-task
Reporter: Dmitriy V. Ryaboy
Assignee: Prasanth J
 Fix For: 0.11

 Attachments: PIG-2710.1.patch


 The Naive CUBE operator is just syntactic sugar for the CubeDimensions UDFS 
 followed by a flatten+group-by.


Documentation planning for Pig 0.11 release

2012-10-22 Thread Olga Natkovich

Hi,
 
I have gone through the resolved JIRAs for 0.11, and here is what I believe 
needs to go into the documentation. Please, let me know if I missed anything. 
Also, I have not looked at anything that has not yet been committed:
 
Bloom filter UDF: https://issues.apache.org/jira/browse/PIG-2328
Clear command in Grunt: https://issues.apache.org/jira/browse/PIG-2706 - this 
is already in the docs
RANK operator: https://issues.apache.org/jira/browse/PIG-2353 - this is already 
in docs
UDF convenience classes: https://issues.apache.org/jira/browse/PIG-2547
More efficient tuple support: https://issues.apache.org/jira/browse/PIG-2359
Pluggable progress notification: https://issues.apache.org/jira/browse/PIG-2525
Merge join after ORDER BY: https://issues.apache.org/jira/browse/PIG-2673
Measure time spent in UDF: https://issues.apache.org/jira/browse/PIG-2855
Storage func improvements: https://issues.apache.org/jira/browse/PIG-1891
UDFs to flatten bags: https://issues.apache.org/jira/browse/PIG-2166
Make Tuple iterable: https://issues.apache.org/jira/browse/PIG-2724
New accumulate interface: https://issues.apache.org/jira/browse/PIG-2651
RUBY UDF: https://issues.apache.org/jira/browse/PIG-2317 . Looks like this is 
also in 0.10. Was documentation for this committed to 10?
Re-aliasing: https://issues.apache.org/jira/browse/PIG-438 . Looks like this is 
also in 0.10. Was documentation for this committed to 10?
Groovy UDFs: https://issues.apache.org/jira/browse/PIG-2763 Docs already 
committed
Naive CUBE operator: https://issues.apache.org/jira/browse/PIG-2710 - docs 
at: http://goo.gl/SpUad
Better map support: https://issues.apache.org/jira/browse/PIG-2600 - This needs 
release notes to include in docs.
 
Olga

[jira] [Commented] (PIG-2600) Better Map support

2012-10-22 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481738#comment-13481738
 ] 

Olga Natkovich commented on PIG-2600:
-

Yes, please put all the information into the release notes. This way it is 
much easier to create a documentation patch.

 Better Map support
 --

 Key: PIG-2600
 URL: https://issues.apache.org/jira/browse/PIG-2600
 Project: Pig
  Issue Type: Improvement
Reporter: Jonathan Coveney
Assignee: Prashant Kommireddi
 Fix For: 0.11

 Attachments: PIG-2600_2.patch, PIG-2600_3.patch, PIG-2600_4.patch, 
 PIG-2600_5.patch, PIG-2600_6.patch, PIG-2600_7.patch, PIG-2600_8.patch, 
 PIG-2600_9.patch, PIG-2600.patch


 It would be nice if Pig played better with Maps. To that end, I'd like to add 
 a lot of utility around Maps.
 - TOBAG should take a Map and output {(key, value)}
 - TOMAP should take a Bag in that same form and make a map.
 - KEYSET should return the set of keys.
 - VALUESET should return the set of values.
 - VALUELIST should return the List of values (no deduping).
 - INVERSEMAP would return a Map of values => the set of keys that refer to 
 that value
 This would all be pretty easy. A more substantial piece of work would be to 
 make Pig support non-String keys (this is especially an issue since UDFs and 
 whatnot probably assume that they are all Integers). Not sure if it is worth 
 it.
 I'd love to hear other things that would be useful for people!
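The semantics proposed in the list above can be sketched roughly as follows. This is only an illustration in Python, modelling a Pig map as a dict and a bag as a list of tuples; the function names mirror the proposed builtins but are stand-ins, not the implementation attached to this issue.

```python
from collections import defaultdict


def to_bag(m):
    """TOBAG: turn a map into a bag of (key, value) tuples."""
    return [(k, v) for k, v in m.items()]


def to_map(bag):
    """TOMAP: turn a bag of (key, value) tuples back into a map."""
    return dict(bag)


def key_set(m):
    """KEYSET: the set of keys."""
    return set(m.keys())


def value_set(m):
    """VALUESET: the set of values (duplicates removed)."""
    return set(m.values())


def value_list(m):
    """VALUELIST: the list of values (no deduping)."""
    return list(m.values())


def inverse_map(m):
    """INVERSEMAP: map each value to the set of keys that refer to it."""
    inverse = defaultdict(set)
    for k, v in m.items():
        inverse[v].add(k)
    return dict(inverse)


m = {"a": 1, "b": 2, "c": 1}
inverted = inverse_map(m)  # value 1 is shared by keys "a" and "c"
```

Note that the sketch sidesteps the harder question raised above (non-String keys), since Python dicts accept any hashable key.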

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (PIG-2353) RANK function like in SQL

2012-10-22 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481817#comment-13481817
 ] 

Olga Natkovich commented on PIG-2353:
-

Yes, I think that's fine - I did not realize it was covered in a separate JIRA, 
thanks!

 RANK function like in SQL
 -

 Key: PIG-2353
 URL: https://issues.apache.org/jira/browse/PIG-2353
 Project: Pig
  Issue Type: New Feature
Reporter: Gianmarco De Francisci Morales
Assignee: Allan Avendaño
  Labels: gsoc2012, mentor
 Fix For: 0.11

 Attachments: PIG-2353-2, PIG-2353-3.txt, PIG-2353-4.txt, 
 PIG-2353-5.txt, PIG2353.patch


 Implement a function that given a (sorted) bag adds to each tuple a unique, 
 increasing identifier without gaps, like what RANK does for SQL.
 This is a candidate project for Google summer of code 2012. More information 
 about the program can be found at 
 https://cwiki.apache.org/confluence/display/PIG/GSoc2012
 Functionality implemented so far is available at 
 https://reviews.apache.org/r/5523/diff/#index_header
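As a minimal sketch of the behaviour described above (a unique, increasing identifier without gaps for each tuple of an already sorted bag), assuming a bag can be modelled as a Python list of tuples. The function name and the `start` parameter are hypothetical; how ties are treated in the actual patch is not covered here.

```python
def rank(sorted_bag, start=1):
    """Prepend a consecutive identifier to each tuple of a sorted bag."""
    return [(i,) + tuple(t) for i, t in enumerate(sorted_bag, start=start)]


students = [("alice", 3.9), ("bob", 3.5), ("carol", 3.1)]
ranked = rank(students)
```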


Re: Pig 0.11

2012-10-22 Thread Olga Natkovich

There are still 76 unresolved JIRAs, more than half of them unassigned. Let's clean this 
up by the end of this week. I propose we do the following:
 
(1) Unlink all JIRAs for new features: since we have already branched, we should 
not be taking on new work. If people feel strongly that some new features still 
need to go in, please bring it up.
(2) For bug fixes, if people feel strongly that some of the unassigned issues 
need to be addressed, please take ownership. If you are unable to solve them but 
still feel they are important, please bring them up.
(3) Owners of unresolved issues, please check whether you will have time to 
solve them in the next 2 weeks. If not, let's move them to 12. If you can't 
address them but feel they are important, please bring it up.
 
Let's make sure that all JIRAs that require changes to the documentation have 
appropriate information in the release notes section so that we can quickly 
compile release documentation.
 
Thanks for your help!
 
Olga



From: Alan Gates ga...@hortonworks.com
To: dev@pig.apache.org 
Sent: Monday, October 15, 2012 11:55 AM
Subject: Re: Pig 0.11

At this point no one has taken on release documentation for 0.11.

Alan.

On Oct 15, 2012, at 11:49 AM, Olga Natkovich wrote:

 Thanks!
  
 Are you talking about items 15 and 16 on the How To Release.Publish  page? 
  
 Also, who is doing release documentation these days? I can help with that as 
 well. I would also be happy to roll the release if you guys need help with 
 that.
  
 Olga
 
 
 
 From: Dmitriy Ryaboy dvrya...@gmail.com
 To: dev@pig.apache.org dev@pig.apache.org 
 Cc: dev@pig.apache.org dev@pig.apache.org 
 Sent: Friday, October 12, 2012 5:59 PM
 Subject: Re: Pig 0.11
 
 Thanks Olga and welcome back! 
 I know there's some process for linking jiras to releases, but I'm not sure 
 what that is. If you could explain and maybe cover a portion of that work, 
 that'd be super helpful. And reviews, of course. 
 
 On Oct 12, 2012, at 2:06 PM, Olga Natkovich onatkov...@yahoo.com wrote:
 
 Dmitriy, I would be happy to help with the release process. Want to get back 
 into this now that I am back at work. Let me know what you would like me to 
 do.
  
 Olga
 
 
 
 
 From: Dmitriy Ryaboy dvrya...@gmail.com
 To: dev@pig.apache.org 
 Cc: billgra...@gmail.com 
 Sent: Thursday, October 11, 2012 2:44 PM
 Subject: Re: Pig 0.11
 
 Ok I will branch 0.11 tomorrow morning unless someone objects.
 From then on, committers should be careful to commit bug fixes to both
 0.11 branch and trunk; minor polish can go into the branch, but whole
 new features should not (we can discuss on the list if something is in
 the gray area).
 
 D
 
 On Thu, Oct 11, 2012 at 2:16 PM, Gianmarco De Francisci Morales
 g...@apache.org wrote:
 I added it as a dependency as it has already its own Jira.
 I hope it is OK.
 
 Cheers,
 --
 Gianmarco
 
 
 
 On Wed, Oct 10, 2012 at 11:23 PM, Bill Graham billgra...@gmail.com wrote:
 
 +1 for me.
 
 There's https://issues.apache.org/jira/browse/PIG-2756 which tracks a few
 documentation issues that should block Pig 0.11, but they can also be done
 on the trunk and merged to the branch. Gianmarco, you can add a rank
 subtask there to serve as a reminder.
 
 
 On Wed, Oct 10, 2012 at 11:03 PM, Gianmarco De Francisci Morales 
 g...@apache.org wrote:
 
 We are missing some documentation on the RANK but I guess we could add
 that
 to the branch and trunk in parallel.
 All the patches I was keeping an eye on are in.
 
 So +1 for me.
 --
 Gianmarco
 
 
 
 On Wed, Oct 10, 2012 at 5:31 PM, Jonathan Coveney jcove...@gmail.com
 wrote:
 
 I think all of the major patches are in, no? Now it's just bug testing?
 Just wanted to touch base on where we are at with this.
 
 
 
 
 
 --
 *Note that I'm no longer using my Yahoo! email address. Please email me at
 billgra...@gmail.com going forward.*

[jira] [Assigned] (PIG-2756) Documentation for 0.11

2012-10-22 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich reassigned PIG-2756:
---

Assignee: Olga Natkovich

 Documentation for 0.11
 --

 Key: PIG-2756
 URL: https://issues.apache.org/jira/browse/PIG-2756
 Project: Pig
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.11
Reporter: Bill Graham
Assignee: Olga Natkovich
 Fix For: 0.11


 Tracking areas where we need documentation on the pig.apache.org site 
 (Javadocs are typically pretty good). We can open child tasks as needed. 
 Please add to the list if you know of others.
 * Pluggable {{PigProgressNotificationListener}} isn't in the docs
 * Pluggable reducer estimators (see PIG-2574)
 * ILLUSTRATE seems to have dropped off the docs
 * {{HBaseStorage}} (see PIG-2341)


[jira] [Updated] (PIG-1314) Add DateTime Support to Pig

2012-10-22 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1314:


Fix Version/s: 0.11

 Add DateTime Support to Pig
 ---

 Key: PIG-1314
 URL: https://issues.apache.org/jira/browse/PIG-1314
 Project: Pig
  Issue Type: Bug
  Components: data
Affects Versions: 0.7.0
Reporter: Russell Jurney
Assignee: Zhijie Shen
  Labels: gsoc2012
 Fix For: 0.11

 Attachments: joda_vs_builtin.zip, PIG-1314-1.patch, PIG-1314-2.patch, 
 PIG-1314-3.patch, PIG-1314-4.patch, PIG-1314-5.patch, PIG-1314-6.patch, 
 PIG-1314-7.patch

   Original Estimate: 672h
  Remaining Estimate: 672h

 Hadoop/Pig are primarily used to parse log data, and most logs have a 
 timestamp component.  Therefore Pig should support dates as a primitive.
 Can someone familiar with adding types to pig comment on how hard this is?  
 We're looking at doing this, rather than use UDFs.  Is this a patch that 
 would be accepted?
 This is a candidate project for Google summer of code 2012. More information 
 about the program can be found at 
 https://cwiki.apache.org/confluence/display/PIG/GSoc2012


[jira] [Assigned] (PIG-2980) documentation for DateTime datatype

2012-10-22 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich reassigned PIG-2980:
---

Assignee: Zhijie Shen

 documentation for DateTime datatype
 ---

 Key: PIG-2980
 URL: https://issues.apache.org/jira/browse/PIG-2980
 Project: Pig
  Issue Type: Bug
  Components: documentation
Reporter: Thejas M Nair
Assignee: Zhijie Shen
 Fix For: 0.11


 Documentation for new DateTime type needs to be added.


[jira] [Commented] (PIG-2980) documentation for DateTime datatype

2012-10-22 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481938#comment-13481938
 ] 

Olga Natkovich commented on PIG-2980:
-

Sounds good. Zhijie, please, re-assign to me once you provide the information, 
thanks!

 documentation for DateTime datatype
 ---

 Key: PIG-2980
 URL: https://issues.apache.org/jira/browse/PIG-2980
 Project: Pig
  Issue Type: Bug
  Components: documentation
Reporter: Thejas M Nair
Assignee: Zhijie Shen
 Fix For: 0.11


 Documentation for new DateTime type needs to be added.


[jira] [Commented] (PIG-2328) Add builtin UDFs for building and using bloom filters

2012-10-22 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481948#comment-13481948
 ] 

Olga Natkovich commented on PIG-2328:
-

Looks like the change made it into 10, but what about documentation? I could not 
find it in builtins, but I just want to make sure it was not put in some other 
place.

 Add builtin UDFs for building and using bloom filters
 -

 Key: PIG-2328
 URL: https://issues.apache.org/jira/browse/PIG-2328
 Project: Pig
  Issue Type: New Feature
  Components: internal-udfs
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.10.0, 0.11

 Attachments: PIG-bloom-2.patch, PIG-bloom-3.patch, PIG-bloom.patch


 Bloom filters are a common way to select a limited set of records before 
 moving data for a join or other heavy weight operation.  Pig should add UDFs 
 to support building and using bloom filters.
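To illustrate the idea described above, here is a small self-contained Bloom filter in Python, used to pre-filter the large side of a join by the keys of the small side. This is a sketch only: the class, its parameters, and the hashing scheme are assumptions made for the example and do not reflect the UDFs added by this issue.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed byte.
        data = str(item).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Build the filter from the join keys of the small relation...
small_side_keys = ["alice", "bob"]
bloom = BloomFilter()
for key in small_side_keys:
    bloom.add(key)

# ...then drop rows of the large relation that cannot possibly match.
large_side = [("alice", 1), ("carol", 2), ("bob", 3)]
candidates = [row for row in large_side if bloom.might_contain(row[0])]
```

Rows that survive the filter still have to go through the real join, because a Bloom filter can let non-matching keys through.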


Re: Pig 0.11

2012-10-15 Thread Olga Natkovich

Thanks!
 
Are you talking about items 15 and 16 on the How To Release.Publish  page? 
 
Also, who is doing release documentation these days? I can help with that as 
well. I would also be happy to roll the release if you guys need help with that.
 
Olga



From: Dmitriy Ryaboy dvrya...@gmail.com
To: dev@pig.apache.org dev@pig.apache.org 
Cc: dev@pig.apache.org dev@pig.apache.org 
Sent: Friday, October 12, 2012 5:59 PM
Subject: Re: Pig 0.11

Thanks Olga and welcome back! 
I know there's some process for linking jiras to releases, but I'm not sure 
what that is. If you could explain and maybe cover a portion of that work, 
that'd be super helpful. And reviews, of course. 

On Oct 12, 2012, at 2:06 PM, Olga Natkovich onatkov...@yahoo.com wrote:

 Dmitriy, I would be happy to help with the release process. Want to get back 
 into this now that I am back at work. Let me know what you would like me to 
 do.
  
 Olga
 
 
 
 
 From: Dmitriy Ryaboy dvrya...@gmail.com
 To: dev@pig.apache.org 
 Cc: billgra...@gmail.com 
 Sent: Thursday, October 11, 2012 2:44 PM
 Subject: Re: Pig 0.11
 
 Ok I will branch 0.11 tomorrow morning unless someone objects.
 From then on, committers should be careful to commit bug fixes to both
 0.11 branch and trunk; minor polish can go into the branch, but whole
 new features should not (we can discuss on the list if something is in
 the gray area).
 
 D
 
 On Thu, Oct 11, 2012 at 2:16 PM, Gianmarco De Francisci Morales
 g...@apache.org wrote:
 I added it as a dependency as it has already its own Jira.
 I hope it is OK.
 
 Cheers,
 --
 Gianmarco
 
 
 
 On Wed, Oct 10, 2012 at 11:23 PM, Bill Graham billgra...@gmail.com wrote:
 
 +1 for me.
 
 There's https://issues.apache.org/jira/browse/PIG-2756 which tracks a few
 documentation issues that should block Pig 0.11, but they can also be done
 on the trunk and merged to the branch. Gianmarco, you can add a rank
 subtask there to serve as a reminder.
 
 
 On Wed, Oct 10, 2012 at 11:03 PM, Gianmarco De Francisci Morales 
 g...@apache.org wrote:
 
 We are missing some documentation on the RANK but I guess we could add
 that
 to the branch and trunk in parallel.
 All the patches I was keeping an eye on are in.
 
 So +1 for me.
 --
 Gianmarco
 
 
 
 On Wed, Oct 10, 2012 at 5:31 PM, Jonathan Coveney jcove...@gmail.com
 wrote:
 
 I think all of the major patches are in, no? Now it's just bug testing?
 Just wanted to touch base on where we are at with this.
 
 
 
 
 
 --
 *Note that I'm no longer using my Yahoo! email address. Please email me at
 billgra...@gmail.com going forward.*

Re: Pig 0.11

2012-10-12 Thread Olga Natkovich

Dmitriy, I would be happy to help with the release process. Want to get back 
into this now that I am back at work. Let me know what you would like me to do.

Olga

From: Dmitriy Ryaboy dvrya...@gmail.com
To: dev@pig.apache.org 
Cc: billgra...@gmail.com 
Sent: Thursday, October 11, 2012 2:44 PM
Subject: Re: Pig 0.11

Ok I will branch 0.11 tomorrow morning unless someone objects.
From then on, committers should be careful to commit bug fixes to both
0.11 branch and trunk; minor polish can go into the branch, but whole
new features should not (we can discuss on the list if something is in
the gray area).

D

On Thu, Oct 11, 2012 at 2:16 PM, Gianmarco De Francisci Morales
g...@apache.org wrote:
 I added it as a dependency as it has already its own Jira.
 I hope it is OK.

 Cheers,
 --
 Gianmarco

 On Wed, Oct 10, 2012 at 11:23 PM, Bill Graham billgra...@gmail.com wrote:

 +1 for me.

 There's https://issues.apache.org/jira/browse/PIG-2756 which tracks a few
 documentation issues that should block Pig 0.11, but they can also be done
 on the trunk and merged to the branch. Gianmarco, you can add a rank
 subtask there to serve as a reminder.

 On Wed, Oct 10, 2012 at 11:03 PM, Gianmarco De Francisci Morales 
 g...@apache.org wrote:

  We are missing some documentation on the RANK but I guess we could add
 that
  to the branch and trunk in parallel.
  All the patches I was keeping an eye on are in.

  So +1 for me.
  --
  Gianmarco

  On Wed, Oct 10, 2012 at 5:31 PM, Jonathan Coveney jcove...@gmail.com
  wrote:

   I think all of the major patches are in, no? Now it's just bug testing?
   Just wanted to touch base on where we are at with this.

 --
 *Note that I'm no longer using my Yahoo! email address. Please email me at
 billgra...@gmail.com going forward.*

[jira] [Updated] (PIG-2442) Multiple Stores in pig streaming causes infinite waiting

2011-12-22 Thread Olga Natkovich (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2442:


Fix Version/s: 0.10

 Multiple Stores in pig streaming causes infinite waiting
 

 Key: PIG-2442
 URL: https://issues.apache.org/jira/browse/PIG-2442
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.1, 0.9.0
Reporter: Anitha Raju
 Fix For: 0.10


 Hi,
 If there are multiple store in a pig streaming script, it goes into infinite 
 waiting. 
 Script
 {code}
 DEFINE SCRIPT `./a.pl` SHIP ('/homes/anithar/a.pl');;
 DEFINE SCRIPT1 `./b.pl` SHIP ('/homes/anithar/b.pl');;
 A = LOAD 'test.txt' USING PigStorage() ;
 B1 = STREAM A THROUGH SCRIPT ;
 B1 = foreach B1 generate $0;
 STORE B1 INTO 'B1' USING PigStorage();
 B2 =  STREAM B1 THROUGH SCRIPT1;
 STORE B2 INTO 'B2' USING PigStorage();
 {code}
 a.pl
 
 #! /usr/bin/perl -w
 while (my $line = <STDIN>) {
 print uc($line);
 }
 
 b.pl
 -
 #! /usr/bin/perl -w
 while (my $line = <STDIN>) {
 print $line;
 }
 -
 Input (test.txt)
 {code}
 test
 hi
 hello
 {code}
 This infinite waiting happens randomly, causing the job to fail with "Task 
 attempt failed to report status for 605 seconds. Killing!". 
 Same happens with 0.8 version too.
 Regards,
 Anitha

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (PIG-2426) ProgressableReporter.progress(String msg) is an empty function

2011-12-15 Thread Olga Natkovich (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2426:


Fix Version/s: 0.10

 ProgressableReporter.progress(String msg) is an empty function
 --

 Key: PIG-2426
 URL: https://issues.apache.org/jira/browse/PIG-2426
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.1, 0.9.1
Reporter: Vivek Padmanabhan
Assignee: Vivek Padmanabhan
Priority: Minor
 Fix For: 0.10

 Attachments: PIG-2426_1.patch


 In the current implementation, the reporter function 
 ProgressableReporter.progress(String msg) is an empty function.
 If I have a long-running UDF and I want to update the status using a message, 
 the preferred way is to use this API.
 The previous implementation of ProgressableReporter used the 
 org.apache.hadoop.mapred.Reporter API directly.
 But the currently used org.apache.hadoop.util.Progressable interface does 
 not have an API to set the status to a given message. 
 Hence, I believe, the empty method.


[jira] [Commented] (PIG-2410) Piggybank does not compile in 23

2011-12-14 Thread Olga Natkovich (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169789#comment-13169789
 ] 

Olga Natkovich commented on PIG-2410:
-

I believe that the HadoopJobHistoryLoader.java issue was traced to JobHistory 
interface changes. I also remember that there was a discussion of having two 
versions of the function.

I believe that is confusing for users, and we should make an effort to shim it 
the same way we did for the Pig core code.

 Piggybank does not compile in 23
 

 Key: PIG-2410
 URL: https://issues.apache.org/jira/browse/PIG-2410
 Project: Pig
  Issue Type: Bug
  Components: piggybank
Affects Versions: 0.10, 0.9.2, 0.11
Reporter: Daniel Dai
Assignee: Daniel Dai
  Labels: hadoop2.0
 Fix For: 0.10, 0.9.2, 0.11


 These do not compile:
 AllLoader.java
 HiveRCInputFormat.java
 HadoopJobHistoryLoader.java
 HiveColumnarLoader.java
 PathPartitionHelper.java
 IndexedStorage.java


[jira] [Resolved] (PIG-2390) Support for empty schema in AS () syntax is broken

2011-12-08 Thread Olga Natkovich (Resolved) (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich resolved PIG-2390.
-

Resolution: Won't Fix

 Support for empty schema in  AS () syntax is broken
 -

 Key: PIG-2390
 URL: https://issues.apache.org/jira/browse/PIG-2390
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.1
Reporter: Francis Liu

 running this command in pig 0.8 works:
 A = LOAD 'myfile.txt' USING PigStorage('\t') AS ()
 but in 0.9, you get:
 ERROR 1200: line 1, column 49  mismatched input ')' expecting IDENTIFIER_L


[jira] [Commented] (PIG-2374) streaming regression with dotNext

2011-12-07 Thread Olga Natkovich (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164769#comment-13164769
 ] 

Olga Natkovich commented on PIG-2374:
-

I think Ashutosh is bringing up a really good point. We seem to always be fixing 
things in Pig because, understandably, it is easier for us. However, if Hadoop is 
breaking the contract, they should be fixing this, especially if we have to pay a 
performance penalty for it.

 streaming regression with dotNext
 -

 Key: PIG-2374
 URL: https://issues.apache.org/jira/browse/PIG-2374
 Project: Pig
  Issue Type: Bug
 Environment: hadoopApache Pig version 0.9.2.101150 (r1200499)
 compiled Nov 10 2011, 19:50:15
  -bash-3.1$ hadoop version
 Hadoop 0.23.0.080202
 Subversion 
 http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.23.0/hadoop-common-project/hadoop-common
  -r 1196973
 Compiled by hadoopqa on Tue Nov  8 02:12:04 PST 2011
 From source with checksum 4e42b2d96c899a98a8ab8c7cc23f27ae
Reporter: Araceli Henley
Assignee: Daniel Dai
  Labels: hadoop2.0
 Fix For: 0.9.2

 Attachments: PIG-2374-1.patch


 Streaming seems to be broken in dotNext. There are several tests that are 
 failing.
 The results from C below produce clean results.
 The results from D which are streamed through CMD produce control characters 
 on some of the output.
 define CMD `perl GroupBy.pl '\t' 0` 
 ship('/homes/monster/pigtest/pigtest_next/pigharness/dist/pig_harness/libexec/PigTest/GroupBy.pl');
 A = load '/user/user1/pig/tests/data/singlefile/studenttab10k';
 B = group A by $0;
 C = foreach B generate flatten(A);
 D = stream C through CMD;
 store C into '/user/user1/pig/out/user1.1321117428/ComputeSpec_7_C.out';
 store D into '/user/user1/pig/out/user1.1321117428/ComputeSpec_7_D.out';
 Other streaming tests that fail with control characters:
 TEST FAILED ComputeSpec_7
 TEST FAILED ComputeSpec_8
 TEST FAILED ComputeSpec_10
 TEST FAILED ComputeSpec_11
 TEST FAILED ComputeSpec_12
 TEST FAILED JobManagement_2
 TEST FAILED JobManagement_3
 TEST FAILED StreamingIO_4
 TEST FAILED NonStreaming_1
 TEST FAILED MultiQuery_21
 ...


[jira] [Updated] (PIG-2404) NullPointerException when I have multiple python udfs

2011-12-07 Thread Olga Natkovich (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2404:


Assignee: xuting zhao

 NullPointerException when I have multiple python udfs
 -

 Key: PIG-2404
 URL: https://issues.apache.org/jira/browse/PIG-2404
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.1, 0.9.1
Reporter: Vivek Padmanabhan
Assignee: xuting zhao
 Fix For: 0.9.2


 When I have multiple python udfs registered, the script fails at compile 
 phase while trying to get the udf output schema.
 {code}
 register 'a.py' using org.apache.pig.scripting.jython.JythonScriptEngine as 
 a_func;
 register 'b.py' using org.apache.pig.scripting.jython.JythonScriptEngine as 
 b_func;
 a = load 'i1' as (f1:chararray);
 b = foreach a generate a_func.helloworld(), b_func.square(3);
 dump b;
 {code}
 a.py 
 {code}
 @outputSchema("word:chararray")
 def helloworld():  
   return 'Hello, World'
 {code}
 b.py 
 {code}
 @outputSchemaFunction("squareSchema")
 def square(num):
   return ((num)*(num))
 {code}
 Moreover, in the log we can see duplicate and incorrect registration of udfs, 
 which I believe is the cause of the script failure.
 INFO  org.apache.pig.scripting.jython.JythonScriptEngine - Register scripting 
 UDF: a_func.helloworld
 INFO  org.apache.pig.scripting.jython.JythonScriptEngine - Register scripting 
 UDF: b_func.square
 INFO  org.apache.pig.scripting.jython.JythonScriptEngine - Register scripting 
 UDF: b_func.helloworld
 This issue is observed in 0.9,0.8 and  in trunk also.


[jira] [Updated] (PIG-2404) NullPointerException when I have multiple python udfs

2011-12-07 Thread Olga Natkovich (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-2404:


Fix Version/s: 0.9.2

 NullPointerException when I have multiple python udfs
 -

 Key: PIG-2404
 URL: https://issues.apache.org/jira/browse/PIG-2404
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.1, 0.9.1
Reporter: Vivek Padmanabhan
Assignee: xuting zhao
 Fix For: 0.9.2


 When I have multiple python udfs registered, the script fails at compile 
 phase while trying to get the udf output schema.
 {code}
 register 'a.py' using org.apache.pig.scripting.jython.JythonScriptEngine as 
 a_func;
 register 'b.py' using org.apache.pig.scripting.jython.JythonScriptEngine as 
 b_func;
 a = load 'i1' as (f1:chararray);
 b = foreach a generate a_func.helloworld(), b_func.square(3);
 dump b;
 {code}
 a.py 
 {code}
 @outputSchema("word:chararray")
 def helloworld():  
   return 'Hello, World'
 {code}
 b.py 
 {code}
 @outputSchemaFunction("squareSchema")
 def square(num):
   return ((num)*(num))
 {code}
 Moreover, in the log we can see duplicate and incorrect registration of udfs, 
 which I believe is the cause of the script failure.
 INFO  org.apache.pig.scripting.jython.JythonScriptEngine - Register scripting 
 UDF: a_func.helloworld
 INFO  org.apache.pig.scripting.jython.JythonScriptEngine - Register scripting 
 UDF: b_func.square
 INFO  org.apache.pig.scripting.jython.JythonScriptEngine - Register scripting 
 UDF: b_func.helloworld
 This issue is observed in 0.9,0.8 and  in trunk also.


[jira] [Commented] (PIG-2391) Bzip_2 test is broken

2011-12-06 Thread Olga Natkovich (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163897#comment-13163897
 ] 

Olga Natkovich commented on PIG-2391:
-

Hi Xuting,

Could you explain what is causing this regression? It is not obvious to me what 
the fix is doing. Also, what would happen if another store function like 
BinStorage is used?

Thanks

 Bzip_2 test is broken
 -

 Key: PIG-2391
 URL: https://issues.apache.org/jira/browse/PIG-2391
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.10
Reporter: Olga Natkovich
Assignee: xuting zhao
 Fix For: 0.10, 0.11

 Attachments: PIG-2391.patch


 This test is currently commented out; if you uncomment it, it fails with 
 Pig 10 but runs successfully with Pig 9.
 Script:
 a = load '/homes/olgan/studenttab10k' using PigStorage() as (name, age, gpa);
 store a into 'intermediate.bz';
 b = load 'intermediate.bz';
 store b into 'final.bz';
 A couple of observations:
 (1) Identical script (represented by Bzip_1 test) that has bz2 instead of bz 
 extension in the script succeeds in Pig 10
 (2) The problem occurs while reading intermediate.bz which has different size 
 with Pig 9 and Pig 10
 (3) Problem can be reproduced in local mode with small subset of data in the 
 file
 (4) The following stack trace is observed:
 2011-12-01 13:53:12,280 [Thread-22] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local_0002
 java.lang.RuntimeException: java.io.IOException: compressedStream EOF
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:237)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.<init>(PigRecordReader.java:109)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:119)
     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:588)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
 Caused by: java.io.IOException: compressedStream EOF
     at org.apache.tools.bzip2r.CBZip2InputStream.cadvise(CBZip2InputStream.java:92)
     at org.apache.tools.bzip2r.CBZip2InputStream.compressedStreamEOF(CBZip2InputStream.java:96)
     at org.apache.tools.bzip2r.CBZip2InputStream.bsR(CBZip2InputStream.java:451)
     at org.apache.tools.bzip2r.CBZip2InputStream.initBlock(CBZip2InputStream.java:348)
     at org.apache.tools.bzip2r.CBZip2InputStream.<init>(CBZip2InputStream.java:220)
     at org.apache.pig.bzip2r.Bzip2TextInputFormat$BZip2LineRecordReader.<init>(Bzip2TextInputFormat.java:105)
     at org.apache.pig.bzip2r.Bzip2TextInputFormat.createRecordReader(Bzip2TextInputFormat.java:244)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:227)
     ... 5 more


[jira] [Commented] (PIG-2391) Bzip_2 test is broken

2011-12-06 Thread Olga Natkovich (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163899#comment-13163899
 ] 

Olga Natkovich commented on PIG-2391:
-

Looks like our comments crossed. So the issue is that Hadoop does not 
understand the .bz extension, and you need to fake it by saying it is actually bz2.
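A rough Python sketch of the situation described in this comment, assuming the codec is chosen purely from the file extension. The codec table, the `codec_for` helper, and the alias mechanism are all hypothetical and only illustrate the idea; they are not taken from Hadoop, Pig, or the attached patch.

```python
import os

# Hypothetical extension-to-codec table; ".bz" is deliberately absent.
KNOWN_CODECS = {".bz2": "bzip2", ".gz": "gzip"}


def codec_for(path, aliases=None):
    """Look up a codec by file extension, optionally mapping aliases first."""
    aliases = aliases or {}
    ext = os.path.splitext(path)[1]
    ext = aliases.get(ext, ext)  # e.g. treat ".bz" as if it were ".bz2"
    return KNOWN_CODECS.get(ext)


without_alias = codec_for("intermediate.bz")                # no codec found
with_alias = codec_for("intermediate.bz", {".bz": ".bz2"})  # resolves to bzip2
```

Without the alias, a file written as bzip2 but named `.bz` would be read back as plain text, which is one way a size or EOF mismatch like the one reported could arise.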

 Bzip_2 test is broken
 -

 Key: PIG-2391
 URL: https://issues.apache.org/jira/browse/PIG-2391
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.10
Reporter: Olga Natkovich
Assignee: xuting zhao
 Fix For: 0.10, 0.11

 Attachments: PIG-2391.patch


 This test is currently commented out; if you uncomment it, it fails with 
 Pig 10 but runs successfully with Pig 9.
 Script:
 a = load '/homes/olgan/studenttab10k' using PigStorage() as (name, age, gpa);
 store a into 'intermediate.bz';
 b = load 'intermediate.bz';
 store b into 'final.bz';
 A few observations:
 (1) An identical script (the Bzip_1 test) that uses the bz2 extension instead 
 of bz succeeds in Pig 10.
 (2) The problem occurs while reading intermediate.bz, which has a different 
 size in Pig 9 and Pig 10.
 (3) The problem can be reproduced in local mode with a small subset of the 
 data in the file.
 (4) The following stack trace is observed:
 2011-12-01 13:53:12,280 [Thread-22] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local_0002
 java.lang.RuntimeException: java.io.IOException: compressedStream EOF
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:237)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.init(PigRecordReader.java:109)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:119)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:588)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
 at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
 Caused by: java.io.IOException: compressedStream EOF
 at org.apache.tools.bzip2r.CBZip2InputStream.cadvise(CBZip2InputStream.java:92)
 at org.apache.tools.bzip2r.CBZip2InputStream.compressedStreamEOF(CBZip2InputStream.java:96)
 at org.apache.tools.bzip2r.CBZip2InputStream.bsR(CBZip2InputStream.java:451)
 at org.apache.tools.bzip2r.CBZip2InputStream.initBlock(CBZip2InputStream.java:348)
 at org.apache.tools.bzip2r.CBZip2InputStream.init(CBZip2InputStream.java:220)
 at org.apache.pig.bzip2r.Bzip2TextInputFormat$BZip2LineRecordReader.init(Bzip2TextInputFormat.java:105)
 at org.apache.pig.bzip2r.Bzip2TextInputFormat.createRecordReader(Bzip2TextInputFormat.java:244)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:227)
 ... 5 more

