Re: [VOTE] Release Apache Tez-0.10.1 RC0

2021-06-23 Thread Jonathan Eagles
REMINDER: We will need three binding +1 votes on this release for it to be an
official Apache release. Please try to verify this release today.

On Sun, Jun 20, 2021 at 3:04 AM László Bodor 
wrote:

> Hi Team!
>
> I have created a tez-0.10.1 release candidate rc0.
> GIT source tag (release-0.10.1-rc0)
>
>
> https://gitbox.apache.org/repos/asf?p=tez.git;a=commit;h=refs/tags/release-0.10.1-rc0
> (355fbc14caeaefab08cb2045f2d9d83435c5be70)
>
> Staging site:
> https://dist.apache.org/repos/dist/dev/tez/apache-tez-0.10.1-rc0/ (svn
> revision: 48404)
>
> PGP release keys (signed using 0x4ECA5CA5E303605A)
> http://pgp.mit.edu:11371/pks/lookup?op=vindex&search=0x4ECA5CA5E303605A
>
> KEYS file available at https://dist.apache.org/repos/dist/release/tez/KEYS
>
> One can look into the issues fixed in this release at:
>
> https://issues.apache.org/jira/browse/TEZ-4309?jql=project%20%3D%20%22Apache%20Tez%22%20%20and%20fixVersion%20%3D%20%220.10.1%22
>
> Vote will be open for at least 72 hours.
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Regards,
> Laszlo Bodor
>


Re: [VOTE] Release Apache Tez-0.10.1 RC0

2021-06-22 Thread Jonathan Eagles
+1. Thanks for this new release candidate.

I have verified the release by checking the keys, signature, sha512, and md5.
I compiled the source release and ran a test run against a compatible Hadoop
release.

jeagles
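
For anyone repeating the sha512 part of the check above, here is a minimal sketch in Java. It assumes the artifact and its published checksum file have already been downloaded from the staging site; the class name Sha512Check and the way the checksum file is normalised (whitespace stripped, case ignored) are assumptions made for illustration, not part of the Tez release process. Signature checking against the KEYS file is a separate step and is not covered here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical helper, not part of Tez: compares a file's SHA-512 with a published checksum.
final class Sha512Check {

  /** Streams the input through SHA-512 and returns the digest as lowercase hex. */
  static String sha512Hex(InputStream in) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-512");
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) != -1) {
        md.update(buf, 0, n);
      }
      StringBuilder hex = new StringBuilder();
      for (byte b : md.digest()) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-512 not available", e);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /**
   * Checksum files differ in layout (file name prefix, grouped or upper-case hex),
   * so strip whitespace, lowercase, and look for the digest as a substring.
   */
  static boolean matches(String computedHex, String publishedText) {
    String normalized = publishedText.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
    return normalized.contains(computedHex.toLowerCase(Locale.ROOT));
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.err.println("usage: Sha512Check <artifact> <sha512-file>");
      return;
    }
    String published =
        new String(Files.readAllBytes(Paths.get(args[1])), StandardCharsets.UTF_8);
    try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
      System.out.println(matches(sha512Hex(in), published) ? "sha512 OK" : "sha512 MISMATCH");
    }
  }
}
```

Run it with the artifact path as the first argument and the checksum file as the second. A match only shows the download is intact; it says nothing about who signed it.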

On Sun, Jun 20, 2021 at 3:04 AM László Bodor 
wrote:

> Hi Team!
>
> I have created a tez-0.10.1 release candidate rc0.
> GIT source tag (release-0.10.1-rc0)
>
>
> https://gitbox.apache.org/repos/asf?p=tez.git;a=commit;h=refs/tags/release-0.10.1-rc0
> (355fbc14caeaefab08cb2045f2d9d83435c5be70)
>
> Staging site:
> https://dist.apache.org/repos/dist/dev/tez/apache-tez-0.10.1-rc0/ (svn
> revision: 48404)
>
> PGP release keys (signed using 0x4ECA5CA5E303605A)
> http://pgp.mit.edu:11371/pks/lookup?op=vindex&search=0x4ECA5CA5E303605A
>
> KEYS file available at https://dist.apache.org/repos/dist/release/tez/KEYS
>
> One can look into the issues fixed in this release at:
>
> https://issues.apache.org/jira/browse/TEZ-4309?jql=project%20%3D%20%22Apache%20Tez%22%20%20and%20fixVersion%20%3D%20%220.10.1%22
>
> Vote will be open for at least 72 hours.
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Regards,
> Laszlo Bodor
>


[NOTICE] New Apache Tez PMC Chair László Bodor

2021-06-17 Thread Jonathan Eagles
The nomination for László Bodor as Apache Tez PMC Chair was accepted by the
board yesterday in the monthly meeting. Please congratulate our new chair.

I am happy to pass this position on to someone who I have so much
confidence in. I expect that the project will continue to grow and flourish
under this new leadership. I have seen László grow and take on such an
active role that it gives me great joy in being the first to welcome László
to this new role.

Welcome László Bodor, Apache Tez PMC Chair

Jon Eagles
Former Apache Tez PMC Chair


Re: Setting up Tez on windows

2021-06-12 Thread Jonathan Eagles
On Sat, Jun 12, 2021, 3:15 PM Jonathan Eagles  wrote:

> This request will be better directed to the hive mailing list to get the
> answer you need.
>
> On Sat, Jun 12, 2021, 3:13 PM tar chan  wrote:
>
>> Hi there,
>>
>> I am using Embedded mode of HiveLocalServer2 setup on windows machine.
>>
>> I set this up using Java files like HiveLocalServer2, HiveConf,
>> HiveMetaStore, ZooKeeper, and Hadoop home.
>>
>> But how do I set up and integrate Tez with the above setup?
>>
>> I could not find any clear step-by-step documentation.
>>
>> Please provide the steps ASAP, as it's urgent to test a POC using Tez.
>>
>> Thanks,
>>
>


Re: Setting up Tez on windows

2021-06-12 Thread Jonathan Eagles
This request will be better directed to the hive mailing list to get the
answer you need.

On Sat, Jun 12, 2021, 3:13 PM tar chan  wrote:

> Hi there,
>
> I am using Embedded mode of HiveLocalServer2 setup on windows machine.
>
> I set this up using Java files like HiveLocalServer2, HiveConf,
> HiveMetaStore, ZooKeeper, and Hadoop home.
>
> But how do I set up and integrate Tez with the above setup?
>
> I could not find any clear step-by-step documentation.
>
> Please provide the steps ASAP, as it's urgent to test a POC using Tez.
>
> Thanks,
>


Re: Tez Test Failure

2021-03-14 Thread Jonathan Eagles
https://issues.apache.org/jira/browse/TEZ-4074

This doesn't necessarily mean it's a bug, but rather that there are two
different versions of Guava on the classpath. Can you continue the
conversation there?
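
One way to confirm a duplicate-Guava situation like this is to ask the class loader for every copy of the affected class it can see. The sketch below is a hypothetical diagnostic, not something shipped with Tez: the class name ClasspathCopies is made up for illustration, and it assumes it is run with the same classpath as the failing test (under Maven surefire the effective classpath may be packed into a manifest-only jar, in which case the output will look different).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic: lists every location that provides a given class.
final class ClasspathCopies {

  /** Converts a.b.C into the resource path a/b/C.class. */
  static String resourceName(String className) {
    return className.replace('.', '/') + ".class";
  }

  /** Returns every URL from which the class loader could load the class. */
  static List<URL> findCopies(String className) {
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    if (loader == null) {
      loader = ClassLoader.getSystemClassLoader();
    }
    try {
      return Collections.list(loader.getResources(resourceName(className)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Default to the class named in the NoSuchMethodError from the thread.
    String name = args.length > 0 ? args[0] : "com.google.common.util.concurrent.Futures";
    List<URL> copies = findCopies(name);
    System.out.println(copies.size() + " copy/copies of " + name);
    for (URL url : copies) {
      System.out.println("  " + url);
    }
  }
}
```

More than one URL in the output means two jars provide the class, and whichever comes first on the classpath wins. Running mvn dependency:tree is another way to see which dependency pulls in each Guava version.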

On Sun, Mar 14, 2021, 10:45 AM Suryansh Agnihotri <
agnihotrisuryans...@gmail.com> wrote:

> Hi Jonathan
> should I create a Jira for this?
>
> On Thu, 11 Mar 2021 at 15:47, Suryansh Agnihotri <
> agnihotrisuryans...@gmail.com> wrote:
>
>> Hey Jonathan
>> I tried with 0.10.0 but now I am getting below error in the test.
>> Building doc does not mention any setup to be done. I just installed all
>> requirements mentioned in the doc.
>> I am using "mvn test -Dhadoop.version=3.1.2 -Phadoop3
>> -Dsurefire.rerunFailingTestsCount=3" to run the tests.
>>
>> [INFO] Running org.apache.tez.common.TestTezCommonUtils
>> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
>> 4.331 s <<< FAILURE! - in org.apache.tez.common.TestTezCommonUtils
>> [ERROR] org.apache.tez.common.TestTezCommonUtils  Time elapsed: 4.33 s
>>  <<< ERROR!
>> java.lang.NoSuchMethodError:
>> com.google.common.util.concurrent.Futures.addCallback(Lcom/google/common/util/concurrent/ListenableFuture;Lcom/google/common/util/concurrent/FutureCallback;)V
>> at
>> org.apache.tez.common.TestTezCommonUtils.setup(TestTezCommonUtils.java:65)
>>
>> On Wed, 10 Mar 2021 at 22:48, Suryansh Agnihotri <
>> agnihotrisuryans...@gmail.com> wrote:
>>
>>> Thanks, I'll check the 0.10 branch and let you know if I face any issues.
>>>
>>> On Wed, 10 Mar 2021 at 21:20, Jonathan Eagles  wrote:
>>>
>>>> Essentially, 0.10 and 0.9 are the same with the exception of Hadoop 3.x
>>>> compatibility and a few minor changes. Both release lines are considered
>>>> active and fully supported. I suggest developing on the branch-0.10.0 or
>>>> master branch for this case. That way, if other incompatibilities are
>>>> found, you can get the best support from the Tez open source community.
>>>>
>>>> Let me know if this helps guide you to a decision.
>>>>
>>>> On Wed, Mar 10, 2021 at 9:43 AM Suryansh Agnihotri <
>>>> agnihotrisuryans...@gmail.com> wrote:
>>>>
>>>>> Hello Jonathan
>>>>> Thanks for the quick response. I am trying to build Tez 0.9.2 with
>>>>> Hadoop 3.1.2 using "mvn package -Dhadoop.version=3.1.0 -Phadoop3".
>>>>> Will I not be able to use "branch-0.9.2"? Could you please specify
>>>>> whether many compatibility issues have been observed/fixed with respect
>>>>> to Hadoop 3.1+? If there are only a few, can I just backport them to my
>>>>> "branch-0.92"? Would that make sense?
>>>>>
>>>>>
>>>>> On Wed, 10 Mar 2021 at 21:04, Jonathan Eagles 
>>>>> wrote:
>>>>>
>>>>>> Suryansh, thanks for reaching out to this email list. We are going to
>>>>>> need some more information about the issue. This message can occur for a
>>>>>> number of reasons. TEZ-3884 is most directly related to running tez against
>>>>>> hadoop 3.1+. Tez has created a new branch branch-0.10 to address hadoop 3
>>>>>> compatibility. If your issue is related to hadoop 3 compatibility, please
>>>>>> try tez 0.10.0 or branch-10.0 or even master branch. If you are running
>>>>>> against hadoop 2.x, then likely this is a different issue and will need
>>>>>> more understanding of client classpath setup as per
>>>>>> http://tez.apache.org/install.html.
>>>>>>
>>>>>> Regards,
>>>>>> jeagles
>>>>>>
>>>>>> On Wed, Mar 10, 2021 at 9:23 AM Suryansh Agnihotri <
>>>>>> agnihotrisuryans...@gmail.com> wrote:
>>>>>>
>>>>>>> Hello
>>>>>>> I am facing the same issue
>>>>>>> https://issues.apache.org/jira/browse/TEZ-3884
>>>>>>> but this is marked as resolved.
>>>>>>> Could anyone please help with this.
>>>>>>>
>>>>>>> Tez branch: *branch-0.9.2*
>>>>>>>
>>>>>>> StackTrace:
>>>>>>> [ERROR] COMPILATION ERROR :
>>>>>>> [INFO] -
>>>>>>> [ERROR]
>>>>>>>
>>>>>>> /Users/suryansh/Documents/OSS/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[48,30]
>>>>>>> cannot find symbol
>>>>>>>   symbol:   class DistributedFileSystem
>>>>>>>   location: package org.apache.hadoop.hdfs
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>


Re: Tez Test Failure

2021-03-10 Thread Jonathan Eagles
Essentially, 0.10 and 0.9 are the same with the exception of Hadoop 3.x
compatibility and a few minor changes. Both release lines are considered
active and fully supported. I suggest developing on the branch-0.10.0 or
master branch for this case. That way, if other incompatibilities are
found, you can get the best support from the Tez open source community.

Let me know if this helps guide you to a decision.

On Wed, Mar 10, 2021 at 9:43 AM Suryansh Agnihotri <
agnihotrisuryans...@gmail.com> wrote:

> Hello Jonathan
> Thanks for the quick response. I am trying to build Tez 0.9.2 with Hadoop
> 3.1.2 using "mvn package -Dhadoop.version=3.1.0 -Phadoop3".
> Will I not be able to use "branch-0.9.2"? Could you please specify whether
> many compatibility issues have been observed/fixed with respect to Hadoop 3.1+?
> If there are only a few, can I just backport them to my "branch-0.92"? Would
> that make sense?
>
>
> On Wed, 10 Mar 2021 at 21:04, Jonathan Eagles  wrote:
>
>> Suryansh, thanks for reaching out to this email list. We are going to
>> need some more information about the issue. This message can occur for a
>> number of reasons. TEZ-3884 is most directly related to running tez against
>> hadoop 3.1+. Tez has created a new branch branch-0.10 to address hadoop 3
>> compatibility. If your issue is related to hadoop 3 compatibility, please
>> try tez 0.10.0 or branch-10.0 or even master branch. If you are running
>> against hadoop 2.x, then likely this is a different issue and will need
>> more understanding of client classpath setup as per
>> http://tez.apache.org/install.html.
>>
>> Regards,
>> jeagles
>>
>> On Wed, Mar 10, 2021 at 9:23 AM Suryansh Agnihotri <
>> agnihotrisuryans...@gmail.com> wrote:
>>
>>> Hello
>>> I am facing the same issue
>>> https://issues.apache.org/jira/browse/TEZ-3884
>>> but this is marked as resolved.
>>> Could anyone please help with this.
>>>
>>> Tez branch: *branch-0.9.2*
>>>
>>> StackTrace:
>>> [ERROR] COMPILATION ERROR :
>>> [INFO] -
>>> [ERROR]
>>>
>>> /Users/suryansh/Documents/OSS/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[48,30]
>>> cannot find symbol
>>>   symbol:   class DistributedFileSystem
>>>   location: package org.apache.hadoop.hdfs
>>>
>>>
>>> Thanks
>>>
>>


Re: Tez Test Failure

2021-03-10 Thread Jonathan Eagles
Suryansh, thanks for reaching out to this email list. We are going to need
some more information about the issue. This message can occur for a number
of reasons. TEZ-3884 is most directly related to running Tez against Hadoop
3.1+. Tez has created a new branch, branch-0.10, to address Hadoop 3
compatibility. If your issue is related to Hadoop 3 compatibility, please
try Tez 0.10.0 or branch-10.0 or even the master branch. If you are running
against Hadoop 2.x, then this is likely a different issue and we will need
to understand more about the client classpath setup, as per
http://tez.apache.org/install.html.

Regards,
jeagles

On Wed, Mar 10, 2021 at 9:23 AM Suryansh Agnihotri <
agnihotrisuryans...@gmail.com> wrote:

> Hello
> I am facing the same issue https://issues.apache.org/jira/browse/TEZ-3884
> but this is marked as resolved.
> Could anyone please help with this.
>
> Tez branch: *branch-0.9.2*
>
> StackTrace:
> [ERROR] COMPILATION ERROR :
> [INFO] -
> [ERROR]
>
> /Users/suryansh/Documents/OSS/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[48,30]
> cannot find symbol
>   symbol:   class DistributedFileSystem
>   location: package org.apache.hadoop.hdfs
>
>
> Thanks
>


Re: TezIDCache

2021-01-28 Thread Jonathan Eagles
The TezIDCache is a memory-saving cache, similar in function to Java's
String.intern but for objects. Tez uses an event-based, multithreaded
message-passing system where hundreds of thousands of messages may be in
flight concurrently. A cache greatly reduces message size and therefore
runtime memory requirements. However, Tez was also designed to allow
millions of tasks per DAG and tens of thousands of DAGs per session
(perhaps more). So to protect against memory bloat, the cache is
evaporative and uses soft references that the garbage collector can clear
when entries are no longer in use or under memory pressure.

So the extra complication is there to balance these two design demands.
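
To make the idea concrete, here is a minimal sketch of an interning cache whose entries the garbage collector can reclaim. It is not the actual TezID code: the class name InternCache is hypothetical, and the sketch uses a WeakHashMap with WeakReference values, which is one common way to build such a cache. A soft-reference variant would hold entries longer, until memory pressure, at the cost of more retained memory.

```java
import java.lang.ref.WeakReference;
import java.util.WeakHashMap;

// Hypothetical sketch of an "evaporative" intern cache; not the real TezID implementation.
final class InternCache<T> {

  // Key and value refer to the same canonical instance. Because both references
  // are weak, an entry disappears once nothing else holds the instance.
  private final WeakHashMap<T, WeakReference<T>> cache = new WeakHashMap<>();

  /** Returns the canonical instance equal to candidate, registering candidate if none exists. */
  synchronized T intern(T candidate) {
    WeakReference<T> ref = cache.get(candidate);
    T canonical = (ref == null) ? null : ref.get();
    if (canonical != null) {
      return canonical;
    }
    cache.put(candidate, new WeakReference<>(candidate));
    return candidate;
  }

  public static void main(String[] args) {
    InternCache<String> ids = new InternCache<>();
    String first = new String("vertex_1");   // two equal but distinct objects
    String second = new String("vertex_1");
    String a = ids.intern(first);
    String b = ids.intern(second);
    // Both lookups yield the same instance, so only one copy stays referenced.
    System.out.println(a == b);
  }
}
```

This also answers the question of why key and value are the same: the map is acting as a set that can hand back the instance it already holds, which a plain Set cannot do.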

On Thu, Jan 28, 2021 at 2:03 PM David  wrote:

> Hello,
>
> In the class TezID there is a caching mechanism I can't figure out.  What
> is the purpose of caching these objects? This is much like a set since the
> key and value are the same. Is there some requirement that the items in the
> cache have to be globally unique? Is this some sort of memory saving
> optimization to only maintain a single instance of each value?
>
> Thanks.
>


Re: [VOTE] Release Apache Tez-0.10.0 RC1

2020-10-16 Thread Jonathan Eagles
+1.

On Mon, Oct 12, 2020 at 2:54 AM László Bodor 
wrote:

> Hi Team!
>
> This is a kind reminder about RC1, let me proceed with that, I would
> appreciate +1s.
> Just remember, if RC0 was fine, RC1 will be perfect. :)
>
> Changes since RC0:
> https://issues.apache.org/jira/browse/TEZ-4228
> https://issues.apache.org/jira/browse/TEZ-4230
> https://issues.apache.org/jira/browse/TEZ-4234
> https://issues.apache.org/jira/browse/TEZ-4238
>
> Regards,
> Laszlo Bodor
>
> On Thu, 8 Oct 2020 at 18:57, László Bodor 
> wrote:
>
> > Hi Team!
> >
> > I have created a tez-0.10.0 release candidate rc1.
> > GIT source tag (release-0.10.0-rc1)
> >
> >
> >
> > https://gitbox.apache.org/repos/asf?p=tez.git;a=commit;h=refs/tags/release-0.10.0-rc1
> >  (22fec6c0ecc7ebe6f6f28800935cc6f69794dad5)
> >
> > Staging site:
> > https://dist.apache.org/repos/dist/dev/tez/apache-tez-0.10.0-rc1/ (svn
> > revision: 41851)
> >
> > PGP release keys (signed using 0x4ECA5CA5E303605A)
> > http://pgp.mit.edu:11371/pks/lookup?op=vindex&search=0x4ECA5CA5E303605A
> >
> > KEYS file available at https://dist.apache.org/repos/dist/release/tez/KEYS
> >
> > One can look into the issues fixed in this release at
> >
> > https://issues.apache.org/jira/browse/TEZ-4230?jql=project%20%3D%20%22Apache%20Tez%22%20%20and%20fixVersion%20%3D%20%220.10.0%22
> >
> > Vote will be open for at least 72 hours.
> > [ ] +1 approve
> > [ ] +0 no opinion
> > [ ] -1 disapprove (and reason why)
> >
> > Regards,
> > Laszlo Bodor
> >
>


Re: [VOTE] Release Apache Tez-0.10.0 RC0

2020-09-22 Thread Jonathan Eagles
I have found a serious issue that may need a respin of the release. Let's
investigate this issue before announcing the release.

https://issues.apache.org/jira/browse/TEZ-4234

Jon


Re: [VOTE] Release Apache Tez-0.10.0 RC0

2020-09-14 Thread Jonathan Eagles
+1.

Verified signatures. Checked the build against the Hadoop 3.3.0 release using
the minimal distribution built from source.

On Wed, Sep 9, 2020 at 4:12 AM László Bodor 
wrote:

> Hi Team!
>
> I have created a tez-0.10.0 release candidate rc0.
> GIT source tag (release-0.10.0-rc0)
>
>
> https://gitbox.apache.org/repos/asf?p=tez.git;a=commit;h=refs/tags/release-0.10.0-rc0
>  (2358bd85f4a359e6f2fea838e393ebd665afb496)
>
> Staging site:
> https://dist.apache.org/repos/dist/dev/tez/apache-tez-0.10.0-rc0/ (svn
> revision: 41371)
>
> PGP release keys (signed using 0x4ECA5CA5E303605A)
> http://pgp.mit.edu:11371/pks/lookup?op=vindex&search=0x4ECA5CA5E303605A
>
> KEYS file available at https://dist.apache.org/repos/dist/release/tez/KEYS
>
> One can look into the issues fixed in this release at
>
> https://issues.apache.org/jira/browse/TEZ-4230?jql=project%20%3D%20%22Apache%20Tez%22%20%20and%20fixVersion%20%3D%20%220.10.0%22
>
> Vote will be open for at least 72 hours.
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Regards,
> Laszlo Bodor
>


Re: Tez migration from Jenkins to Cloudbees

2020-08-26 Thread Jonathan Eagles
Devs,
I have migrated the Jenkins Jobs to Cloudbees. The Tez view on the new
build server is below.

https://ci-hadoop.apache.org/view/Tez/

Please email me with any questions or problems. One workaround is that the
PreCommit-TEZ-Build comment will temporarily be posted by Hadoop QA instead
of Tez QA. I think this was a sufficient compromise to enable builds today.

On Wed, Aug 26, 2020 at 10:38 AM Jonathan Eagles  wrote:

> As per the announcement from Apache, Jenkins is now shut down. I am in the
> process of migrating jobs to the new framework and will keep you updated on
> the progress.
>
>
> https://cwiki.apache.org/confluence/display/INFRA/Migrating+jobs+from+Jenkins+to+Cloudbees
>
> Jon
>


Tez migration from Jenkins to Cloudbees

2020-08-26 Thread Jonathan Eagles
As per the announcement from Apache, Jenkins is now shut down. I am in the
process of migrating jobs to the new framework and will keep you updated on
the progress.

https://cwiki.apache.org/confluence/display/INFRA/Migrating+jobs+from+Jenkins+to+Cloudbees

Jon


Re: [DISCUSS] Tez 0.10.0 Release Planning

2020-08-25 Thread Jonathan Eagles
This will be a necessary step for this release. The bulk update option
(found under Tools) is a good way to address this.
https://issues.apache.org/jira/issues/?jql=project%20%3D%20TEZ%20AND%20resolution%20!%3D%20null%20and%20fixVersion%20in%20(0.10.1)

Because of the initial attempt at 0.10.0, creating the release branch
will need some unique steps. We will need to re-branch for 0.10.0
(delete the branch and then create a new one, or perhaps upmerge 0.10.0 from master).

On Tue, Aug 25, 2020 at 9:10 AM László Bodor 
wrote:

> ~60 jiras contain 0.10.1 as fixVersion; with a bulk update I can change
> them to 0.10.0 in order to stay in line with the current release.
>
> On Tue, 25 Aug 2020 at 16:07, Jonathan Eagles  wrote:
>
> > There was some initial work on a 0.10.0 release, so some, or perhaps
> > many, JIRAs should be retargeted from 0.10.1 to 0.10.0.
> >
> > On Tue, Aug 25, 2020 at 7:03 AM László Bodor 
> > wrote:
> >
> > > There is only 1 jira left before release (TEZ-3645), I think we can wait
> > > for it to be committed (this week).
> > > We consistently used 0.10.1 as fixVersion on tickets, I guess all
> > > occurrences should be fixed to 0.10.0 on resolved tickets, as we are about
> > > to release 0.10.0 now.
> > >
> > > Regards,
> > > Laszlo Bodor
> > >
> > >
> > > On Sun, 16 Aug 2020 at 09:54, László Bodor 
> > > wrote:
> > >
> > > > Applied "0.10_blocker" label to opened and mentioned jiras.
> > > >
> > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/browse/TEZ-4213?jql=project%20in%20(%22Apache%20Tez%22)%20and%20labels%20%3D%200.10_blocker%20and%20status%20not%20in%20(Closed%2C%20Resolved)
> > > >
> > > > Regards,
> > > > Laszlo Bodor
> > > >
> > > >
> > > > On Fri, 14 Aug 2020 at 16:48, Jonathan Eagles 
> > wrote:
> > > >
> > > >> I think there is an outdated branch-0.10.0 from an early attempt at a
> > > >> 0.10.0 release. This branch should be recut or updated to be identical to
> > > >> the main branch.
> > > >>
> > > >> On Thu, Aug 13, 2020 at 12:44 AM Harish Jai Prakash Perumal
> > > >>  wrote:
> > > >>
> > > >> > I noticed that https://issues.apache.org/jira/browse/TEZ-3860 is
> > > >> present
> > > >> > in
> > > >> > 0.9 branch but not part of 0.10.0. Is this intentional? TEZ-4223
> is
> > > >> > dependent on TEZ-3860, if we do not want this in 0.10.0 we can
> skip
> > > >> > TEZ-4223.
> > > >> >
> > > >> > On Tue, Aug 11, 2020 at 7:09 PM Harish Jai Prakash Perumal <
> > > >> > h...@cloudera.com>
> > > >> > wrote:
> > > >> >
> > > >> > > This is also a regression and would be good to have:
> > > >> > > https://issues.apache.org/jira/browse/TEZ-4223
> > > >> > >
> > > >> > > On Tue, Aug 11, 2020 at 2:14 PM Rajesh Balamohan <
> > > >> rbalamo...@apache.org>
> > > >> > > wrote:
> > > >> > >
> > > >> > >> It would be good to add the following tickets
> > > >> > >>
> > > >> > >> https://issues.apache.org/jira/browse/TEZ-4208
> > > >> > >> https://issues.apache.org/jira/browse/TEZ-3645
> > > >> > >> https://issues.apache.org/jira/browse/TEZ-4216
> > > >> > >> https://issues.apache.org/jira/browse/TEZ-4199
> > > >> > >>
> > > >> > >>
> > > >> > >> ~Rajesh.B
> > > >> > >>
> > > >> > >> On Tue, Aug 11, 2020 at 1:14 PM László Bodor <
> > > >> bodorlaszlo0...@gmail.com
> > > >> > >
> > > >> > >> wrote:
> > > >> > >>
> > > >> > >> > Hi!
> > > >> > >> >
> > > >> > >> > JIRAS NEEDED:
> > > >> > >> > 1 known open regression at the moment, which (I'm aware of
> and)
> > > >> might
> > > >> > be
> > > >> > >> > worth waiting for:
> > > https://issues.apache.org/jira/browse/TEZ-4213
> > > >> > >> >
> > > >> > >> > RELEASE MANAGER:
> > > >> > >> > I'm interested in doing this!
> > > >> > >> >
> > > >> > >> > Regards,
> > > >> > >> > Laszlo Bodor
> > > >> > >> >
> > > >> > >> > On Mon, 10 Aug 2020 at 16:17, Jonathan Eagles <
> > jeag...@gmail.com
> > > >
> > > >> > >> wrote:
> > > >> > >> >
> > > >> > >> > > JIRAS NEEDED
> > > >> > >> > > Please give feedback for Tez 0.10.0 release planning.
> > > >> Specifically,
> > > >> > >> let's
> > > >> > >> > > propose jiras required for the release as well as the
> timing
> > > for
> > > >> the
> > > >> > >> > > release. Let's keep this discussion open for the week and
> > reply
> > > >> with
> > > >> > >> JIRA
> > > >> > >> > > proposals.
> > > >> > >> > >
> > > >> > >> > > RELEASE MANAGER
> > > >> > >> > > If anyone is wanting to volunteer for 0.10.0 release
> manager,
> > > >> please
> > > >> > >> let
> > > >> > >> > me
> > > >> > >> > > know. The process is documented, but having someone walk
> > > through
> > > >> the
> > > >> > >> > > process the first time is best.
> > > >> > >> > >
> > > >> > >> > > Jon Eagles
> > > >> > >> > > Tez PMC Chair
> > > >> > >> > >
> > > >> > >> >
> > > >> > >>
> > > >> > >
> > > >> >
> > > >>
> > > >
> > >
> >
>


Re: [DISCUSS] Tez 0.10.0 Release Planning

2020-08-25 Thread Jonathan Eagles
There was some initial work on a 0.10.0 release, so some, or perhaps
many, JIRAs should be retargeted from 0.10.1 to 0.10.0.

On Tue, Aug 25, 2020 at 7:03 AM László Bodor 
wrote:

> There is only 1 jira left before release (TEZ-3645), I think we can wait
> for it to be committed (this week).
> We consistently used 0.10.1 as fixVersion on tickets, I guess all
> occurrences should be fixed to 0.10.0 on resolved tickets, as we are about
> to release 0.10.0 now.
>
> Regards,
> Laszlo Bodor
>
>
> On Sun, 16 Aug 2020 at 09:54, László Bodor 
> wrote:
>
> > Applied "0.10_blocker" label to opened and mentioned jiras.
> >
> >
> >
> https://issues.apache.org/jira/browse/TEZ-4213?jql=project%20in%20(%22Apache%20Tez%22)%20and%20labels%20%3D%200.10_blocker%20and%20status%20not%20in%20(Closed%2C%20Resolved)
> >
> > Regards,
> > Laszlo Bodor
> >
> >
> > On Fri, 14 Aug 2020 at 16:48, Jonathan Eagles  wrote:
> >
> >> I think there is an outdated branch-0.10.0 from an early attempt at a
> >> 0.10.0 release. This branch should be recut or updated to be identical
> to
> >> the main branch.
> >>
> >> On Thu, Aug 13, 2020 at 12:44 AM Harish Jai Prakash Perumal
> >>  wrote:
> >>
> >> > I noticed that https://issues.apache.org/jira/browse/TEZ-3860 is
> >> present
> >> > in
> >> > 0.9 branch but not part of 0.10.0. Is this intentional? TEZ-4223 is
> >> > dependent on TEZ-3860, if we do not want this in 0.10.0 we can skip
> >> > TEZ-4223.
> >> >
> >> > On Tue, Aug 11, 2020 at 7:09 PM Harish Jai Prakash Perumal <
> >> > h...@cloudera.com>
> >> > wrote:
> >> >
> >> > > This is also a regression and would be good to have:
> >> > > https://issues.apache.org/jira/browse/TEZ-4223
> >> > >
> >> > > On Tue, Aug 11, 2020 at 2:14 PM Rajesh Balamohan <
> >> rbalamo...@apache.org>
> >> > > wrote:
> >> > >
> >> > >> It would be good to add the following tickets
> >> > >>
> >> > >> https://issues.apache.org/jira/browse/TEZ-4208
> >> > >> https://issues.apache.org/jira/browse/TEZ-3645
> >> > >> https://issues.apache.org/jira/browse/TEZ-4216
> >> > >> https://issues.apache.org/jira/browse/TEZ-4199
> >> > >>
> >> > >>
> >> > >> ~Rajesh.B
> >> > >>
> >> > >> On Tue, Aug 11, 2020 at 1:14 PM László Bodor <
> >> bodorlaszlo0...@gmail.com
> >> > >
> >> > >> wrote:
> >> > >>
> >> > >> > Hi!
> >> > >> >
> >> > >> > JIRAS NEEDED:
> >> > >> > 1 known open regression at the moment, which (I'm aware of and)
> >> might
> >> > be
> >> > >> > worth waiting for:
> https://issues.apache.org/jira/browse/TEZ-4213
> >> > >> >
> >> > >> > RELEASE MANAGER:
> >> > >> > I'm interested in doing this!
> >> > >> >
> >> > >> > Regards,
> >> > >> > Laszlo Bodor
> >> > >> >
> >> > >> > On Mon, 10 Aug 2020 at 16:17, Jonathan Eagles  >
> >> > >> wrote:
> >> > >> >
> >> > >> > > JIRAS NEEDED
> >> > >> > > Please give feedback for Tez 0.10.0 release planning.
> >> Specifically,
> >> > >> let's
> >> > >> > > propose jiras required for the release as well as the timing
> for
> >> the
> >> > >> > > release. Let's keep this discussion open for the week and reply
> >> with
> >> > >> JIRA
> >> > >> > > proposals.
> >> > >> > >
> >> > >> > > RELEASE MANAGER
> >> > >> > > If anyone is wanting to volunteer for 0.10.0 release manager,
> >> please
> >> > >> let
> >> > >> > me
> >> > >> > > know. The process is documented, but having someone walk
> through
> >> the
> >> > >> > > process the first time is best.
> >> > >> > >
> >> > >> > > Jon Eagles
> >> > >> > > Tez PMC Chair
> >> > >> > >
> >> > >> >
> >> > >>
> >> > >
> >> >
> >>
> >
>


Re: [DISCUSS] Tez 0.10.0 Release Planning

2020-08-14 Thread Jonathan Eagles
I think there is an outdated branch-0.10.0 from an early attempt at a
0.10.0 release. This branch should be recut or updated to be identical to
the main branch.

On Thu, Aug 13, 2020 at 12:44 AM Harish Jai Prakash Perumal
 wrote:

> I noticed that https://issues.apache.org/jira/browse/TEZ-3860 is present
> in
> 0.9 branch but not part of 0.10.0. Is this intentional? TEZ-4223 is
> dependent on TEZ-3860, if we do not want this in 0.10.0 we can skip
> TEZ-4223.
>
> On Tue, Aug 11, 2020 at 7:09 PM Harish Jai Prakash Perumal <
> h...@cloudera.com>
> wrote:
>
> > This is also a regression and would be good to have:
> > https://issues.apache.org/jira/browse/TEZ-4223
> >
> > On Tue, Aug 11, 2020 at 2:14 PM Rajesh Balamohan 
> > wrote:
> >
> >> It would be good to add the following tickets
> >>
> >> https://issues.apache.org/jira/browse/TEZ-4208
> >> https://issues.apache.org/jira/browse/TEZ-3645
> >> https://issues.apache.org/jira/browse/TEZ-4216
> >> https://issues.apache.org/jira/browse/TEZ-4199
> >>
> >>
> >> ~Rajesh.B
> >>
> >> On Tue, Aug 11, 2020 at 1:14 PM László Bodor  >
> >> wrote:
> >>
> >> > Hi!
> >> >
> >> > JIRAS NEEDED:
> >> > 1 known open regression at the moment, which (I'm aware of and) might
> be
> >> > worth waiting for: https://issues.apache.org/jira/browse/TEZ-4213
> >> >
> >> > RELEASE MANAGER:
> >> > I'm interested in doing this!
> >> >
> >> > Regards,
> >> > Laszlo Bodor
> >> >
> >> > On Mon, 10 Aug 2020 at 16:17, Jonathan Eagles 
> >> wrote:
> >> >
> >> > > JIRAS NEEDED
> >> > > Please give feedback for Tez 0.10.0 release planning. Specifically,
> >> let's
> >> > > propose jiras required for the release as well as the timing for the
> >> > > release. Let's keep this discussion open for the week and reply with
> >> JIRA
> >> > > proposals.
> >> > >
> >> > > RELEASE MANAGER
> >> > > If anyone is wanting to volunteer for 0.10.0 release manager, please
> >> let
> >> > me
> >> > > know. The process is documented, but having someone walk through the
> >> > > process the first time is best.
> >> > >
> >> > > Jon Eagles
> >> > > Tez PMC Chair
> >> > >
> >> >
> >>
> >
>


Re: [DISCUSS] Tez 0.10.0 Release Planning

2020-08-14 Thread Jonathan Eagles
Thank you, devs, for the feedback and discussion, and thank you László Bodor
for volunteering to be the 0.10.0 release manager. Let's track the jiras
above to completion and cut a release. The official process is documented
on our wiki and I can help guide as needed.

https://cwiki.apache.org/confluence/display/TEZ/Making+a+TEZ+Release

Jon

On Thu, Aug 13, 2020 at 12:44 AM Harish Jai Prakash Perumal
 wrote:

> I noticed that https://issues.apache.org/jira/browse/TEZ-3860 is present
> in
> 0.9 branch but not part of 0.10.0. Is this intentional? TEZ-4223 is
> dependent on TEZ-3860, if we do not want this in 0.10.0 we can skip
> TEZ-4223.
>
> On Tue, Aug 11, 2020 at 7:09 PM Harish Jai Prakash Perumal <
> h...@cloudera.com>
> wrote:
>
> > This is also a regression and would be good to have:
> > https://issues.apache.org/jira/browse/TEZ-4223
> >
> > On Tue, Aug 11, 2020 at 2:14 PM Rajesh Balamohan 
> > wrote:
> >
> >> It would be good to add the following tickets
> >>
> >> https://issues.apache.org/jira/browse/TEZ-4208
> >> https://issues.apache.org/jira/browse/TEZ-3645
> >> https://issues.apache.org/jira/browse/TEZ-4216
> >> https://issues.apache.org/jira/browse/TEZ-4199
> >>
> >>
> >> ~Rajesh.B
> >>
> >> On Tue, Aug 11, 2020 at 1:14 PM László Bodor  >
> >> wrote:
> >>
> >> > Hi!
> >> >
> >> > JIRAS NEEDED:
> >> > 1 known open regression at the moment, which (I'm aware of and) might
> be
> >> > worth waiting for: https://issues.apache.org/jira/browse/TEZ-4213
> >> >
> >> > RELEASE MANAGER:
> >> > I'm interested in doing this!
> >> >
> >> > Regards,
> >> > Laszlo Bodor
> >> >
> >> > On Mon, 10 Aug 2020 at 16:17, Jonathan Eagles 
> >> wrote:
> >> >
> >> > > JIRAS NEEDED
> >> > > Please give feedback for Tez 0.10.0 release planning. Specifically,
> >> let's
> >> > > propose jiras required for the release as well as the timing for the
> >> > > release. Let's keep this discussion open for the week and reply with
> >> JIRA
> >> > > proposals.
> >> > >
> >> > > RELEASE MANAGER
> >> > > If anyone is wanting to volunteer for 0.10.0 release manager, please
> >> let
> >> > me
> >> > > know. The process is documented, but having someone walk through the
> >> > > process the first time is best.
> >> > >
> >> > > Jon Eagles
> >> > > Tez PMC Chair
> >> > >
> >> >
> >>
> >
>


[DISCUSS] Tez 0.10.0 Release Planning

2020-08-10 Thread Jonathan Eagles
JIRAS NEEDED
Please give feedback for Tez 0.10.0 release planning. Specifically, let's
propose jiras required for the release as well as the timing for the
release. Let's keep this discussion open for the week and reply with JIRA
proposals.

RELEASE MANAGER
If anyone wants to volunteer as the 0.10.0 release manager, please let me
know. The process is documented, but having someone walk you through the
process the first time is best.

Jon Eagles
Tez PMC Chair


[DISCUSS] Tez 0.9.3 Release Planning

2020-08-10 Thread Jonathan Eagles
JIRAS NEEDED
Please give feedback for Tez 0.9.3 release planning. Specifically, let's
propose jiras required for the release as well as the timing for the
release. Let's keep this discussion open for the week and reply with JIRA
proposals.

RELEASE MANAGER
If anyone wants to volunteer as the 0.9.3 release manager, please let me
know. The process is documented, but having someone walk you through the
process the first time is best.

Jon Eagles
Tez PMC Chair


[DISCUSS] A more inclusive community - Rename master branch to trunk

2020-08-05 Thread Jonathan Eagles
A step towards inclusion has been made.

I have filed https://issues.apache.org/jira/browse/TEZ-4217 as an umbrella
jira to contain sub-tasks of work that will move the Tez community
towards inclusivity. As a first pass, I have requested the removal of
whitelist/blacklist, master/slave terminology, and added a task towards
enforcing inclusive language policies. I expect that umbrella to remain
open to reflect that the work is a journey and is never done.

To that end, I propose renaming the master branch to trunk. The process for
this change is 1) create a new branch trunk based on the master branch,
2) modify build configuration (including default branch in YETUS), 3)
update and publish documentation/website, and 4) notify INFRA to switch.
The master branch will be EOL'd, but will remain in place to allow external
build systems to continue using it until migration to trunk is complete.

Regards,
Jon Eagles
Tez PMC Chair


[NOTICE] Tez Snapshots builds fixed for 0.9.x and 0.10.x release lines

2020-07-29 Thread Jonathan Eagles
When migrating to the yetus build environment to overcome build
incompatibility with OS upgrades in Jenkins, snapshot builds were not
migrated and have been failing for some time. I have written a small yetus
plugin that adds maven snapshots deploy (as that capability was not
provided by yetus 0.12.0).

https://builds.apache.org/job/Tez-qbt-0.9-snapshots/
https://builds.apache.org/job/Tez-qbt-0.10-snapshots/

I have updated the Tez Jenkins view to show these new builds as part of the
Tez project.

https://builds.apache.org/view/S-Z/view/Tez/

As this is new work, there are likely issues. Please ping the dev list or
reach out to me about any issues.

Jon Eagles
Tez PMC Chair


Re: tez 0.10 SNAPSHOT artifacts

2020-07-27 Thread Jonathan Eagles
I have uploaded a new set of 0.10.1 snapshots with
version 0.10.1-20200727.165306-21. This will need to be a manual step for
now until I finish getting the automatic snapshot build working again.
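
To compare snapshot builds, here is a minimal Python sketch (not part of the
Tez tooling; the helper name parse_snapshot_version is hypothetical). It
assumes the usual Maven timestamped snapshot layout of
base-yyyyMMdd.HHmmss-buildNumber and checks that the newly uploaded version
is newer than the stale one quoted below.

```python
from datetime import datetime
from typing import NamedTuple


class SnapshotVersion(NamedTuple):
    base: str            # e.g. "0.10.1"
    timestamp: datetime  # deploy time encoded in the version string
    build: int           # Maven snapshot build number


def parse_snapshot_version(version: str) -> SnapshotVersion:
    """Split '<base>-<yyyyMMdd.HHmmss>-<build>' into its three parts."""
    base, stamp, build = version.rsplit("-", 2)
    return SnapshotVersion(base, datetime.strptime(stamp, "%Y%m%d.%H%M%S"), int(build))


new = parse_snapshot_version("0.10.1-20200727.165306-21")
old = parse_snapshot_version("0.10.1-20190426.191641-20")
print(new.timestamp > old.timestamp, new.build - old.build)
```

Both version strings are taken from this thread; everything else is
illustrative.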

On Mon, Jul 27, 2020 at 9:46 AM László Bodor 
wrote:

> Hi!
>
> I'm wondering what's the policy of generating tez SNAPSHOT artifacts.
>
> Seems like the "current" tez jar doesn't contain some changes:
>
> https://repository.apache.org/snapshots/org/apache/tez/tez-runtime-library/0.10.1-SNAPSHOT/tez-runtime-library-0.10.1-20190426.191641-20.jar
>
> I need to see and compile against TezSpillRecord changes introduced in
> TEZ-4145, but after extracting TezSpillRecord.class I get this for javap:
>
> javap TezSpillRecord.class
> Compiled from "TezSpillRecord.java"
> ...
>   public org.apache.tez.runtime.library.common.sort.impl.TezSpillRecord(int);
>   public org.apache.tez.runtime.library.common.sort.impl.TezSpillRecord(org.apache.hadoop.fs.Path, org.apache.hadoop.conf.Configuration) throws java.io.IOException;
>   public org.apache.tez.runtime.library.common.sort.impl.TezSpillRecord(org.apache.hadoop.fs.Path, org.apache.hadoop.conf.Configuration, java.lang.String) throws java.io.IOException;
>   public org.apache.tez.runtime.library.common.sort.impl.TezSpillRecord(org.apache.hadoop.fs.Path, org.apache.hadoop.conf.Configuration, java.util.zip.Checksum, java.lang.String) throws java.io.IOException;
> ...
> }
>
> I'm expecting the new constructor to be here (Path, FileSystem, String):
>
> https://github.com/apache/tez/commit/7dbec63e1f97eea95ab998e16ffcd592ff6be332#diff-358f7c423b64e350b9cae8462c24de20R60
>
> How could I proceed with this?
> I would also be happy to use another repo for newer snapshots, if that
> makes sense.
> Currently, without settings.xml, maven downloads the above jar from
> repository.apache.org/snapshots:
>
> Downloading from repository-release:
>
> https://repository.apache.org/content/repositories/releases/org/apache/tez/tez-runtime-library/0.10.1-SNAPSHOT/maven-metadata.xml
> Downloading from apache.snapshots:
>
> https://repository.apache.org/snapshots/org/apache/tez/tez-runtime-library/0.10.1-SNAPSHOT/maven-metadata.xml
> Downloading from central:
>
> https://repo.maven.apache.org/maven2/org/apache/tez/tez-runtime-library/0.10.1-SNAPSHOT/maven-metadata.xml
> Downloaded from apache.snapshots:
>
> https://repository.apache.org/snapshots/org/apache/tez/tez-runtime-library/0.10.1-SNAPSHOT/maven-metadata.xml
> (1.0 kB at 3.7 kB/s)
> Downloading from central:
>
> https://repo.maven.apache.org/maven2/org/apache/tez/tez-runtime-library/0.10.1-SNAPSHOT/tez-runtime-library-0.10.1-20190426.191641-20.pom
> Downloading from repository-release:
>
> https://repository.apache.org/content/repositories/releases/org/apache/tez/tez-runtime-library/0.10.1-SNAPSHOT/tez-runtime-library-0.10.1-20190426.191641-20.pom
> Downloading from apache.snapshots:
>
> https://repository.apache.org/snapshots/org/apache/tez/tez-runtime-library/0.10.1-SNAPSHOT/tez-runtime-library-0.10.1-20190426.191641-20.pom
> Downloaded from apache.snapshots:
>
> https://repository.apache.org/snapshots/org/apache/tez/tez-runtime-library/0.10.1-SNAPSHOT/tez-runtime-library-0.10.1-20190426.191641-20.pom
> (5.7 kB at 33 kB/s)
> Downloading from apache.snapshots:
>
> https://repository.apache.org/snapshots/org/apache/tez/tez-runtime-library/0.10.1-SNAPSHOT/tez-runtime-library-0.10.1-20190426.191641-20.jar
> Downloaded from apache.snapshots:
>
> https://repository.apache.org/snapshots/org/apache/tez/tez-runtime-library/0.10.1-SNAPSHOT/tez-runtime-library-0.10.1-20190426.191641-20.jar
> (779 kB at 2.6 MB/s)
>
>
> Regards,
> Laszlo Bodor
>


Re: tez 0.10 SNAPSHOT artifacts

2020-07-27 Thread Jonathan Eagles
The snapshot build apparently stopped working a long time ago, when the
build machines' OS was updated. The migration to yetus for snapshot builds
was forgotten. I'll see if it can be easily fixed. Otherwise, I'll push a
new snapshot artifact manually for this release branch.

https://builds.apache.org/job/Tez-Build/



[NOTICE] Tez Pre-Commit builds upgraded to 0.12.0, Docker to ubuntu:bionic

2020-07-23 Thread Jonathan Eagles
To address some ongoing issues with the Tez Pre-Commit build and JIRA
integration, yetus was upgraded to 0.12.0. So far the builds look good. In
addition, the docker image was upgraded to ubuntu:bionic. This has some
implications regarding python defaulting to 3+ and maven defaulting to
3.6.*+. Please let me know of any new issues with this upgrade.

Thanks to Mustafa Iman for contributions to identifying issues and fixes.

Jon Eagles
Tez PPMC


[DISCUSS] Addressing open jiras

2020-07-22 Thread Jonathan Eagles
As of today there are 1060 open TEZ project jiras spanning our 7 year
history. This is an unmanageable number from my perspective. Some of these
represent customer requests, and some have grown stale and just need to be
closed.

I would like to hear ideas of how to reduce these numbers. Perhaps a bug
bash, or a way to distribute the load for analysis. But I want to hear from
you.

https://issues.apache.org/jira/issues/?jql=project%20%3D%20TEZ%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20updated%20DESC%2C%20priority%20DESC

More significantly, 84 of these are in the Patch Available status. This
represents value to the Tez project that hasn't been realized. It is most
important to drive this to zero. Contributions are valued and we should make
sure these are top priority. We could set up a goal of reviewing a few per
week. Ideas?
https://issues.apache.org/jira/issues/?jql=project%20%3D%20TEZ%20AND%20status%20%3D%20%22Patch%20Available%22%20
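
For anyone who wants to adapt these queries, here is a small Python sketch
that decodes the percent-encoded JQL in the link above and encodes a modified
query the same way. The helper names decode_jql and encode_jql are
hypothetical; only the URL itself comes from this mail.

```python
from urllib.parse import quote, unquote

PATCH_AVAILABLE_URL = (
    "https://issues.apache.org/jira/issues/?jql="
    "project%20%3D%20TEZ%20AND%20status%20%3D%20%22Patch%20Available%22%20"
)


def decode_jql(url: str) -> str:
    """Return the human-readable JQL carried in the ?jql= parameter."""
    encoded = url.split("?jql=", 1)[1]
    return unquote(encoded).strip()


def encode_jql(jql: str) -> str:
    """Percent-encode a JQL string (spaces, '=', quotes and commas included)."""
    return quote(jql, safe="")


jql = decode_jql(PATCH_AVAILABLE_URL)
print(jql)
print(encode_jql(jql + " ORDER BY updated DESC"))
```

Appending an ORDER BY clause is only an example of a change one might make
before splitting the list among reviewers.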

Furthermore,

We set up a Tez Slack some time ago to ping developers for review.
apache-tez.slack.com

We should either retire this or use this resource to help coordinate the
search for reviewers on jiras.

Let me know your thoughts.
Jon Eagles
Tez PPMC


[jira] [Created] (TEZ-4079) Limit the size of broadcast data

2019-07-23 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4079:


 Summary: Limit the size of broadcast data
 Key: TEZ-4079
 URL: https://issues.apache.org/jira/browse/TEZ-4079
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Jonathan Eagles






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (TEZ-4068) Prevent new speculative attempt after task has issued canCommit to an attempt

2019-05-10 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-4068.
--
   Resolution: Fixed
Fix Version/s: 0.9.3
   0.10.1

Thanks [~Chyler] for patch and [~yingdachen] for review. Committed patch to 
master and branch-0.9.

> Prevent new speculative attempt after task has issued canCommit to an attempt
> -
>
> Key: TEZ-4068
> URL: https://issues.apache.org/jira/browse/TEZ-4068
> Project: Apache Tez
>  Issue Type: Improvement
>    Reporter: Jonathan Eagles
>Assignee: Ying Han
>Priority: Major
> Fix For: 0.10.1, 0.9.3
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When a running attempt calls TaskImpl#canCommit through the taskUmbilical, 
> the TaskImpl will issue a "go" if it is the first attempt to do so. Otherwise 
> it will issue a "no-go". After commitAttempt is assigned in TaskImpl, no 
> other attempt is allowed to succeed at that point. So a speculative attempt 
> that is launched after commitAttempt is assigned can never finish before 
> the original since it will always be given a "no-go" in the canCommit 
> response. In this jira, I propose to discuss disabling speculative attempts 
> after commitAttempt has been assigned.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (TEZ-4068) Prevent new speculative attempt after task has issued canCommit to an attempt

2019-05-08 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4068:


 Summary: Prevent new speculative attempt after task has issued 
canCommit to an attempt
 Key: TEZ-4068
 URL: https://issues.apache.org/jira/browse/TEZ-4068
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Jonathan Eagles


When a running attempt calls TaskImpl#canCommit through the taskUmbilical, the 
TaskImpl will issue a "go" if it is the first attempt to do so. Otherwise it 
will issue a "no-go". After commitAttempt is assigned in TaskImpl, no other 
attempt is allowed to succeed at that point. So a speculative attempt that is 
launched after commitAttempt is assigned can never finish before the original 
since it will always be given a "no-go" in the canCommit response. In this 
jira, I propose to discuss disabling speculative attempts after commitAttempt 
has been assigned.
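
The arbitration described above can be illustrated with a toy Python model.
This is a sketch only, not the Tez Java implementation: the class TaskModel
and the method should_speculate are hypothetical names, and it assumes that
the attempt holding the commit keeps getting a "go" if it asks again.

```python
from typing import Optional


class TaskModel:
    """Toy model of the commit arbitration described above (not Tez code)."""

    def __init__(self) -> None:
        self.commit_attempt: Optional[str] = None

    def can_commit(self, attempt_id: str) -> bool:
        """The first attempt to ask gets a "go"; any other attempt gets a "no-go"."""
        if self.commit_attempt is None:
            self.commit_attempt = attempt_id
        return self.commit_attempt == attempt_id

    def should_speculate(self) -> bool:
        """Proposed rule: no new speculative attempt once commit_attempt is set."""
        return self.commit_attempt is None


task = TaskModel()
before = task.should_speculate()         # nobody has asked to commit yet
original = task.can_commit("attempt_0")  # the original attempt asks first
late = task.can_commit("attempt_1")      # a speculative attempt asks later
after = task.should_speculate()
print(before, original, late, after)
```

In this model the late attempt can never win, which is why launching it after
the commit attempt is assigned only wastes resources.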



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (TEZ-746) DAG and Vertex commit should ensure that all tasks have completed before committing

2019-05-07 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-746.
-
Resolution: Duplicate

> DAG and Vertex commit should ensure that all tasks have completed before 
> committing
> ---
>
> Key: TEZ-746
> URL: https://issues.apache.org/jira/browse/TEZ-746
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Major
>
> There may be a race when a task (say a speculative task) is still running. 
> Then the committer commits (say lists & cleans some files in the output dir). 
> Then that task writes some temporary files in the output dir and gets killed. 
> Those temp files may remain behind.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (TEZ-4066) Upgrade servlet-api from 2.5 to 3.1.0

2019-05-03 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4066:


 Summary: Upgrade servlet-api from 2.5 to 3.1.0
 Key: TEZ-4066
 URL: https://issues.apache.org/jira/browse/TEZ-4066
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


Oozie launcher jobs trying to launch Tez jobs now fail to render the Oozie 
Launcher Job AM because both the 2.5 (from tez) and 3.1.0 (from hadoop) 
servlet-api are on the classpath. Tez should sync with the servlet-api version 
from the tez master branch, which only supports hadoop 3+.

{code}
2019-04-30 14:53:02,747 WARN [qtp1213419524-119] org.eclipse.jetty.server.HttpChannel:
java.lang.NoSuchMethodError: javax.servlet.http.HttpServletRequest.isAsyncStarted()Z
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:688)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:748)
{code}
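
One way to confirm this kind of conflict is to list every jar on the classpath
that ships the class named in the NoSuchMethodError. The Python sketch below
does that; the function name jars_containing is hypothetical and this is only
a diagnostic aid, not the fix applied in Tez.

```python
import zipfile
from typing import Iterable, List


def jars_containing(class_name: str, jar_paths: Iterable[str]) -> List[str]:
    """Return the jars that ship the given fully qualified class."""
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for path in jar_paths:
        # A jar is a zip archive, so its entries can be listed directly.
        with zipfile.ZipFile(path) as jar:
            if entry in jar.namelist():
                hits.append(path)
    return hits
```

Calling it with "javax.servlet.http.HttpServletRequest" and the jars of a
launcher classpath should return more than one jar when both servlet-api
versions are present; which jars those are depends on the deployment.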



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (TEZ-4065) Yetus build fails on trunk due to relying on snapshot dependencies

2019-04-24 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4065:


 Summary: Yetus build fails on trunk due to relying on snapshot 
dependencies
 Key: TEZ-4065
 URL: https://issues.apache.org/jira/browse/TEZ-4065
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles


As noted in TEZ-4062, the Yetus Tez build fails for the provided patch: the 
very first maven build step fails.

{code}
cd /testptch/tez/tez-dag
/usr/bin/mvn --batch-mode 
-Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/yetus-m2/tez-master-patch-0
 -fae clean install -DskipTests=true -Dmaven.javadoc.skip=true 
-Dcheckstyle.skip=true -Dfindbugs.skip=true > 
/testptch/patchprocess/branch-mvninstall-tez-dag.txt 2>&1
Elapsed:   3m 19s

tez-dag in master failed.
{code}

The cause is that there is no top-level install step for master. Instead, it 
tries to download Tez snapshot dependencies, which are out of date. 

How do I convince Yetus to do a top-level build like PreCommit-YARN?
In a similar YARN build, the first build step installs at the top level.
https://builds.apache.org/job/PreCommit-YARN-Build/24016/consoleText
{code}
cd /testptch/hadoop
/usr/bin/mvn --batch-mode 
-Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/yetus-m2/hadoop-trunk-patch-1
 -Ptest-patch -DskipTests -fae clean install -DskipTests=true 
-Dmaven.javadoc.skip=true -Dcheckstyle.skip=true -Dfindbugs.skip=true > 
/testptch/patchprocess/branch-mvninstall-root.txt 2>&1
Elapsed:  18m 20s
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Question regarding released 0.10.X

2019-04-22 Thread Jonathan Eagles
I'll try to get that information for you tomorrow. Thanks for bringing
this to my attention. I'll talk to the branch release committer and
get back to you.

Jon

On Thu, Apr 18, 2019 at 8:15 AM  wrote:
>
> Hi, all. It seems that Tez 0.10.0 has been abandoned, since the master branch 
> switched to 0.10.1. Does anybody have any idea when 0.10.X, the very first 
> Tez on Hadoop 3+, could be released?
>
> Best regards,
> Georgii Zemlianyi


[jira] [Created] (TEZ-4064) Integrate Tez with Github

2019-04-22 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4064:


 Summary: Integrate Tez with Github
 Key: TEZ-4064
 URL: https://issues.apache.org/jira/browse/TEZ-4064
 Project: Apache Tez
  Issue Type: New Feature
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


According to HADOOP-16035, steps are as follows.
- an account that can read Github
- Apache Yetus 0.9.0+
- a Jenkinsfile that uses the above



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Release Apache Tez-0.9.2 RC0

2019-03-26 Thread Jonathan Eagles
+1. I have validated this release and signatures.

On Tue, Mar 19, 2019 at 6:12 PM Kuhu Shukla  wrote:
>
> Hello Tez folks,
>
> I have created a tez-0.9.2 release candidate rc0.
>
> Git Source Tag:
>
> https://git-wip-us.apache.org/repos/asf/tez/repo?p=tez.git;a=log;h=refs/tags/release-0.9.2-rc0
> 
> Staging site :
>
> https://dist.apache.org/repos/dist/dev/tez/apache-tez-0.9.2-rc0/
> 
>
> Nexus Staging URL :
>
> https://repository.apache.org/content/repositories/orgapachetez-1065
>
> PGP release keys (signed using ) :
>
> http://pgp.surfnet.nl/pks/lookup?op=get=0x4405B74BAAFFE291
>
> KEYS file available at :
>
> https://dist.apache.org/repos/dist/release/tez/KEYS
>
> One can look into the issues fixed in this release at:
>
> https://issues.apache.org/jira/projects/TEZ/versions/12342390
>
>
> Vote will be open for at least 72 hours or until the required number of PMC
> votes are obtained. Please reply to this thread for any
> issues/comments/concerns.
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Here is my +1 (binding).
>
> Thanks and Regards,
>
> Kuhu Shukla


[jira] [Created] (TEZ-4052) Fit dot files ASF License issues - part 2

2019-03-08 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4052:


 Summary: Fit dot files ASF License issues - part 2
 Key: TEZ-4052
 URL: https://issues.apache.org/jira/browse/TEZ-4052
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


Continuing the effort in TEZ-3995.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (TEZ-4046) insert overwrite table without data

2019-03-05 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-4046.
--
Resolution: Cannot Reproduce

[~jipeng], closing this issue. Please direct questions to the 
proper user lists.

> insert overwrite table without data
> ---
>
> Key: TEZ-4046
> URL: https://issues.apache.org/jira/browse/TEZ-4046
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.4
>Reporter: zengjipeng
>Priority: Major
>
> Using Hive on Tez.
> After executing the statements below, the target table tmp.zjp_b has no data.
>  
> {code:java}
> create table tmp.zjp_a(a string,b string);
> insert into tmp.zjp_a select "a1","1,2,3";
> create table tmp.zjp_b(name string,age string) partitioned by(dt string);
> insert overwrite table tmp.zjp_b partition(dt='2019-02-26')
> select a,age from tmp.zjp_a lateral view explode(split(b,",")) tb as age 
> where age='1'
> union all
> select a,age from tmp.zjp_a lateral view explode(split(b,",")) tb as age 
> where age='2';
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (TEZ-2884) Allow javadocs to be generated with Java8

2019-03-01 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-2884.
--
Resolution: Duplicate

> Allow javadocs to be generated with Java8
> -
>
> Key: TEZ-2884
> URL: https://issues.apache.org/jira/browse/TEZ-2884
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Major
>
> Java 8 introduces stricter javadoc checks, which causes javadoc generation to 
> fail.
> Allow javadocs to be generated, while we fix the actual issues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (TEZ-3205) Update tez poms to remove hadoop 2.2/2.4 profiles

2019-03-01 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-3205.
--
Resolution: Invalid

Closing now that these profiles are no longer valid and have already been 
removed

> Update tez poms to remove hadoop 2.2/2.4 profiles
> -
>
> Key: TEZ-3205
> URL: https://issues.apache.org/jira/browse/TEZ-3205
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Priority: Major
>  Labels: newbie
> Attachments: TEZ-3205.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (TEZ-3704) Tez-UI unit test failing

2019-02-28 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-3704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-3704.
--
Resolution: Cannot Reproduce

> Tez-UI unit test failing
> 
>
> Key: TEZ-3704
> URL: https://issues.apache.org/jira/browse/TEZ-3704
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Yesha Vora
>Priority: Major
>
> tez-ui unit test is failing as below.
> {code}
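> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO] 
> [INFO] tez ................................................ SUCCESS [  1.171 s]
> [INFO] tez-api ............................................ SUCCESS [ 25.416 s]
> [INFO] tez-common ......................................... SUCCESS [  0.156 s]
> [INFO] tez-runtime-internals .............................. SUCCESS [  0.812 s]
> [INFO] tez-runtime-library ................................ SUCCESS [  1.190 s]
> [INFO] tez-mapreduce ...................................... SUCCESS [  2.787 s]
> [INFO] tez-examples ....................................... SUCCESS [  0.127 s]
> [INFO] tez-dag ............................................ SUCCESS [  4.707 s]
> [INFO] tez-tests .......................................... SUCCESS [  7.205 s]
> [INFO] tez-ui ............................................. FAILURE [01:29 min]
> [INFO] tez-plugins ........................................ SKIPPED
> [INFO] tez-yarn-timeline-history .......................... SKIPPED
> [INFO] tez-history-parser ................................. SKIPPED
> [INFO] tez-yarn-timeline-history-with-acls ................ SKIPPED
> [INFO] tez-yarn-timeline-cache-plugin ..................... SKIPPED
> [INFO] tez-yarn-timeline-history-with-fs .................. SKIPPED
> [INFO] tez-tools .......................................... SKIPPED
> [INFO] tez-perf-analyzer .................................. SKIPPED
> [INFO] tez-job-analyzer ................................... SKIPPED
> [INFO] tez-dist ........................................... SKIPPED
> [INFO] Tez ................................................ SKIPPED
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 02:14 min
> [INFO] Finished at: 2017-04-19T19:31:02+00:00
> [INFO] Final Memory: 51M/885M
> [INFO] ------------------------------------------------------------------------
> [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.3.2:exec (ember test) on project tez-ui: Command execution failed. Process exited with an error: 1 (Exit value: 1) -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please read the following articles:
> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> [ERROR] 
> [ERROR] After correcting the problems, you can resume the build with the command
> [ERROR]   mvn  -rf :tez-ui{code}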



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (TEZ-2230) Speculative attempt should not have the original attempts machine in its preferred locations

2019-02-28 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-2230.
--
Resolution: Duplicate

> Speculative attempt should not have the original attempts machine in its 
> preferred locations
> 
>
> Key: TEZ-2230
> URL: https://issues.apache.org/jira/browse/TEZ-2230
> Project: Apache Tez
>  Issue Type: Sub-task
>Affects Versions: 0.6.0
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (TEZ-3928) [Umbrella] Hadoop 3 test failures

2019-02-28 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-3928.
--
Resolution: Fixed

> [Umbrella] Hadoop 3 test failures
> -
>
> Key: TEZ-3928
> URL: https://issues.apache.org/jira/browse/TEZ-3928
> Project: Apache Tez
>  Issue Type: Bug
>    Reporter: Jonathan Eagles
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (TEZ-4049) Fix findbugs issues in NotRunningJob

2019-02-27 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4049:


 Summary: Fix findbugs issues in NotRunningJob
 Key: TEZ-4049
 URL: https://issues.apache.org/jira/browse/TEZ-4049
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


Introduced by TEZ-4035. Remove fixes while keeping 3.2.0 api compatibility. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (TEZ-4047) Tez trademark in xml is causing xml parsing issue

2019-02-27 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4047:


 Summary: Tez trademark in xml is causing xml parsing issue
 Key: TEZ-4047
 URL: https://issues.apache.org/jira/browse/TEZ-4047
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


{code}

docs/src/site/site.xml: [Fatal Error] site.xml:97:34: The entity "reg" was referenced, but not declared.
java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: file:/testptch/tez/./docs/src/site/site.xml; lineNumber: 97; columnNumber: 34; The entity "reg" was referenced, but not declared.
at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:397)
at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:449)
at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:406)
at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:402)
at jdk.nashorn.api.scripting.NashornScriptEngine.eval(NashornScriptEngine.java:155)
at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:264)
at com.sun.tools.script.shell.Main.evaluateString(Main.java:298)
at com.sun.tools.script.shell.Main.evaluateString(Main.java:319)
at com.sun.tools.script.shell.Main.access$300(Main.java:37)
at com.sun.tools.script.shell.Main$3.run(Main.java:217)
at com.sun.tools.script.shell.Main.main(Main.java:48)
Caused by: org.xml.sax.SAXParseException; systemId: file:/testptch/tez/./docs/src/site/site.xml; lineNumber: 97; columnNumber: 34; The entity "reg" was referenced, but not declared.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
at jdk.nashorn.internal.scripts.Script$Recompilation$2$19313A$\^system_init\_.XMLDocument(:747)
at jdk.nashorn.internal.scripts.Script$1$\^string\_.:program(:1)
at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:637)
at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:494)
at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:393)
... 10 more

{code}

Output from xmllint confirms the XML issue as well.

{code}

xmllint .docs/src/site/site.xml
.//src/site/site.xml:97: parser error : Entity 'reg' not defined
 http://tez.apache.org/"/>
 ^
.//src/site/site.xml:123: parser error : Entity 'reg' not defined
 

{code}
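
The failure can be reproduced outside the build with a short Python sketch.
XML predefines only five named entities (lt, gt, amp, apos, quot), so an
HTML-style reg entity is rejected unless it is declared. The element and
attribute names below are hypothetical, not the real site.xml content, and
the numeric character reference is just one possible replacement.

```python
import xml.etree.ElementTree as ET

BROKEN = '<project name="Apache Tez&reg;"/>'   # HTML-only named entity
FIXED = '<project name="Apache Tez&#174;"/>'   # numeric character reference


def parses(xml_text: str) -> bool:
    """Return True if a non-validating parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


broken_ok = parses(BROKEN)
fixed_ok = parses(FIXED)
name = ET.fromstring(FIXED).get("name")
print(broken_ok, fixed_ok, name)
```

The numeric reference keeps the registered-trademark sign in the rendered
name while staying valid for xmllint and the Java parser alike.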



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (TEZ-4023) TestExtServicesWithLocalMode with NPE due to USER environment being null in Docker

2019-02-15 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-4023.
--
Resolution: Duplicate

> TestExtServicesWithLocalMode with NPE due to USER environment being null in 
> Docker
> --
>
> Key: TEZ-4023
> URL: https://issues.apache.org/jira/browse/TEZ-4023
> Project: Apache Tez
>  Issue Type: Bug
>        Reporter: Jonathan Eagles
>Priority: Major
>






[jira] [Resolved] (TEZ-712) VertexManagers should be created in the DAG itself

2019-02-15 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-712.
-
Resolution: Won't Fix

Closing this jira to prioritize tez jiras. Please reopen if needed.

> VertexManagers should be created in the DAG itself
> --
>
> Key: TEZ-712
> URL: https://issues.apache.org/jira/browse/TEZ-712
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Priority: Major
>
> The VertexManager is currently created inside the Vertex - during 
> Initialization. This can be setup earlier, likely in the DAG itself.





[jira] [Resolved] (TEZ-564) Add history messages for Containers

2019-02-15 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-564.
-
Resolution: Done

> Add history messages for Containers
> ---
>
> Key: TEZ-564
> URL: https://issues.apache.org/jira/browse/TEZ-564
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Major
>
> Information like when a container was allocated, launched, individual task 
> assignments and completions.





[jira] [Resolved] (TEZ-550) Allow re-launching the child JVM in between dags

2019-02-15 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-550.
-
Resolution: Won't Fix

Closing to focus tez priorities. Please reopen if needed.

> Allow re-launching the child JVM in between dags
> 
>
> Key: TEZ-550
> URL: https://issues.apache.org/jira/browse/TEZ-550
> Project: Apache Tez
>  Issue Type: New Feature
>Reporter: Bikas Saha
>Priority: Major
>
> Child JVM may leak memory or threads or statics etc. So it may make sense to 
> use the idle time between dags to re-launch the JVMs.





[jira] [Resolved] (TEZ-505) Rename version in Events to attemptNumber

2019-02-15 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-505.
-
Resolution: Won't Fix

Closing this old jira to focus tez priorities. Please reopen if needed.

> Rename version in Events to attemptNumber
> -
>
> Key: TEZ-505
> URL: https://issues.apache.org/jira/browse/TEZ-505
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Siddharth Seth
>Priority: Major
>
> version represents different task attempts - this should be renamed to 
> reflect the same.





[jira] [Resolved] (TEZ-504) DataMovementEvent should not expose a constructor with targetIndex

2019-02-15 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-504.
-
Resolution: Won't Fix

Closing this issue to focus priorities. Reopen if still needed.

> DataMovementEvent should not expose a constructor with targetIndex
> --
>
> Key: TEZ-504
> URL: https://issues.apache.org/jira/browse/TEZ-504
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Siddharth Seth
>Priority: Major
>
> This constructor is not meant to be used by users - since the targetIndex is 
> set by the framework. Similarly for other events which may expose this.





[jira] [Resolved] (TEZ-325) Allow work units to be assigned to running tasks

2019-02-15 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-325.
-
Resolution: Won't Fix

This is a good idea, but I think it has stagnated. Closing this old jira.

> Allow work units to be assigned to running tasks
> 
>
> Key: TEZ-325
> URL: https://issues.apache.org/jira/browse/TEZ-325
> Project: Apache Tez
>  Issue Type: New Feature
>Reporter: Siddharth Seth
>Priority: Major
>
> An extension of container reuse. Instead of re-initializing new data 
> structures for each task run by a container - this is to allow a work unit to 
> be given to the task so that it can make use of existing structures.
> As an example - reusing sort buffers when running map tasks, Hive may need to 
> generate some vertex level information - which is the same across all tasks 
> in a vertex.





[jira] [Resolved] (TEZ-421) Allow users to specify YARN AuxiliaryService information

2019-02-15 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-421.
-
Resolution: Duplicate

This was fixed as part of Tez Shuffle Handler Auxiliary service.

> Allow users to specify YARN AuxiliaryService information
> 
>
> Key: TEZ-421
> URL: https://issues.apache.org/jira/browse/TEZ-421
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Priority: Major
>
> If a custom AuxiliaryService is used (other than ShuffleHandler), users need 
> to be able to provide the Service name as well as meta-information for the 
> service and its consumer.
> Likely to be an additional field per-edge for connection oriented 
> AuxiliaryServices, and per-vertex for non connection oriented 
> AuxiliaryServices.





[jira] [Resolved] (TEZ-60) Add a ContainerListener to the AM

2019-02-15 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-60.

Resolution: Won't Fix

Closing very old jiras

> Add a ContainerListener to the AM
> -
>
> Key: TEZ-60
> URL: https://issues.apache.org/jira/browse/TEZ-60
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Priority: Major
>
> The Container listener will be responsible for sending tasks to the Child. 
> Also for monitoring heartbeats from the container itself - useful in cases 
> where there's no child running.
> TaskUmbilical becomes purely task specific.





[jira] [Created] (TEZ-4042) Speculative attempts should avoid running on the same node

2019-02-13 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4042:


 Summary: Speculative attempts should avoid running on the same node
 Key: TEZ-4042
 URL: https://issues.apache.org/jira/browse/TEZ-4042
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles








[jira] [Created] (TEZ-4041) TestExtServicesWithLocalMode fails in docker

2019-02-13 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4041:


 Summary: TestExtServicesWithLocalMode fails in docker
 Key: TEZ-4041
 URL: https://issues.apache.org/jira/browse/TEZ-4041
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


{code}
2019-02-13 00:24:33,703 INFO  [DAGAppMaster Thread] service.AbstractService 
(AbstractService.java:noteFailure(267)) - Service 
org.apache.tez.dag.app.DAGAppMaster failed in state INITED
org.apache.tez.dag.api.TezUncheckedException: 
java.lang.reflect.InvocationTargetException
at 
org.apache.tez.dag.app.TaskCommunicatorManager.createCustomTaskCommunicator(TaskCommunicatorManager.java:215)
at 
org.apache.tez.dag.app.TaskCommunicatorManager.createTaskCommunicator(TaskCommunicatorManager.java:184)
at 
org.apache.tez.dag.app.TaskCommunicatorManager.(TaskCommunicatorManager.java:152)
at 
org.apache.tez.dag.app.DAGAppMaster.createTaskCommunicatorManager(DAGAppMaster.java:1088)
at 
org.apache.tez.dag.app.DAGAppMaster.serviceInit(DAGAppMaster.java:532)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.tez.dag.app.DAGAppMaster$9.run(DAGAppMaster.java:2606)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686)
at 
org.apache.tez.dag.app.DAGAppMaster.initAndStartAppMaster(DAGAppMaster.java:2603)
at org.apache.tez.client.LocalClient$1.run(LocalClient.java:327)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.tez.dag.app.TaskCommunicatorManager.createCustomTaskCommunicator(TaskCommunicatorManager.java:213)
... 12 more
Caused by: java.lang.NullPointerException
at 
org.apache.tez.test.service.rpc.TezTestServiceProtocolProtos$SubmitWorkRequestProto$Builder.setUser(TezTestServiceProtocolProtos.java:5549)
at 
org.apache.tez.dag.app.taskcomm.TezTestServiceTaskCommunicatorImpl.(TezTestServiceTaskCommunicatorImpl.java:65)
... 17 more
{code}
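
The trace above ends in a NullPointerException from a protobuf builder's setUser call, and the title of the related TEZ-4023 attributes it to the USER environment variable being null in Docker. A minimal sketch of a null-safe lookup, assuming that falling back to the JVM's user.name property is acceptable; UserNameResolver and resolveUser are hypothetical names, not Tez code, and this is not necessarily the fix that was applied:

```java
import java.util.Map;

// Hypothetical helper: never returns null, so it is safe to pass to a
// protobuf builder setter that rejects null values.
final class UserNameResolver {
    static String resolveUser(Map<String, String> env) {
        String user = env.get("USER");
        if (user == null || user.isEmpty()) {
            // Assumption: the JVM property is a reasonable stand-in.
            user = System.getProperty("user.name");
        }
        if (user == null || user.isEmpty()) {
            // Last resort so callers never see null.
            user = "unknown";
        }
        return user;
    }

    public static void main(String[] args) {
        System.out.println(resolveUser(System.getenv()));
    }
}
```

Passing the environment in as a map keeps the lookup testable without changing the real process environment.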





[jira] [Created] (TEZ-4040) Upgrade RoaringBitmap version to avoid NoSuchMethodError

2019-02-12 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4040:


 Summary: Upgrade RoaringBitmap version to avoid NoSuchMethodError
 Key: TEZ-4040
 URL: https://issues.apache.org/jira/browse/TEZ-4040
 Project: Apache Tez
  Issue Type: Task
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


A common request is to use the runOptimize function, which is present in later 
versions of RoaringBitmap.





[jira] [Created] (TEZ-4037) Add back DAG search status KILLED

2019-02-06 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4037:


 Summary: Add back DAG search status KILLED 
 Key: TEZ-4037
 URL: https://issues.apache.org/jira/browse/TEZ-4037
 Project: Apache Tez
  Issue Type: Task
  Components: UI
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


https://issues.apache.org/jira/browse/TEZ-2447 removed KILLED because a search 
on this status can sometimes fail to return all KILLED DAGs. This jira re-adds 
the KILLED DAG status search since it still has value; I would rather focus on 
fixing the DAGs that fail to write the killed status to the history log file.





Re: Scaling VertexInitializedEvent

2019-02-04 Thread Jonathan Eagles
Eric, could you post a stack trace of the error? We have had a few
bugs in the past that prevented messages over 64MB. This is only an
artificial limit based on the high-level reader and writer APIs that
are used. If a low-level API is used, messages of arbitrary length (or
at least up to 2GB IIRC) should be possible.

On Wed, Jan 30, 2019 at 2:47 PM Eric Goodman
 wrote:
>
> Hi Tez devs,
>
> The current design of VertexInitializedEvent contains all of the 
> InputDataInformationEvents for a particular vertex. We've had trouble 
> scaling this for large vertices as Protobuf limits message sizes to 64 MB. 
> I'm wondering if anyone is working on a more scalable solution, and if not, 
> if you guys have any suggestions for how to decompose this event into 
> smaller events so that Protobuf's message size limit is never an issue.
>
> Thanks,
> Eric
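
One way to picture the decomposition asked about above is to group already-serialized event payloads into batches that each stay under a byte budget. This is a minimal sketch under stated assumptions: EventBatcher and its batch method are hypothetical, payloads are plain byte arrays, and protobuf framing overhead is ignored. It is not how Tez handles VertexInitializedEvent.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: greedily packs payloads into batches whose total
// size does not exceed maxBatchBytes, preserving the original order.
final class EventBatcher {
    static List<List<byte[]>> batch(List<byte[]> payloads, long maxBatchBytes) {
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long currentBytes = 0;
        for (byte[] payload : payloads) {
            if (payload.length > maxBatchBytes) {
                throw new IllegalArgumentException("single payload exceeds the batch limit");
            }
            // Start a new batch when adding this payload would cross the limit.
            if (!current.isEmpty() && currentBytes + payload.length > maxBatchBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(payload);
            currentBytes += payload.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<byte[]> payloads = new ArrayList<>();
        payloads.add(new byte[4]);
        payloads.add(new byte[4]);
        payloads.add(new byte[4]);
        System.out.println(batch(payloads, 8).size());
    }
}
```

A real limit would need headroom below 64 MB for message framing and for any fields shared by every batch.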


[NOTICE] Move to gitbox Jan 14th 1900UTC

2019-01-09 Thread Jonathan Eagles
As per community discussion, I have filed a jira to have a scheduled
move to gitbox.
https://issues.apache.org/jira/browse/INFRA-17599

The INFRA team has confirmed a Monday, Jan 14th 1900UTC move to
gitbox. As part of this move, I have filed a TEZ jira to support the
migration (which is mainly to update the documentation)

https://issues.apache.org/jira/browse/TEZ-4031

Regards,
jeagles


[jira] [Created] (TEZ-4031) Support tez gitbox migration

2019-01-09 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4031:


 Summary: Support tez gitbox migration
 Key: TEZ-4031
 URL: https://issues.apache.org/jira/browse/TEZ-4031
 Project: Apache Tez
  Issue Type: Task
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles








Re: [DISCUSS] Early Move to gitbox

2019-01-09 Thread Jonathan Eagles
Thanks all. With all this positive feedback, I will file an INFRA
ticket to move this issue forward.

On Mon, Jan 7, 2019 at 2:55 PM Jason Lowe  wrote:
>
> Sorry for the late reply.  I'm +1 for getting this moved early.  It sounds
> like the mandatory move could be inconvenient and surprising.  Better to do
> this on our own terms.
>
> Jason
>
> On Fri, Dec 14, 2018 at 4:54 PM Jonathan Eagles  wrote:
>
> > The Apache Tez git repository is on the git-wip-us server, which will be
> > decommissioned.
> > Please discuss issues and the preferred timeline. I'll file a JIRA ticket
> > with INFRA to migrate to https://gitbox.apache.org/ and update the
> > documentation.
> >
> > According to ASF infra team, the timeframe is as follows:
> >
> > > - December 9th 2018 -> January 9th 2019: Voluntary (coordinated)
> > relocation
> > > - January 9th -> February 6th: Mandated (coordinated) relocation
> > > - February 7th: All remaining repositories are mass migrated.
> > > This timeline may change to accommodate various scenarios.
> >
> > If we get consensus by January 9th, I can file a ticket with INFRA and
> > migrate it.
> > Even if we cannot get consensus, the repository will be migrated by
> > February 7th.
> >
> > Regards,
> > jeagles
> >


Re: [DRAFT][REPORT] Tez - Jan 2019

2019-01-07 Thread Jonathan Eagles
Sorry. Please ignore this message sent to the wrong email list

On Mon, Jan 7, 2019 at 5:11 PM Jonathan Eagles  wrote:
>
> Devs, please give feedback on the January 2019 report.
> Regards,
> jeagles
>
> ## Description:
>  - Apache Tez is an effort to develop a generic application framework which
> can be used to process arbitrarily complex directed-acyclic graphs (DAGs) of
> data-processing tasks and also a re-usable set of data-processing primitives
> which can be used by other projects.
>
> ## Issues:
>  - There are no issues requiring board attention at this time.
>
> ## Activity:
> - New developers are getting support on using Tez for new projects and
> much community work is going toward that effort as well as support for
> JDK 11 and Apache Hadoop 3.0
>
> ## Health report:
> - The Tez 0.9.2 release is delayed due to jetty 9 CVE issues. The 0.10.0
> branch has been cut and is waiting on the RM to post artifacts.
>
> ## PMC changes:
>
>  - Currently 35 PMC members.
>  - No new PMC members added in the last 3 months
>  - Last PMC addition was Kuhu Shukla on Sun Mar 25 2018
>
> ## Committer base changes:
>
>  - Currently 38 committers.
>  - No new committers added in the last 3 months
>  - Last committer addition was Kuhu Shukla on Wed May 10 2017
>
> ## Releases:
>
>  - Last release was 0.9.0 on Thu Jul 27 2017
>
> ## Mailing list activity:
>
>  - dev@tez.apache.org:
> - 144 subscribers (up 0 in the last 3 months):
> - 115 emails sent to list (102 in previous quarter)
>
>  - iss...@tez.apache.org:
> - 48 subscribers (up 1 in the last 3 months):
> - 328 emails sent to list (531 in previous quarter)
>
>  - u...@tez.apache.org:
> - 212 subscribers (down 1 in the last 3 months):
> - 3 emails sent to list (5 in previous quarter)
>
>
> ## JIRA activity:
>
>  - 28 JIRA tickets created in the last 3 months
>  - 17 JIRA tickets closed/resolved in the last 3 months


[DRAFT][REPORT] Tez - Jan 2019

2019-01-07 Thread Jonathan Eagles
Devs, please give feedback on the January 2019 report.
Regards,
jeagles

## Description:
 - Apache Tez is an effort to develop a generic application framework which
can be used to process arbitrarily complex directed-acyclic graphs (DAGs) of
data-processing tasks and also a re-usable set of data-processing primitives
which can be used by other projects.

## Issues:
 - There are no issues requiring board attention at this time.

## Activity:
- New developers are getting support on using Tez for new projects and
much community work is going toward that effort as well as support for
JDK 11 and Apache Hadoop 3.0

## Health report:
- The Tez 0.9.2 release is delayed due to jetty 9 CVE issues. The 0.10.0
branch has been cut and is waiting on the RM to post artifacts.

## PMC changes:

 - Currently 35 PMC members.
 - No new PMC members added in the last 3 months
 - Last PMC addition was Kuhu Shukla on Sun Mar 25 2018

## Committer base changes:

 - Currently 38 committers.
 - No new committers added in the last 3 months
 - Last committer addition was Kuhu Shukla on Wed May 10 2017

## Releases:

 - Last release was 0.9.0 on Thu Jul 27 2017

## Mailing list activity:

 - dev@tez.apache.org:
- 144 subscribers (up 0 in the last 3 months):
- 115 emails sent to list (102 in previous quarter)

 - iss...@tez.apache.org:
- 48 subscribers (up 1 in the last 3 months):
- 328 emails sent to list (531 in previous quarter)

 - u...@tez.apache.org:
- 212 subscribers (down 1 in the last 3 months):
- 3 emails sent to list (5 in previous quarter)


## JIRA activity:

 - 28 JIRA tickets created in the last 3 months
 - 17 JIRA tickets closed/resolved in the last 3 months


Re: [DISCUSS] Early Move to gitbox

2019-01-07 Thread Jonathan Eagles
Without consensus, tez will be unable to move early onto gitbox. Are
there any thoughts/votes on this?

On Fri, Dec 14, 2018 at 4:53 PM Jonathan Eagles  wrote:
>
> The Apache Tez git repository is on the git-wip-us server, which will be
> decommissioned.
> Please discuss issues and the preferred timeline. I'll file a JIRA ticket
> with INFRA to migrate to https://gitbox.apache.org/ and update the
> documentation.
>
> According to ASF infra team, the timeframe is as follows:
>
> > - December 9th 2018 -> January 9th 2019: Voluntary (coordinated) relocation
> > - January 9th -> February 6th: Mandated (coordinated) relocation
> > - February 7th: All remaining repositories are mass migrated.
> > This timeline may change to accommodate various scenarios.
>
> If we get consensus by January 9th, I can file a ticket with INFRA and
> migrate it.
> Even if we cannot get consensus, the repository will be migrated by
> February 7th.
>
> Regards,
> jeagles
>


Re: Full cross join of root inputs

2018-12-19 Thread Jonathan Eagles
Adrian,

Sorry for the late reply. I'm out until the second week in January. Since I'm
not completely familiar with how the full cross product edge was
implemented as part of TEZ-2104, I would want to compare the solutions
you present and weigh the pros and cons. I can definitely see how your two
solutions differ, especially in messaging and amount of effort.

https://issues.apache.org/jira/browse/TEZ-2104

On Thu, Dec 6, 2018 at 7:58 PM Adrian Nicoara
 wrote:
>
> Hello Tez devs,
>
> I will start with an example of vertex output full cross join, and then 
> circle back to the root case:
>
> V1[2]     V2[3]
>   |         |
> DME1(2)   DME2(3)
>    \       /
>     V3[2x3]
>
> Assume we have:
> 1. Producer vertex V1, with 2 tasks, in which every task generates one 
> physical output.
> 2. Once the tasks of V1 complete, they each raise a DataMovementEvent, for a 
> total of 2 DMEs, called DME1.
> 3. Producer vertex V2, with 3 tasks, in which every task generates one 
> physical output.
> 4. Once the tasks of V2 complete, they each raise a DataMovementEvent, for a 
> total of 3 DMEs, called DME2.
> 5. Consumer vertex V3, with 6 tasks (2x3), one for each of the V1xV2 output 
> combinations.
> 6. It is the responsibility of the EdgeManager on the edge V1->V3, to take 
> the DME1 events {0} and {1}, and broadcast them to the tasks {0, *} and {1, 
> *} respectively in V3 (assuming we view V3 as a 2D array).
> 7. It is the responsibility of the EdgeManager on the edge V2->V3, to take 
> the DME2 events {0}, {1}, and {2} and broadcast them to the tasks {*, 0}, {*, 
> 1} and {*, 2} respectively in V3 (assuming we view V3 as a 2D array).
>
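
The routing in steps 6 and 7 can be sketched as index arithmetic. This is a minimal sketch, assuming V3's tasks are numbered in row-major order over the V1 x V2 grid (task {i, j} has index i * |V2| + j); CrossProductRouting and its methods are hypothetical and are not the Tez EdgeManagerPlugin API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps a producer task index to the V3 task indices
// that should receive its DataMovementEvent.
final class CrossProductRouting {
    // Event from V1 task i goes to V3 tasks {i, *}.
    static List<Integer> destinationsForV1(int i, int v2Tasks) {
        List<Integer> destinations = new ArrayList<>();
        for (int j = 0; j < v2Tasks; j++) {
            destinations.add(i * v2Tasks + j);
        }
        return destinations;
    }

    // Event from V2 task j goes to V3 tasks {*, j}.
    static List<Integer> destinationsForV2(int j, int v1Tasks, int v2Tasks) {
        List<Integer> destinations = new ArrayList<>();
        for (int i = 0; i < v1Tasks; i++) {
            destinations.add(i * v2Tasks + j);
        }
        return destinations;
    }

    public static void main(String[] args) {
        // V1 has 2 tasks and V2 has 3 tasks, as in the example above.
        System.out.println(destinationsForV1(1, 3));
        System.out.println(destinationsForV2(2, 2, 3));
    }
}
```

With 2 and 3 producer tasks, each V1 event fans out to 3 consumers and each V2 event to 2, while only 2 + 3 events are generated in total.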
> Now, consider the example of a full cross join of two root inputs:
>
> R1[2]      R2[3]
>   |          |
> IDIE1(?)   IDIE2(?)
>     \       /
>      V3[2x3]
>
> Then we have:
> 1. Root input R1, with 2 physical partitions.
> 2. Root input R2, with 3 physical partitions.
> 3. A full cross join, with a task setup to process 1 physical partition from 
> each input, would result in V3 having again 6 tasks.
>
> Problem 1 - configuring the number of tasks in V3, and how many 
> InputDataInformationEvents to send:
>
> At a first glance, each InputInitializer would have to result in 6 
> InputDataInformationEvents, one for each task.
> This is in contrast with the DataMovementEvent model, where the number of 
> **generated/stored** (as opposed to transmitted downstream) events is 
> relative to the number of physical outputs.
> Furthermore, configuring the vertex V3 requires global information (all of 
> the inputs to the vertex), while an InputInitializer functions on local 
> information (the respective root input).
> R1 then takes a dependency on R2, and R2 takes a dependency on R1.
> The delay that the InputInitializers experience in obtaining the global 
> information can be made larger, by adding another input from a vertex V4, and 
> delaying configuration up until V4 is configured.
>
> Problem 2 - the VertexManagerPlugin doesn't have a proper chance of mutating 
> the InputInitializer events
>
> While there is VertexManagerPluginContext#addRootInputEvents:
> https://github.com/apache/tez/blob/282bb0a3fddf20260d71b0a6cd798fa5479e7038/tez-api/src/main/java/org/apache/tez/dag/api/VertexManagerPluginContext.java#L259-L272
> that allows a VertexManagerPlugin to add root input events, that API can only 
> be used within VertexManagerPlugin#onRootVertexInitialized:
> https://github.com/apache/tez/blob/282bb0a3fddf20260d71b0a6cd798fa5479e7038/tez-api/src/main/java/org/apache/tez/dag/api/VertexManagerPlugin.java#L125-L133
> This is because the queued events get processed by the 
> VertexManagerRootInputInitializedCallback:
> https://github.com/apache/tez/blob/282bb0a3fddf20260d71b0a6cd798fa5479e7038/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexManager.java#L589-L591
> which runs independently of the calls the VertexManagerPlugin makes to 
> VertexManagerPluginContext#addRootInputEvents.
>
> So the only chance of acting and mutating the events is during that one 
> function call, at which point the VertexManagerPlugin might not have all the 
> required global information to actually do the mutation.
>
> Solution 1:
> Modify VertexManagerPluginContext#addRootInputEvents to be dependent on the 
> VertexManagerPluginContext#vertexReconfigurationPlanned and 
> VertexManagerPluginContext#doneReconfiguringVertex calls, and be able to send 
> a VertexEventInputDataInformation event to the VertexImpl, similar to how the 
> VertexManagerPluginContext#reconfigureVertex calls work; or overload a 
> #reconfigureVertex function to take root input events also.
>
> Benefits:
> 1. Small contained change.
> 2. Delegates responsibility of configuration to the user code.
>
> Drawbacks:
> 1. Routing, which is usually delegated/implemented in the EdgeManagerPlugin 
> layer must now be done in the VertexManagerPlugin for root inputs only.
> 2. Duplicate events must be stored.
>
> Solution 2:
> Treat data sources 

[DISCUSS] Early Move to gitbox

2018-12-14 Thread Jonathan Eagles
The Apache Tez git repository is on the git-wip-us server, which will be
decommissioned.
Please discuss issues and the preferred timeline. I'll file a JIRA ticket
with INFRA to migrate to https://gitbox.apache.org/ and update the
documentation.

According to ASF infra team, the timeframe is as follows:

> - December 9th 2018 -> January 9th 2019: Voluntary (coordinated) relocation
> - January 9th -> February 6th: Mandated (coordinated) relocation
> - February 7th: All remaining repositories are mass migrated.
> This timeline may change to accommodate various scenarios.

If we get consensus by January 9th, I can file a ticket with INFRA and
migrate it.
Even if we cannot get consensus, the repository will be migrated by
February 7th.

Regards,
jeagles


ORIGINAL NOTICE

Hello Apache projects,

I am writing to you because you may have git repositories on the
git-wip-us server, which is slated to be decommissioned in the coming
months. All repositories will be moved to the new gitbox service which
includes direct write access on github as well as the standard ASF
commit access via gitbox.apache.org.

## Why this move? ##
The move comes as a result of retiring the git-wip service, as the
hardware it runs on is longing for retirement. In lieu of this, we
have decided to consolidate the two services (git-wip and gitbox), to
ease the management of our repository systems and future-proof the
underlying hardware. The move is fully automated, and ideally, nothing
will change in your workflow other than added features and access to
GitHub.

## Timeframe for relocation ##
Initially, we are asking that projects voluntarily request to move
their repositories to gitbox, hence this email. The voluntary
timeframe is between now and January 9th 2019, during which projects
are free to either move over to gitbox or stay put on git-wip. After
this phase, we will be requiring the remaining projects to move within
one month, after which we will move the remaining projects over.

To have your project moved in this initial phase, you will need:

- Consensus in the project (documented via the mailing list)
- File a JIRA ticket with INFRA to voluntarily move your project repos
   over to gitbox (as stated, this is highly automated and will take
   between a minute and an hour, depending on the size and number of
   your repositories)

To sum up the preliminary timeline:

- December 9th 2018 -> January 9th 2019: Voluntary (coordinated)
   relocation
- January 9th -> February 6th: Mandated (coordinated) relocation
- February 7th: All remaining repositories are mass migrated.

This timeline may change to accommodate various scenarios.

## Using GitHub with ASF repositories ##
When your project has moved, you are free to use either the ASF
repository system (gitbox.apache.org) OR GitHub for your development
and code pushes. To be able to use GitHub, please follow the primer
at: https://reference.apache.org/committer/github


We appreciate your understanding of this issue, and hope that your
project can coordinate voluntarily moving your repositories in a
timely manner.

All settings, such as commit mail targets, issue linking, PR
notification schemes etc will automatically be migrated to gitbox as
well.

With regards, Daniel on behalf of ASF Infra.

PS: For inquiries, please reply to us...@infra.apache.org, not your
project's dev list :-).


[jira] [Created] (TEZ-4026) Fetch Download rate shows 0.0 MB per second if duration is 0 millis

2018-12-06 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4026:


 Summary: Fetch Download rate shows 0.0 MB per second if duration 
is 0 millis
 Key: TEZ-4026
 URL: https://issues.apache.org/jira/browse/TEZ-4026
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


I think it would be better to assume a duration of 1 millisecond in this case 
so that the reported download rate is meaningful. The alternative is to use a 
higher-resolution timer, but I think that is out of scope for this jira.
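
A minimal sketch of the proposed behavior, assuming a hypothetical helper
named mbPerSecond (this is not the actual Tez fetcher code) and
1 MB = 1024 * 1024 bytes. A measured duration of 0 ms is clamped to 1 ms
before the rate is computed:

```java
final class FetchRateSketch {
    private static final double BYTES_PER_MB = 1024.0 * 1024.0;

    // Hypothetical helper: returns the transfer rate in MB per second.
    // A measured duration of 0 ms (or less) is treated as 1 ms, so a fast
    // fetch no longer reports a rate of 0.0.
    static double mbPerSecond(long bytes, long durationMillis) {
        long effectiveMillis = Math.max(durationMillis, 1L);
        double megabytes = bytes / BYTES_PER_MB;
        double seconds = effectiveMillis / 1000.0;
        return megabytes / seconds;
    }

    public static void main(String[] args) {
        long twoMb = 2L * 1024 * 1024;
        System.out.println("0 ms fetch:    " + mbPerSecond(twoMb, 0L) + " MB/s");
        System.out.println("1000 ms fetch: " + mbPerSecond(twoMb, 1000L) + " MB/s");
    }
}
```

The clamp overstates the duration by at most 1 ms, which keeps the logged
rate finite and non-zero without changing the timer.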



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Slack for Apache Tez

2018-11-29 Thread Jonathan Eagles
I have created a slack channel for tez. Here is an invite link that
hopefully will work for everyone.

https://join.slack.com/t/apache-tez/shared_invite/enQtNDkxOTc5ODY3ODI5LTg3MzMzNDU4YzdlOTE2ODA4ZmVhYzEyNDE1MTZiNmEzMDA0MDY5YzYyYzk0NTYyNWRmMDIwOWI5NTQ5NTk2ZDE
On Tue, Oct 16, 2018 at 2:02 PM Jin Sun  wrote:
>
> +1
>
>
> > On Oct 16, 2018, at 12:00 PM, Eric Wohlstadter  wrote:
> >
> > I like the idea of a 2-3 month trial.
> >
> > On Tue, Oct 16, 2018 at 9:42 AM Jonathan Eagles  wrote:
> >
> >> Thanks for the feedback everyone. I can see that on the one hand, 1)
> >> this tool could help the community feel more connected. At the same
> >> time, if we move discussions to this tool, 2) they no longer are part
> >> of the public record, and could lead to information loss.
> >>
> >> If we started a 2-3 month trial, would that be enough time to
> >> understand if the tool is working? Then we could hold another
> >> discussion about if it is working.
> >>
> >> Regards,
> >> jeagles
> >> On Sat, Oct 6, 2018 at 7:21 PM Kuhu Shukla 
> >> wrote:
> >>>
> >>> "How does everyone feel about trying slack temporarily to see if it aids
> >> Tez
> >>> development?"
> >>>
> >>> This would be a great way to get attention on certain JIRAs and ask
> >>> questions/discuss releases and features. Agree that reviews should be
> >> JIRA
> >>> based only.
> >>>
> >>> Regards,
> >>> Kuhu
> >>>
> >>> On Tue, Oct 2, 2018 at 2:17 PM Eric Wohlstadter 
> >> wrote:
> >>>
> >>>> I like the idea for general questions and discussions. Agree that
> >> reviews
> >>>> shouldn't take place on Slack.
> >>>>
> >>>> On Tue, Oct 2, 2018 at 7:13 AM Eric Badger 
> >>>> wrote:
> >>>>
> >>>>> What would the purpose of the slack be? Would it be for hashing out
> >> big
> >>>>> details about Tez, discussing issues that people are seeing, and
> >> maybe
> >>>>> identifying patches that need review? My main concern would be that
> >> patch
> >>>>> reviews would become dominated by slack messaging back and forth and
> >> so
> >>>> we
> >>>>> would lose the information on JIRA.
> >>>>>
> >>>>> On the surface this sounds like a really nice idea. But, I think we
> >> need
> >>>> to
> >>>>> be clear about how much reviewing we do in slack vs on the JIRA so
> >> that
> >>>> we
> >>>>> have a centralized history of patches and reviews.
> >>>>>
> >>>>> Eric
> >>>>>
> >>>>> On Mon, Oct 1, 2018 at 5:44 PM Jonathan Eagles 
> >>>> wrote:
> >>>>>
> >>>>>> At our most recent meetup we discussed the possibility of creating
> >> a
> >>>>> slack
> >>>>>> channel that could be used for developers of Apache Tez.
> >>>>>>
> >>>>>> How does everyone feel about trying slack temporarily to see if it
> >> aids
> >>>>> Tez
> >>>>>> development?
> >>>>>> Since ASF does not host or support slack channel they are run by
> >>>>> volunteers
> >>>>>> on behalf of projects. I would be happy to volunteer to setup this
> >> up
> >>>> and
> >>>>>> run the trial.
> >>>>>>
> >>>>>> Jon
> >>>>>>
> >>>>>
> >>>>
> >>
>


[jira] [Created] (TEZ-4025) javadoc compilation is broken in jdk11

2018-11-29 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4025:


 Summary: javadoc compilation is broken in jdk11
 Key: TEZ-4025
 URL: https://issues.apache.org/jira/browse/TEZ-4025
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-javadoc-plugin:2.10.4:javadoc (default-cli) on 
project tez-mapreduce: An error has occurred in JavaDocs report generation:
[ERROR] Exit code: 1 - 
/Users/jeagles/hadoop/tez/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/DeprecatedKeys.java:175:
 error: as of release 9, '_' is a keyword, and may not be used as an identifier
[ERROR]   private static void _(String mrKey, String tezKey) {
[ERROR]   ^
[ERROR]
[ERROR] Command line was: 
/Library/Java/JavaVirtualMachines/openjdk-11.0.1.jdk/Contents/Home/bin/javadoc 
@options @packages
[ERROR]
[ERROR] Refer to the generated Javadoc files in 
'/Users/jeagles/hadoop/tez/tez-mapreduce/target/site/apidocs' dir.
{code}

{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-javadoc-plugin:2.10.4:javadoc (default-cli) on 
project tez-javadoc-tools: An error has occurred in JavaDocs report generation:
[ERROR] Exit code: 1 - 
/Users/jeagles/hadoop/tez/tez-tools/tez-javadoc-tools/src/main/java/org/apache/tez/tools/javadoc/doclet/ConfigStandardDoclet.java:36:
 warning: [removal] AnnotationDesc in com.sun.javadoc has been deprecated and 
marked for removal
[ERROR] import com.sun.javadoc.AnnotationDesc.ElementValuePair;
[ERROR]   ^
[ERROR] 
/Users/jeagles/hadoop/tez/tez-tools/tez-javadoc-tools/src/main/java/org/apache/tez/tools/javadoc/doclet/ConfigStandardDoclet.java:42:
 error: package com.sun.tools.doclets.standard is not visible
[ERROR] import com.sun.tools.doclets.standard.Standard;
[ERROR] ^
[ERROR]   (package com.sun.tools.doclets.standard is declared in module 
jdk.javadoc, which does not export it)
[ERROR]
[ERROR] Command line was: 
/Library/Java/JavaVirtualMachines/openjdk-11.0.1.jdk/Contents/Home/bin/javadoc 
@options @packages
[ERROR]
[ERROR] Refer to the generated Javadoc files in 
'/Users/jeagles/hadoop/tez/tez-tools/tez-javadoc-tools/target/site/apidocs' dir.
{code}
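
The first failure comes from the method named with a single underscore,
which is a reserved keyword from Java 9 onward. A minimal sketch of the kind
of rename that avoids the error, assuming a hypothetical method name
addDeprecatedKey, placeholder key strings, and a plain map standing in for
the real DeprecatedKeys internals:

```java
import java.util.HashMap;
import java.util.Map;

final class DeprecatedKeysSketch {
    // Stand-in for the real key mapping; the actual class may store this differently.
    private static final Map<String, String> MR_TO_TEZ = new HashMap<>();

    // Previously declared as: private static void _(String mrKey, String tezKey)
    // A descriptive name compiles on JDK 8 as well as on JDK 9 and later.
    private static void addDeprecatedKey(String mrKey, String tezKey) {
        MR_TO_TEZ.put(mrKey, tezKey);
    }

    static {
        // Placeholder keys for illustration only.
        addDeprecatedKey("example.mr.key", "example.tez.key");
    }

    static String lookup(String mrKey) {
        return MR_TO_TEZ.get(mrKey);
    }

    public static void main(String[] args) {
        System.out.println(lookup("example.mr.key"));
    }
}
```

The second failure (the doclet imports from com.sun.tools.doclets.standard)
is a separate problem and is not addressed by this sketch.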





[jira] [Created] (TEZ-4023) TestExtServicesWithLocalMode with NPE due to USER environment being null in Docker

2018-11-27 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4023:


 Summary: TestExtServicesWithLocalMode with NPE due to USER 
environment being null in Docker
 Key: TEZ-4023
 URL: https://issues.apache.org/jira/browse/TEZ-4023
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles
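
The issue was filed without a description. As an illustration only, and
assuming the NPE comes from code that dereferences the USER environment
variable directly, a null-safe lookup could fall back to the JVM's user.name
system property. The class and method names below are hypothetical:

```java
final class UserNameSketch {
    // Returns envUser when it is set and non-empty, otherwise the fallback.
    static String resolveUser(String envUser, String fallback) {
        return (envUser == null || envUser.isEmpty()) ? fallback : envUser;
    }

    public static void main(String[] args) {
        // System.getenv returns null when USER is not defined, which is
        // common inside minimal Docker images.
        String user = resolveUser(System.getenv("USER"), System.getProperty("user.name"));
        System.out.println("user=" + user);
    }
}
```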








[jira] [Created] (TEZ-4021) API incompatibility wro4j-maven-plugin

2018-11-26 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4021:


 Summary: API incompatibility wro4j-maven-plugin
 Key: TEZ-4021
 URL: https://issues.apache.org/jira/browse/TEZ-4021
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles








[jira] [Created] (TEZ-4020) Support Java 11 LTS in Tez

2018-11-26 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4020:


 Summary: Support Java 11 LTS in Tez
 Key: TEZ-4020
 URL: https://issues.apache.org/jira/browse/TEZ-4020
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Jonathan Eagles


Oracle JDK 8 will be EoL during January 2019, and RedHat will end support for 
OpenJDK 8 in October 2020 (https://access.redhat.com/articles/1299013), so we 
need to support Java 11 LTS at least before October 2020.





[jira] [Created] (TEZ-4012) Add docker support for Tez.

2018-10-19 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4012:


 Summary: Add docker support for Tez.
 Key: TEZ-4012
 URL: https://issues.apache.org/jira/browse/TEZ-4012
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Jonathan Eagles


Hadoop label builds contain a mix of development tools and versions. In 
particular, H11-H20 are unusable by Tez since protoc --version reports 2.6.x and 
Hadoop only supports 2.5.0. This jira will allow builds across all H1-H20 
Jenkins machines.





Re: Jenkins tests failing to compile?

2018-10-17 Thread Jonathan Eagles
Jaume, you have found another node where protoc 2.6.1 is running. The
tez precommit builds are tied to the jenkins machine label "Hadoop"
https://builds.apache.org/label/Hadoop/

Machines H14-H20 are named Ubuntu and were taken out of the tez
pre-commit build since the Ubuntu machines are running protoc 2.6.1. In
addition H11-H13 don't have this naming convention but appear to be
running Ubuntu and are running protoc 2.6.1. I have taken them out of
the tez pre-commit builds for now. Once a docker image is created for
tez pre-commit with protoc 2.5.0, then the machine restriction can be
removed again.
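
A small sketch of the version check implied above, assuming the usual
"libprotoc X.Y.Z" output of protoc --version. The class and method names are
hypothetical, and the check only compares a string; it does not run protoc:

```java
final class ProtocVersionSketch {
    // The only protoc version the build currently works with, per this thread.
    static final String REQUIRED_VERSION = "2.5.0";

    // Accepts the text printed by "protoc --version", e.g. "libprotoc 2.5.0".
    static boolean isSupported(String versionOutput) {
        String[] parts = versionOutput.trim().split("\\s+");
        return parts.length == 2
                && "libprotoc".equals(parts[0])
                && REQUIRED_VERSION.equals(parts[1]);
    }

    public static void main(String[] args) {
        System.out.println("2.5.0 supported: " + isSupported("libprotoc 2.5.0"));
        System.out.println("2.6.1 supported: " + isSupported("libprotoc 2.6.1"));
    }
}
```

A build node that fails this check would be one of the machines excluded
from the pre-commit label until a docker image pins protoc 2.5.0.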

Thanks for bringing this to my attention. I have restarted TEZ-3976
pre-commit build.

Regards,
jeagles
On Wed, Oct 17, 2018 at 11:19 PM Jaume Marhuenda
 wrote:
>
> Hello,
>
> I’m getting this error: 
> https://builds.apache.org/job/PreCommit-TEZ-Build/32/artifact/out/branch-mvninstall-root.txt,
>  for this patch: 
> https://issues.apache.org/jira/secure/attachment/12944472/TEZ-3976.5.patch. 
> It looks as if the image where the tests run has been updated and the protoc 
> version is now 2.6.1. Does anyone know if this is the case?
>
> Thank you,
> Jaume


[jira] [Resolved] (TEZ-1155) there are many ATS exceptions in the logs

2018-10-17 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-1155.
--
Resolution: Not A Problem

Closing as this old jira appears to be a setup problem. [~sershe], please 
reopen discussion on the tez user list u...@tez.apache.org and I will be happy 
to discuss solutions.

> there are many ATS exceptions in the logs
> -
>
> Key: TEZ-1155
> URL: https://issues.apache.org/jira/browse/TEZ-1155
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Unfortunately I only preserved part of call stack.
> I get lots of exceptions in the logs like these; they do not fail the tasks 
> but flood the log and are rather annoying:
> {noformat}
>at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
>at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:116)
>at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:89)
>at 
> org.apache.tez.dag.history.ats.ATSService.handleEvent(ATSService.java:191)
>at org.apache.tez.dag.history.ats.ATSService.access$600(ATSService.java:40)
>at org.apache.tez.dag.history.ats.ATSService$1.run(ATSService.java:106)
>at java.lang.Thread.run(Thread.java:724)
> Caused by: java.net.ConnectException: Connection refused
>at java.net.PlainSocketImpl.socketConnect(Native Method)
>at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
>at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>at java.net.Socket.connect(Socket.java:579)
>at java.net.Socket.connect(Socket.java:528)
>at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
>at sun.net.www.http.HttpClient.openServer(HttpClient.java:378)
>at sun.net.www.http.HttpClient.openServer(HttpClient.java:473)
>at sun.net.www.http.HttpClient.<init>(HttpClient.java:203)
>at sun.net.www.http.HttpClient.New(HttpClient.java:290)
>at sun.net.www.http.HttpClient.New(HttpClient.java:306)
>at 
> sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:995)
>at 
> sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:931)
>at 
> sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:849)
>at 
> sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1090)
>at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler$1$1.getOutputStream(URLConnectionClientHandler.java:225)
>at 
> com.sun.jersey.api.client.CommittingOutputStream.commitWrite(CommittingOutputStream.java:117)
>at 
> com.sun.jersey.api.client.CommittingOutputStream.write(CommittingOutputStream.java:89)
>at 
> org.codehaus.jackson.impl.Utf8Generator._flushBuffer(Utf8Generator.java:1754)
>at org.codehaus.jackson.impl.Utf8Generator.flush(Utf8Generator.java:1088)
>at org.codehaus.jackson.map.ObjectMapper.writeValue(ObjectMapper.java:1354)
>at 
> org.codehaus.jackson.jaxrs.JacksonJsonProvider.writeTo(JacksonJsonProvider.java:527)
>at 
> com.sun.jersey.api.client.RequestWriter.writeRequestEntity(RequestWriter.java:300)
>at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:204)
>at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:147)
>... 10 more
> {noformat}





[jira] [Resolved] (TEZ-1147) TaskScheduler should handle preemption requests from YARN

2018-10-17 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-1147.
--
Resolution: Duplicate

Closing as TaskSchedulers have implemented YARN preemption since the filing of 
this jira.

> TaskScheduler should handle preemption requests from YARN
> -
>
> Key: TEZ-1147
> URL: https://issues.apache.org/jira/browse/TEZ-1147
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Priority: Major
>






[jira] [Resolved] (TEZ-962) Logs need to be improved

2018-10-17 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-962.
-
Resolution: Not A Problem

No reply from the reporter after the earlier response. Closing as not a problem.

> Logs need to be improved
> 
>
> Key: TEZ-962
> URL: https://issues.apache.org/jira/browse/TEZ-962
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Vikram Dixit K
>Priority: Major
>
> 2014-03-19 17:55:29,044 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapred.split.TezMapredSplitsGrouper: Desired splits: 1 too 
> large.  Desired splitLength: 224 Min splitLength: 16777216 New desired 
> splits: 1 Total length: 224 Original splits: 4
> 2014-03-19 17:55:29,045 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapred.split.TezMapredSplitsGrouper: Number of splits 
> desired: 1 created: 3 splitsProcessed: 4
> Desired splits too large? But had 4 splits and generated 4.





[jira] [Resolved] (TEZ-521) Create a streaming byte output

2018-10-17 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-521.
-
Resolution: Duplicate

With the addition of NonSyncByteArrayInputStream and 
NonSyncByteArrayOutputStream, I think this jira can be marked as completed.

> Create a streaming byte output
> --
>
> Key: TEZ-521
> URL: https://issues.apache.org/jira/browse/TEZ-521
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>Priority: Major
>






[jira] [Resolved] (TEZ-72) Support running of single-task vertices inlined in the DAG AM

2018-10-17 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-72.

Resolution: Later

Closing this old jira. Reopen if interested in this feature.

> Support running of single-task vertices inlined in the DAG AM
> -
>
> Key: TEZ-72
> URL: https://issues.apache.org/jira/browse/TEZ-72
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Priority: Major
>  Labels: TEZ-1
>
> For vertices that have a parallelism of one, or when enabled via a supported 
> flag, the vertex should be able to run inline in the DAG AM itself.





Re: [DISCUSS] Slack for Apache Tez

2018-10-16 Thread Jonathan Eagles
Thanks for the feedback everyone. I can see that, on the one hand, 1)
this tool could help the community feel more connected. At the same
time, if we move discussions to this tool, 2) they are no longer part
of the public record, which could lead to information loss.

If we started a 2-3 month trial, would that be enough time to
understand whether the tool is working? Then we could hold another
discussion about whether it is working.

Regards,
jeagles
On Sat, Oct 6, 2018 at 7:21 PM Kuhu Shukla  wrote:
>
> "How does everyone feel about trying slack temporarily to see if it aids Tez
> development?"
>
> This would be a great way to get attention on certain JIRAs and ask
> questions/discuss releases and features. Agree that reviews should be JIRA
> based only.
>
> Regards,
> Kuhu
>
> On Tue, Oct 2, 2018 at 2:17 PM Eric Wohlstadter  wrote:
>
> > I like the idea for general questions and discussions. Agree that reviews
> > shouldn't take place on Slack.
> >
> > On Tue, Oct 2, 2018 at 7:13 AM Eric Badger 
> > wrote:
> >
> > > What would the purpose of the slack be? Would it be for hashing out big
> > > details about Tez, discussing issues that people are seeing, and maybe
> > > identifying patches that need review? My main concern would be that patch
> > > reviews would become dominated by slack messaging back and forth and so
> > we
> > > would lose the information on JIRA.
> > >
> > > On the surface this sounds like a really nice idea. But, I think we need
> > to
> > > be clear about how much reviewing we do in slack vs on the JIRA so that
> > we
> > > have a centralized history of patches and reviews.
> > >
> > > Eric
> > >
> > > On Mon, Oct 1, 2018 at 5:44 PM Jonathan Eagles 
> > wrote:
> > >
> > > > At our most recent meetup we discussed the possibility of creating a
> > > slack
> > > > channel that could be used for developers of Apache Tez.
> > > >
> > > > How does everyone feel about trying slack temporarily to see if it aids
> > > Tez
> > > > development?
> > > > Since ASF does not host or support slack channel they are run by
> > > volunteers
> > > > on behalf of projects. I would be happy to volunteer to setup this up
> > and
> > > > run the trial.
> > > >
> > > > Jon
> > > >
> > >
> >


[jira] [Created] (TEZ-4004) Update jetty9 to align with Hadoop and Hive

2018-10-10 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-4004:


 Summary: Update jetty9 to align with Hadoop and Hive
 Key: TEZ-4004
 URL: https://issues.apache.org/jira/browse/TEZ-4004
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles








[jira] [Resolved] (TEZ-4003) Add gop...@apache.org to KEYS file

2018-10-09 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-4003.
--
   Resolution: Fixed
Fix Version/s: 0.10.0
   0.9.2

+1. Committed to branch-0.9 and master branch. Thanks, [~gopalv]

> Add gop...@apache.org to KEYS file
> --
>
> Key: TEZ-4003
> URL: https://issues.apache.org/jira/browse/TEZ-4003
> Project: Apache Tez
>  Issue Type: Task
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Trivial
> Fix For: 0.9.2, 0.10.0
>
> Attachments: TEZ-4003.patch
>
>
> {code}
> -----END PGP PUBLIC KEY BLOCK-----
> pub   rsa4096 2018-09-20 [SC]
>   6CFAA64865AD19C55C5662680C5267F97FBEC4F9
> uid   [ultimate] Gopal Vijayaraghavan (CODE SIGNING KEY) 
> 
> sig 30C5267F97FBEC4F9 2018-09-20  Gopal Vijayaraghavan (CODE SIGNING 
> KEY) 
> sub   rsa4096 2018-09-20 [E]
> sig  0C5267F97FBEC4F9 2018-09-20  Gopal Vijayaraghavan (CODE SIGNING 
> KEY) 
> -----BEGIN PGP PUBLIC KEY BLOCK-----
> {code}





[jira] [Resolved] (TEZ-3497) Upgrade Jetty to 9.3.X

2018-10-08 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved TEZ-3497.
--
Resolution: Duplicate

> Upgrade Jetty to 9.3.X
> --
>
> Key: TEZ-3497
> URL: https://issues.apache.org/jira/browse/TEZ-3497
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: darion yaphet
>Assignee: darion yaphet
>Priority: Major
> Attachments: TEZ-3497.1.patch
>
>
> Jetty 6.x is no longer maintained and should be upgraded to the current version.





Re: Design doc for new feature?

2018-10-05 Thread Jonathan Eagles
Ok. I have added Chyler to tez contributors as well. Please verify accounts
can now assign Jiras to themselves.

Regards,
Jon Eagles

On Fri, Oct 5, 2018, 5:21 PM Yingda Chen  wrote:

> Sorry my bad, it should be Chyler.
>
> https://issues.apache.org/jira/secure/ViewProfile.jspa?name=Chyler
>
> Thanks,
> -Yingda
>
> On Sat, Oct 6, 2018 at 6:17 AM Jonathan Eagles  wrote:
>
> > Yingda, I added yingdachen as contributor but was unable to find user
> name
> > for Chryler. Can you verify the username for this account?
> >
> > Regards,
> > jeagles
> >
> > On Fri, Oct 5, 2018 at 5:11 PM Yingda Chen  wrote:
> >
> > > Hi Jon,
> > >
> > > We have created the first batch of JIRAs that we plan to work on, under
> > the
> > > new proposed feature TEZ-3998.
> > >
> > > Just a gentle ping, could you please add our JIRA accounts (yingdachen
> > and
> > > Chryler) to contributor list so that we can assign things to ourselves?
> > >
> > > Thanks,
> > > -Yingda
> > >
> > > On Fri, Oct 5, 2018 at 6:19 AM Yingda Chen  wrote:
> > >
> > > > Sounds good and Thanks Jon.  We have signed up for the dev mailing
> > list,
> > > > and will be creating new JIRAs shortly. Lets go from there.
> > > >
> > > > In the mean time, could you please add the following two accounts to
> > the
> > > > contributors list first (we may have more folks join later as well)?
> > > >
> > > > yingdachen
> > > > Chyler
> > > >
> > > > Thanks,
> > > > -Yingda
> > > >
> > > > On Fri, Oct 5, 2018 at 4:40 AM Jonathan Eagles 
> > > wrote:
> > > >
> > > >> Welcome and thank you for reaching out, Yingda.
> > > >>
> > > >> Here is a sample of the process. Feel free to create a new JIRA for
> > each
> > > >> feature. You can attach a design doc to the JIRA and request for
> > > comments
> > > >> on the dev list. After design discussions, usually the design takes
> > one
> > > of
> > > >> two paths. For smaller projects, subtasks are added to the
> > > JIRA(umbrella)
> > > >> and are added in pieces. For larger projects, a feature branch is
> > > created
> > > >> and sub-jiras are committed directly to the feature branch and later
> > > >> merged
> > > >> back to the main line.
> > > >>
> > > >> As far as google doc link versus other formats, I feel this is ok as
> > it
> > > >> does not require certain software to read and contribute to the
> design
> > > >> discussion. However, versioning and history of changes are part of a
> > > >> separate application as opposed to being stand alone in the JIRA.
> I'm
> > ok
> > > >> with google doc as well as PDF.
> > > >>
> > > >> Please have a look at
> > > >>
> > >
> https://cwiki.apache.org/confluence/display/TEZ/How+to+Contribute+to+Tez
> > > >> for
> > > >> some process.
> > > >>
> > > >> Get signed up for tez development and let me know what apache jira
> > > account
> > > >> (others welcome as well) to add to the contributors list, so that
> you
> > > are
> > > >> able to assign jiras to yourself.
> > > >>
> > > >> Regards,
> > > >> Jon Eagles
> > > >> Tez PMC Chair
> > > >>
> > > >> On Thu, Oct 4, 2018 at 2:48 PM Yingda Chen 
> wrote:
> > > >>
> > > >> > Hi all,
> > > >> >
> > > >> > We are interested in contributing a few new features to Tez and
> some
> > > of
> > > >> > these new features should ideally be preceded with design doc and
> > > >> > discussions.
> > > >> >
> > > >> >  Is JIRA (possibly with external Google Doc link) sufficient for
> > that?
> > > >> >
> > > >> > Thanks all,
> > > >> > -Yingda
> > > >> >
> > > >>
> > > >
> > >
> >
>


Re: Design doc for new feature?

2018-10-05 Thread Jonathan Eagles
Yingda, I added yingdachen as contributor but was unable to find user name
for Chryler. Can you verify the username for this account?

Regards,
jeagles

On Fri, Oct 5, 2018 at 5:11 PM Yingda Chen  wrote:

> Hi Jon,
>
> We have created the first batch of JIRAs that we plan to work on, under the
> new proposed feature TEZ-3998.
>
> Just a gentle ping, could you please add our JIRA accounts (yingdachen and
> Chryler) to contributor list so that we can assign things to ourselves?
>
> Thanks,
> -Yingda
>
> On Fri, Oct 5, 2018 at 6:19 AM Yingda Chen  wrote:
>
> > Sounds good and Thanks Jon.  We have signed up for the dev mailing list,
> > and will be creating new JIRAs shortly. Lets go from there.
> >
> > In the mean time, could you please add the following two accounts to the
> > contributors list first (we may have more folks join later as well)?
> >
> > yingdachen
> > Chyler
> >
> > Thanks,
> > -Yingda
> >
> > On Fri, Oct 5, 2018 at 4:40 AM Jonathan Eagles 
> wrote:
> >
> >> Welcome and thank you for reaching out, Yingda.
> >>
> >> Here is a sample of the process. Feel free to create a new JIRA for each
> >> feature. You can attach a design doc to the JIRA and request for
> comments
> >> on the dev list. After design discussions, usually the design takes one
> of
> >> two paths. For smaller projects, subtasks are added to the
> JIRA(umbrella)
> >> and are added in pieces. For larger projects, a feature branch is
> created
> >> and sub-jiras are committed directly to the feature branch and later
> >> merged
> >> back to the main line.
> >>
> >> As far as google doc link versus other formats, I feel this is ok as it
> >> does not require certain software to read and contribute to the design
> >> discussion. However, versioning and history of changes are part of a
> >> separate application as opposed to being stand alone in the JIRA. I'm ok
> >> with google doc as well as PDF.
> >>
> >> Please have a look at
> >>
> https://cwiki.apache.org/confluence/display/TEZ/How+to+Contribute+to+Tez
> >> for
> >> some process.
> >>
> >> Get signed up for tez development and let me know what apache jira
> account
> >> (others welcome as well) to add to the contributors list, so that you
> are
> >> able to assign jiras to yourself.
> >>
> >> Regards,
> >> Jon Eagles
> >> Tez PMC Chair
> >>
> >> On Thu, Oct 4, 2018 at 2:48 PM Yingda Chen  wrote:
> >>
> >> > Hi all,
> >> >
> >> > We are interested in contributing a few new features to Tez and some
> of
> >> > these new features should ideally be preceded with design doc and
> >> > discussions.
> >> >
> >> >  Is JIRA (possibly with external Google Doc link) sufficient for that?
> >> >
> >> > Thanks all,
> >> > -Yingda
> >> >
> >>
> >
>


Re: Design doc for new feature?

2018-10-04 Thread Jonathan Eagles
Welcome and thank you for reaching out, Yingda.

Here is a sample of the process. Feel free to create a new JIRA for each
feature. You can attach a design doc to the JIRA and request comments
on the dev list. After design discussions, usually the design takes one of
two paths. For smaller projects, subtasks are added to the JIRA(umbrella)
and are added in pieces. For larger projects, a feature branch is created
and sub-jiras are committed directly to the feature branch and later merged
back to the main line.

As far as google doc link versus other formats, I feel this is ok as it
does not require certain software to read and contribute to the design
discussion. However, versioning and history of changes are part of a
separate application as opposed to being stand alone in the JIRA. I'm ok
with google doc as well as PDF.

Please have a look at
https://cwiki.apache.org/confluence/display/TEZ/How+to+Contribute+to+Tez for
some process.

Get signed up for tez development and let me know what apache jira account
(others welcome as well) to add to the contributors list, so that you are
able to assign jiras to yourself.

Regards,
Jon Eagles
Tez PMC Chair

On Thu, Oct 4, 2018 at 2:48 PM Yingda Chen  wrote:

> Hi all,
>
> We are interested in contributing a few new features to Tez and some of
> these new features should ideally be preceded with design doc and
> discussions.
>
>  Is JIRA (possibly with external Google Doc link) sufficient for that?
>
> Thanks all,
> -Yingda
>


[DISCUSS] Slack for Apache Tez

2018-10-01 Thread Jonathan Eagles
At our most recent meetup we discussed the possibility of creating a slack
channel that could be used for developers of Apache Tez.

How does everyone feel about trying slack temporarily to see if it aids Tez
development?
Since the ASF does not host or support Slack channels, they are run by
volunteers on behalf of projects. I would be happy to volunteer to set this
up and run the trial.

Jon


Re: [DISCUSS] Tez build Yetus support

2018-10-01 Thread Jonathan Eagles
I have enabled the Yetus-based build system based on feedback and after
correcting some minor issues (enabling findbugs).

https://builds.apache.org/job/PreCommit-TEZ-Build

I have archived the previous build, so manual builds can be submitted in
the case of issues with the Yetus based system.
https://builds.apache.org/job/PreCommit-TEZ-Build-NoYetus

Please submit feedback and issues and I'll try to correct them quickly
jeagles

On Mon, Oct 1, 2018 at 8:33 AM, Kuhu Shukla 
wrote:

> Thank you Jon for the prototype. I am in support of moving to Yetus. Branch
> targeting, GitHub PR requests and additional features like whitespace and
> check-style warnings would be very useful.
>
> Regards,
> Kuhu
>
> On Fri, Sep 28, 2018 at 4:23 PM Jaume Marhuenda <
> jmarhue...@hortonworks.com>
> wrote:
>
> > Thank you for taking a look into this. I think this is a great idea.
> > Personally, I’ve missed being able to target a particular branch to run
> the
> > tests against when submitting a patch in JIRA and getting the tests to
> run
> > just by opening a PR in github.
> >
> > Best,
> > Jaume
> >
> > On 9/28/18, 12:03 PM, "Jonathan Eagles"  wrote:
> >
> > Devs,
> >
> > I have prototyped a Yetus based build system for Tez. The system has
> a
> > number of features that would enabled better supporting builds and
> > further
> > the ease for development. Specifically, Yetus has support for patches
> > targeting, increased static analysis, support for docker builds
> > (currently
> > disabled for tez), targeted tests run, and support for github PR
> > requests
> > (currently untested).
> >
> >  I have submitted some tests patches in
> > https://issues.apache.org/jira/browse/TEZ-3891 to for devs to
> > evaluate the
> > features.
> >
> > Do devs want to migrate this system? And if so what features need to
> be
> > supported before migrating to this system permanently?
> >
> > Regards,
> > jeagles
> >
> >
> >
>


[DISCUSS] Tez build Yetus support

2018-09-28 Thread Jonathan Eagles
Devs,

I have prototyped a Yetus-based build system for Tez. The system has a
number of features that would enable better build support and further
ease development. Specifically, Yetus has support for patch branch
targeting, increased static analysis, support for docker builds (currently
disabled for tez), targeted test runs, and support for github PR requests
(currently untested).

I have submitted some test patches in
https://issues.apache.org/jira/browse/TEZ-3891 for devs to evaluate the
features.

Do devs want to migrate to this system? And if so, what features need to be
supported before migrating to this system permanently?

Regards,
jeagles


[jira] [Created] (TEZ-3995) Fix dot files produced by tests to prevent ASF license warnings in yetus

2018-09-27 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-3995:


 Summary: Fix dot files produced by tests to prevent ASF license 
warnings in yetus
 Key: TEZ-3995
 URL: https://issues.apache.org/jira/browse/TEZ-3995
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Jonathan Eagles






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (TEZ-3994) Upgrade maven-surefire-plugin to 0.21.0 to support yetus

2018-09-27 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-3994:


 Summary: Upgrade maven-surefire-plugin to 0.21.0 to support yetus
 Key: TEZ-3994
 URL: https://issues.apache.org/jira/browse/TEZ-3994
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Jonathan Eagles








[jira] [Created] (TEZ-3993) Tez fails to parse windows file paths in local mode

2018-09-20 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-3993:


 Summary: Tez fails to parse windows file paths in local mode
 Key: TEZ-3993
 URL: https://issues.apache.org/jira/browse/TEZ-3993
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


TezLocalCacheManager tries to generate symlinks to files that it puts in the 
local cache, but the code that it uses to construct the path names is not safe 
on Windows and causes bad file names to be constructed when run in a Windows 
environment. On Windows, a path like file:/c:/path/to/my/file should be legal 
and transform to c:\path\to\my\file, but with the invalid construct, it turns 
into the illegal value /c:/path/to/my/file instead.

In TezLocalCacheManager, there is code that does

{code}
private boolean createSymlink(Path target, Path link) throws IOException {
  LOG.info("Creating symlink: {} <- {}", target, link);
  String targetPath = target.toUri().getPath();
  String linkPath = link.toUri().getPath();
{code}

It looks like there are several other places in the Tez code that also use the 
Path.toUri().getPath() construct that probably also need to be fixed in order 
to work correctly on Windows.

The construct Path.toUri().getPath() doesn't handle Windows directories 
correctly. The Java File class understands how to do this correctly, so this 
should really be replaced by

{code}
private boolean createSymlink(Path target, Path link) throws IOException {
  LOG.info("Creating symlink: {} <- {}", target, link);
  String targetPath = new File(target.toUri()).getCanonicalPath();
  String linkPath = new File(link.toUri()).getCanonicalPath();
{code}

The failure on Windows shows up as the following stack trace:

{code}
2018-09-19T16:32:53,287 ERROR [LocalContainerLauncher-SubTaskRunner] org.apache.tez.dag.app.launcher.LocalContainerLauncher - TezSubTaskRunner failed due to exception
java.nio.file.InvalidPathException: Illegal char <:> at index 2: /C:/Users/...fullpath
    at sun.nio.fs.WindowsPathParser.normalize(WindowsPathParser.java:182)
    at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:153)
    at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:77)
    at sun.nio.fs.WindowsPath.parse(WindowsPath.java:94)
    at sun.nio.fs.WindowsFileSystem.getPath(WindowsFileSystem.java:255)
    at java.nio.file.Paths.get(Paths.java:84)
    at org.apache.tez.dag.app.launcher.TezLocalCacheManager.createSymlink(TezLocalCacheManager.java:173)
    at org.apache.tez.dag.app.launcher.TezLocalCacheManager.localize(TezLocalCacheManager.java:126)
    at org.apache.tez.dag.app.launcher.LocalContainerLauncher.launch(LocalContainerLauncher.java:263)
    at org.apache.tez.dag.app.launcher.LocalContainerLauncher.access$300(LocalContainerLauncher.java:82)
    at org.apache.tez.dag.app.launcher.LocalContainerLauncher$TezSubTaskRunner.run(LocalContainerLauncher.java:207)
{code}
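
For readers who want to reproduce the path difference described in TEZ-3993 without a Tez or Hadoop build, here is a minimal sketch using only JDK classes. It assumes that the URI obtained from Hadoop's Path.toUri() behaves like a plain java.net.URI for a file: location; the class and method names (WindowsPathSketch, rawUriPath, platformPath) are made up for illustration and are not part of Tez.

```java
import java.io.File;
import java.io.IOException;
import java.net.URI;

public class WindowsPathSketch {

    // Mirrors the Path.toUri().getPath() construct: returns the raw path
    // component of the URI, including the leading slash before the drive letter.
    static String rawUriPath(URI uri) {
        return uri.getPath();
    }

    // Mirrors the proposed replacement: java.io.File interprets the file: URI
    // according to the rules of the platform the JVM is running on.
    static String platformPath(URI uri) throws IOException {
        return new File(uri).getCanonicalPath();
    }

    public static void main(String[] args) throws IOException {
        URI uri = URI.create("file:/c:/path/to/my/file");
        // Platform independent: the URI path keeps the leading slash.
        System.out.println("URI path:  " + rawUriPath(uri));
        // Platform dependent: the result differs between Windows and other systems.
        System.out.println("File path: " + platformPath(uri));
    }
}
```

The first value is the string that java.nio.file.Paths.get rejects on Windows in the stack trace above. The second value depends on the platform, so the sketch deliberately does not state an expected output for it; the description above says that on Windows it should come out as c:\path\to\my\file.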





Re: Submit a patch against branch-0.9

2018-09-19 Thread Jonathan Eagles
This has to do with the fact that Yetus isn't set up for Tez. Until that
happens, Jenkins will always try to apply patches to master.

On Wed, Sep 19, 2018 at 4:02 PM, Jaume Marhuenda  wrote:

> Hello,
>
> I’m trying to submit a patch against branch-0.9, however it’s doing the
> diff against master when it’s applied in Jenkins. Is there a patch naming
> convention that I should follow? I’ve named it in the following way:
> TEZ-[.][-].patch
> (TEZ-3984.2-branch-0.9.patch) without success.
>
> Thank you,
> Jaume
>


Re: [VOTE] Tez 0.10.0 release deadline

2018-08-23 Thread Jonathan Eagles
To work around the fact that you don't have permissions, I have created
branch-0.10.0. Please create a JIRA to set the version on master to
0.10.1-SNAPSHOT and the version on branch-0.10.0 to 0.10.0. Setting the
versions is a step in the How to Release section of the wiki.

On Wed, Aug 22, 2018 at 1:29 PM, Eric Wohlstadter 
wrote:

> Hi all,
>  I am still stuck on the issue of branch/tag creation for making a 0.10.0
> release candidate.
> As the assigned release-manager, can I get permissions for branch
> administration?
>
> On Thu, Aug 9, 2018 at 10:51 AM, Eric Wohlstadter 
> wrote:
>
> > Also, it looks like I need permission to create new git tags for release
> > candidates.
> > Can someone with admin give me permission to create new tags?
> >
> > On Tue, Aug 7, 2018 at 12:20 PM, Eric Wohlstadter 
> > wrote:
> >
> >> I've created a ticket to add my key to KEYS file for release management.
> >>
> >> Can someone please review?
> >> https://issues.apache.org/jira/browse/TEZ-3977
> >>
> >> Thanks!
> >>
> >> On Mon, Jul 23, 2018 at 3:37 PM, Eric Wohlstadter 
> >> wrote:
> >>
> >>> Thanks to everyone for voting.
> >>>
> >>> The vote passes with (4) +1, and (0) -1.
> >>>
> >>> If you have any pending tickets that can make it in by this Friday,
> >>> please make sure to get them reviewed and committed.
> >>>
> >>> If there are any PA's that need review, feel free to reply on this
> >>> thread to ask for help.
> >>>
> >>> I'll plan to cut an RC early next week for review.
> >>>
> >>> On Sun, Jul 22, 2018 at 8:21 AM, Kuhu Shukla  >
> >>> wrote:
> >>>
> >>>> +1.
> >>>>
> >>>> Regards,
> >>>> Kuhu
> >>>>
> >>>> On Fri, Jul 20, 2018 at 1:08 PM, Gunther Hagleitner <
> >>>> gunther.hagleit...@gmail.com> wrote:
> >>>>
> >>>> > +1
> >>>> >
> >>>> > Thanks,
> >>>> > Gunther.
> >>>> >
> >>>> > On Fri, Jul 20, 2018 at 8:33 AM Jonathan Eagles 
> >>>> wrote:
> >>>> >
> >>>> > > +1. Seems like a good plan
> >>>> > >
> >>>> > > On Wed, Jul 18, 2018, 2:57 PM Eric Wohlstadter <
> wohls...@gmail.com>
> >>>> > wrote:
> >>>> > >
> >>>> > > > Hi all,
> >>>> > > >  I'd like to propose a deadline of July 27th for work on tickets
> >>>> to be
> >>>> > > > included in the 0.10.0 release.
> >>>> > > >
> >>>> > > > Please +1 or -1 this proposal by July 20th. My understanding is
> >>>> that
> >>>> > this
> >>>> > > > proposal requires a Lazy Majority.
> >>>> > > >
> >>>> > >
> >>>> >
> >>>>
> >>>
> >>>
> >>
> >
>


[jira] [Created] (TEZ-3978) DAGClientServer Socket exception when localhost name lookup failures

2018-08-13 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-3978:


 Summary: DAGClientServer Socket exception when localhost name 
lookup failures
 Key: TEZ-3978
 URL: https://issues.apache.org/jira/browse/TEZ-3978
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


Call From 0.0.0.0 to null:0 failed on socket exception: 
java.net.SocketException: Invalid argument
{code}
2018-08-10 21:19:55,523 [ERROR] [ServiceThread:DAGClientRPCServer] |client.DAGClientServer|: Failed to start DAGClientServer: 
java.net.SocketException: Call From 0.0.0.0 to null:0 failed on socket exception: java.net.SocketException: Invalid argument; For more details see:  http://wiki.apache.org/hadoop/SocketException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:804)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:777)
    at org.apache.hadoop.ipc.Server.bind(Server.java:563)
    at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:958)
    at org.apache.hadoop.ipc.Server.<init>(Server.java:2657)
    at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:968)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:367)
    at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:342)
    at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:810)
    at org.apache.tez.dag.api.client.DAGClientServer.createServer(DAGClientServer.java:134)
    at org.apache.tez.dag.api.client.DAGClientServer.serviceStart(DAGClientServer.java:82)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.tez.dag.app.DAGAppMaster$ServiceWithDependency.start(DAGAppMaster.java:1909)
    at org.apache.tez.dag.app.DAGAppMaster$ServiceThread.run(DAGAppMaster.java:1930)
Caused by: java.net.SocketException: Invalid argument
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:433)
    at sun.nio.ch.Net.bind(Net.java:425)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at org.apache.hadoop.ipc.Server.bind(Server.java:553)
    ... 11 more
{code}





Re: Merge process for contributions

2018-07-24 Thread Jonathan Eagles
Thanks Jaume for bringing up this discussion.

I am in agreement that the development process is fluid and should change
and evolve in the same way as our code base. After looking into this, I can
see that git itself supports a separate committer and author natively and
github likewise supports this feature set and preserves committer and
author fields.

Jaume, would you be willing to drive the process update by gathering
consensus on a new process and then updating the How to Contribute to
Tez cwiki page?

A couple of talking points:
- Limiting barriers to contributions. Requiring a github account to
contribute is a barrier that may exclude some contributors. Keeping a
flexible process where both patches and github PRs can work seems like a
very good thing.
- Github Tez Pre-Commit process. Currently PRs submitted to github don't
automatically start the Tez Pre-Commit process. If github PRs are allowed
as part of the process, they should have the same feature set as patches
and integrate fully with Tez Pre-Commit (through Yetus?).
- Patch process with author. The author field can be extended to patches via
the git command line in a few ways. The committer can add the author email to
the current patch process via git commit --author="author email", although
getting the email address can be problematic, as already pointed out. Another
way is to replace git diff with git format-patch
(--no-prefix) --stdout > TEZ-ID.REV.patch, which will include the configured
email address as well as the commit message. Patches can then be applied with
the corresponding cat TEZ-ID.REV.patch | git am (-p0). Reference:
https://robots.thoughtbot.com/send-a-patch-to-someone-using-git-format-patch

jeagles

On Mon, Jul 23, 2018 at 4:53 PM, Jaume Marhuenda  wrote:

> Hi all,
>
> I wanted to bring up the topic about how contributions are made to the
> project, regarding the committer and author fields in the commit metadata.
> The process for contributing it's described here:
> https://cwiki.apache.org/confluence/display/TEZ/How+to+Contribute+to+Tez.
>
> There seems to be to be two options to contribute: submitting a patch and
> submitting a pull request. If a patch is submitted it's harder to preserve
> authorship in the commit metadata since the user doing so may not have a
> github account. In this situation, for a general case, I can't think of
> anything better than the current approach which I understand consists of
> specifying the author in the commit message. But most of the commits are
> going to be made by people that we know of. It shouldn't be too hard for
> any of the committers to find out the github id and email.
>
> The second option is to open a pull request. I think for this case ideally
> we'd preserve authorship since we have all the necessary information to do
> so. A possible way of doing this that would be consistent with the commit
> history and with the previous options is to cherry-pick-squash from the PR
> and then commit to master.
>
> Jaume.
>


Re: [VOTE] Tez 0.10.0 release deadline

2018-07-20 Thread Jonathan Eagles
+1. Seems like a good plan

On Wed, Jul 18, 2018, 2:57 PM Eric Wohlstadter  wrote:

> Hi all,
>  I'd like to propose a deadline of July 27th for work on tickets to be
> included in the 0.10.0 release.
>
> Please +1 or -1 this proposal by July 20th. My understanding is that this
> proposal requires a Lazy Majority.
>


[jira] [Created] (TEZ-3970) NullPointerException in Tez ShuffleHandler Ranged Fetch

2018-07-12 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-3970:


 Summary: NullPointerException in Tez ShuffleHandler Ranged Fetch
 Key: TEZ-3970
 URL: https://issues.apache.org/jira/browse/TEZ-3970
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


TEZ-3954 introduced a NullPointerException by removing the map id from the 
outputInfo before subsequent partition heads are fetched. The patch moves the 
cleanup to after the loop to prevent the NPE and furthermore adds unit tests 
that verify correct behavior of ranged fetch.
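
The pattern behind this bug can be illustrated with a deliberately simplified sketch. This is not the ShuffleHandler code: the class RangedFetchCleanupSketch, the map of long[] partition lengths, and both method names are hypothetical stand-ins chosen only to show why cleaning up inside the loop breaks a ranged fetch, and why moving the cleanup after the loop avoids the NPE.

```java
import java.util.HashMap;
import java.util.Map;

public class RangedFetchCleanupSketch {

    // Buggy shape: the entry for mapId is removed inside the loop, so the
    // lookup for the second partition in the range returns null.
    static long sumWithEarlyCleanup(Map<String, long[]> outputInfo,
                                    String mapId, int first, int last) {
        long total = 0;
        for (int p = first; p <= last; p++) {
            long[] lengths = outputInfo.get(mapId);
            total += lengths[p];        // NullPointerException from the second pass on
            outputInfo.remove(mapId);   // cleanup happens too early
        }
        return total;
    }

    // Fixed shape: every partition in the range is read first, and the
    // entry is removed once, after the loop.
    static long sumWithLateCleanup(Map<String, long[]> outputInfo,
                                   String mapId, int first, int last) {
        long total = 0;
        for (int p = first; p <= last; p++) {
            long[] lengths = outputInfo.get(mapId);
            total += lengths[p];
        }
        outputInfo.remove(mapId);       // cleanup after the whole range is served
        return total;
    }

    public static void main(String[] args) {
        Map<String, long[]> info = new HashMap<>();
        info.put("map_0", new long[] {10L, 20L, 30L});
        System.out.println(sumWithLateCleanup(info, "map_0", 0, 2));
    }
}
```

Note that in this sketch the early-cleanup variant still works for a range of exactly one partition, so only a test that fetches a multi-partition range exposes the problem.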





[jira] [Created] (TEZ-3955) Upgrade hadoop dependency to 3.0.3

2018-06-15 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-3955:


 Summary: Upgrade hadoop dependency to 3.0.3
 Key: TEZ-3955
 URL: https://issues.apache.org/jira/browse/TEZ-3955
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles








[jira] [Created] (TEZ-3954) Reduce Tez Shuffle Handler Memory needs for holding TezIndexRecords

2018-06-15 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-3954:


 Summary: Reduce Tez Shuffle Handler Memory needs for holding 
TezIndexRecords
 Key: TEZ-3954
 URL: https://issues.apache.org/jira/browse/TEZ-3954
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


The Tez Shuffle Handler holds on to the entire SpillRecord to better accommodate 
range fetch operations. For a single reduce fetch, holding on to the SpillRecord 
can overwhelm memory if the Index Cache is overrun. In the case of a single 
fetch, keep only the single Tez index record needed; for a range fetch, keep the 
entire spill record in memory, but free it once it is no longer needed.




