[jira] [Created] (YARN-11162) Set the zk acl for nodes created by ZKConfigurationStore.

2022-05-23 Thread Owen O'Malley (Jira)
Owen O'Malley created YARN-11162:


 Summary: Set the zk acl for nodes created by ZKConfigurationStore.
 Key: YARN-11162
 URL: https://issues.apache.org/jira/browse/YARN-11162
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.10.1
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 2.10.2
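
The issue has no description beyond the summary, but the gist is that the znodes ZKConfigurationStore creates should carry an explicit ZooKeeper ACL rather than the open default. A minimal, illustrative sketch of creating a znode with an ACL using the plain ZooKeeper client API (the class and method names here are hypothetical, and this is not the store's actual implementation):

  import java.util.List;
  import org.apache.zookeeper.CreateMode;
  import org.apache.zookeeper.ZooDefs;
  import org.apache.zookeeper.ZooKeeper;
  import org.apache.zookeeper.data.ACL;

  class ZkAclSketch {
    // Create the node with CREATOR_ALL_ACL (requires an authenticated session)
    // instead of the world-writable OPEN_ACL_UNSAFE default.
    static void createWithAcl(ZooKeeper zk, String path, byte[] data) throws Exception {
      List<ACL> acl = ZooDefs.Ids.CREATOR_ALL_ACL;
      zk.create(path, data, acl, CreateMode.PERSISTENT);
    }
  }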









Re: [DISCUSS] About creation of Hadoop Thirdparty repository for shaded artifacts

2019-09-27 Thread Owen O'Malley
I'm very unhappy with this direction. In particular, I don't think git is a
good place for distribution of binary artifacts. Furthermore, the PMC
shouldn't be releasing anything without a release vote.

I'd propose that we make a third party module that contains the *source* of
the pom files to build the relocated jars. This should absolutely be
treated as a last resort for the (mostly Google) projects that regularly
break binary compatibility (e.g. Protobuf & Guava).

In terms of naming, I'd propose something like:

org.apache.hadoop.thirdparty.protobuf2_5
org.apache.hadoop.thirdparty.guava28

In particular, I think we absolutely need to include the version of the
underlying project. On the other hand, since we should not be shading
*everything* we can drop the leading com.google.
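
To illustrate the proposed naming (a hypothetical sketch only; the exact relocated package would be whatever the thirdparty build chooses), a Hadoop source file would switch its import from the Google package to the versioned, relocated one:

  // Before: compiled against the Google artifact directly.
  import com.google.protobuf.CodedInputStream;

  // After: the relocated class from the proposed thirdparty artifact,
  // using the package naming suggested above.
  import org.apache.hadoop.thirdparty.protobuf2_5.CodedInputStream;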

The Hadoop project can make releases of the thirdparty module:

  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-thirdparty-protobuf25</artifactId>
    <version>1.0</version>
  </dependency>

Note that the version has to be the Hadoop thirdparty release number, which
is part of why you need to have the underlying version in the artifact
name. These we can push to Maven Central as new releases from Hadoop.

Thoughts?

.. Owen

On Fri, Sep 27, 2019 at 8:38 AM Vinayakumar B 
wrote:

> Hi All,
>
>   I wanted to discuss the separate repo for thirdparty dependencies
> which we need to shade and include in Hadoop components' jars.
>
>Apologies for the big text ahead, but this needs clear explanation!!
>
>   Right now the most needed such dependency is protobuf. The protobuf
> dependency was not upgraded from 2.5.0 onwards for fear that downstream builds,
> which depend on the transitive protobuf dependency coming from Hadoop's jars,
> may fail with the upgrade. Apparently protobuf does not guarantee source
> compatibility, though it guarantees wire compatibility between versions.
> Because of this behavior, a version upgrade may cause breakage in known and
> unknown (private?) downstreams.
>
>   So to tackle this, we came up with the following proposal in HADOOP-13363.
>
>   Luckily, as far as I know, no APIs, either public to users or between
> Hadoop processes, directly use protobuf classes in signatures. (If
> any exist, please let us know.)
>
>Proposal:
>
>
>   1. Create artifact(s) which contain the shaded dependencies. All such
> shading/relocation will be done with a known prefix,
> **org.apache.hadoop.thirdparty.**.
>   2. To start with, a protobuf jar (ex: o.a.h.thirdparty:hadoop-shaded-protobuf)
> in which all **com.google.protobuf** classes are relocated to
> **org.apache.hadoop.thirdparty.com.google.protobuf**.
>   3. Hadoop modules which need protobuf as a dependency will add this
> shaded artifact as a dependency (ex:
> o.a.h.thirdparty:hadoop-shaded-protobuf).
>   4. All previous usages of "com.google.protobuf" will be relocated to
> "org.apache.hadoop.thirdparty.com.google.protobuf" in the code and
> committed. Please note, this replacement is a one-time change made directly
> in the source code, NOT during compile and package. (See the sketch after
> this list.)
>   5. Once all usages of "com.google.protobuf" are relocated, Hadoop no
> longer cares which version of the original "protobuf-java" is on the
> dependency tree.
>   6. Just keep "protobuf-java:2.5.0" in the dependency tree so the
> downstreams are not broken, but Hadoop itself will use the latest protobuf
> present in "o.a.h.thirdparty:hadoop-shaded-protobuf".
>
>   7. Coming back to the separate repo: the following are the main reasons
> for keeping the shaded dependency artifact in a separate repo instead of a
> submodule.
>
>   7a. These artifacts need not be built all the time. They need to be
> built only when there is a change in the dependency version or the build
> process.
>   7b. If added as a submodule in the Hadoop repo, maven-shade-plugin:shade
> executes only in the package phase. That means "mvn compile" or "mvn
> test-compile" would fail, because the artifact would not yet contain the
> relocated classes, only the original ones. The workaround is to build the
> thirdparty submodule first and exclude the "thirdparty" submodule in other
> executions, which is a more complex process than keeping it in a separate
> repo.
>
>   7c. The separate repo will be a subproject of Hadoop, using the same
> HADOOP jira project, with distinct versioning prefixed with "thirdparty-"
> (ex: thirdparty-1.0.0).
>   7d. The separate repo will have the same release process as Hadoop.
>
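To make item 4 above concrete: the one-time source relocation amounts to rewriting imports. A hypothetical sketch of a Hadoop source file before and after that commit (the class is just an example; the relocated package is the one named in the proposal):

  // Before the one-time relocation commit:
  import com.google.protobuf.ByteString;

  // After it, referencing the class from the shaded thirdparty artifact:
  import org.apache.hadoop.thirdparty.com.google.protobuf.ByteString;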
>
> HADOOP-13363 (https://issues.apache.org/jira/browse/HADOOP-13363) is an
> umbrella jira tracking the changes for the protobuf upgrade.
>
> A PR (https://github.com/apache/hadoop-thirdparty/pull/1) has been raised
> for the separate repo creation in HADOOP-16595
> (https://issues.apache.org/jira/browse/HADOOP-16595).
>
> Please provide your input on the proposal and review the PR so we can
> proceed.
>
>
>-Thanks,
> Vinay
>
> On Fri, Sep 27, 2019 at 11:54 AM Vinod Kumar Vavilapalli <
> vino...@apache.org>
> wrote:
>
> > Moving the 

Re: [VOTE] Moving Submarine to a separate Apache project proposal

2019-09-06 Thread Owen O'Malley
Since you don't have any Apache Members, I'll join to provide Apache
oversight.

.. Owen

On Fri, Sep 6, 2019 at 1:38 PM Owen O'Malley  wrote:

> +1 for moving to a new project.
>
> On Sat, Aug 31, 2019 at 10:19 PM Wangda Tan  wrote:
>
>> Hi all,
>>
>> As we discussed in the previous thread [1],
>>
>> I just moved the spin-off proposal to CWIKI and completed all TODO parts.
>>
>>
>> https://cwiki.apache.org/confluence/display/HADOOP/Submarine+Project+Spin-Off+to+TLP+Proposal
>>
>> If you are interested in learning more, please review the proposal and
>> let me know if you have any questions/suggestions. The proposal will be
>> sent to the board after the vote passes. (And please note that the
>> previous voting thread [2], to move Submarine to a separate GitHub repo,
>> is a necessary step toward moving Submarine to a separate Apache project
>> but not a sufficient one, so I sent two separate voting threads.)
>>
>> Please let me know if I missed anyone in the proposal, and reply if you'd
>> like to be included in the project.
>>
>> This voting runs for 7 days and will be concluded at Sep 7th, 11 PM PDT.
>>
>> Thanks,
>> Wangda Tan
>>
>> [1]
>>
>> https://lists.apache.org/thread.html/4a2210d567cbc05af92c12aa6283fd09b857ce209d537986ed800029@%3Cyarn-dev.hadoop.apache.org%3E
>> [2]
>>
>> https://lists.apache.org/thread.html/6e94469ca105d5a15dc63903a541bd21c7ef70b8bcff475a16b5ed73@%3Cyarn-dev.hadoop.apache.org%3E
>>
>


Re: [VOTE] Moving Submarine to a separate Apache project proposal

2019-09-06 Thread Owen O'Malley
+1 for moving to a new project.

On Sat, Aug 31, 2019 at 10:19 PM Wangda Tan  wrote:

> Hi all,
>
> As we discussed in the previous thread [1],
>
> I just moved the spin-off proposal to CWIKI and completed all TODO parts.
>
>
> https://cwiki.apache.org/confluence/display/HADOOP/Submarine+Project+Spin-Off+to+TLP+Proposal
>
> If you are interested in learning more, please review the proposal and
> let me know if you have any questions/suggestions. The proposal will be
> sent to the board after the vote passes. (And please note that the
> previous voting thread [2], to move Submarine to a separate GitHub repo,
> is a necessary step toward moving Submarine to a separate Apache project
> but not a sufficient one, so I sent two separate voting threads.)
>
> Please let me know if I missed anyone in the proposal, and reply if you'd
> like to be included in the project.
>
> This voting runs for 7 days and will be concluded at Sep 7th, 11 PM PDT.
>
> Thanks,
> Wangda Tan
>
> [1]
>
> https://lists.apache.org/thread.html/4a2210d567cbc05af92c12aa6283fd09b857ce209d537986ed800029@%3Cyarn-dev.hadoop.apache.org%3E
> [2]
>
> https://lists.apache.org/thread.html/6e94469ca105d5a15dc63903a541bd21c7ef70b8bcff475a16b5ed73@%3Cyarn-dev.hadoop.apache.org%3E
>


[NOTIFICATION] Hadoop trunk rebased

2018-04-26 Thread Owen O'Malley
As we discussed in hdfs-dev@hadoop, I did a force push to Hadoop's trunk to
replace the Ozone merge with a rebase.

That means that you'll need to rebase your branches.

.. Owen


Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-14 Thread Owen O'Malley
This discussion seems to have died down, coming closer to consensus but
without a resolution.

I'd like to propose the following compromise:

* HDSL becomes a subproject of Hadoop.
* HDSL will release separately from Hadoop. Hadoop releases will not
contain HDSL and vice versa.
* HDSL will get its own jira instance so that the release tags stay
separate.
* On trunk (as opposed to release branches) HDSL will be a separate module
in Hadoop's source tree. This will enable the HDSL team to work on their
trunk and the Hadoop trunk without making releases for every change.
* Hadoop's trunk will only build HDSL if a non-default profile is enabled.
* When Hadoop creates a release branch, the RM will delete the HDSL module
from the branch.
* HDSL will have their own Yetus checks and won't cause failures in the
Hadoop patch check.

I think this accomplishes most of the goals of encouraging HDSL development
while minimizing the potential for disruption of HDFS development.

Thoughts? Andrew, Jitendra, & Sanjay?

Thanks,
   Owen


Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-09 Thread Owen O'Malley
Hi Joep,

On Tue, Mar 6, 2018 at 6:50 PM, J. Rottinghuis 
wrote:

> Obviously when people do want to use Ozone, then having it in the same repo
> is easier. The flipside is that, separate top-level project in the same
> repo or not, it adds to the Hadoop releases.
>

Apache projects are about the group of people who are working together.
There is a large overlap between the team working on HDFS and Ozone, which
is a lot of the motivation to keep project overhead to a minimum and not
start a new project.

Using the same releases or separate releases is a distinct choice. Many
Apache projects, such as Commons and Maven, have multiple artifacts that
release independently. In Hive, we have two sub-projects that release
independently: Hive Storage API and Hive.

One thing we did during that split to minimize the challenges to the
developers was that Storage API and Hive have the same master branch.
However, since they have different releases, they have their own release
branches and release numbers.

> If there is a change in Ozone and a new release is needed, it would have to
> wait for a Hadoop release. Ditto if there is a Hadoop release and there is
> an issue with Ozone. The case that one could turn off Ozone through a Maven
> profile works only to some extent.
> If we have done a 3.x release with Ozone in it, would it make sense to do
> a 3.y release with y>x without Ozone in it? That would be weird.
>

Actually, if Ozone is marked as unstable/evolving (we should actually have
an even stronger warning for a feature preview), we could remove it in a
3.x. If a user picks up a feature before it is stable, we try to provide a
stable platform, but mistakes happen. Introducing an incompatible change to
the Ozone API between 3.1 and 3.2 wouldn't be good, but it wouldn't be the
end of the world.

.. Owen


Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-02 Thread Owen O'Malley
On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang 
wrote:

> Owen mentioned making a Hadoop subproject; we'd have to
> hash out what exactly this means (I assume a separate repo still managed by
> the Hadoop project), but I think we could make this work if it's more
> attractive than incubation or a new TLP.


Ok, there are multiple levels of sub-projects that all make sense:

   - Same source tree, same releases - examples like HDFS & YARN
   - Same master branch, separate releases and release branches - Hive's
   Storage API vs Hive. It is in the source tree for the master branch, but
   has distinct releases and release branches.
   - Separate source, separate release - Apache Commons.

There are advantages and disadvantages to each. I'd propose that we use the
same source, same release pattern for Ozone. Note that we tried and later
reverted doing Common, HDFS, and YARN as separate source, separate release
because it was too much trouble. I like Daryn's idea of putting it as a top
level directory in Hadoop and making sure that nothing in Common, HDFS, or
YARN depends on it. That way if a Release Manager doesn't think it is ready
for release, it can be trivially removed before the release.

One thing about using the same releases: Sanjay and Jitendra are signing up
to make much more regular bugfix and minor releases in the near future. For
example, they'll need to make 3.2 relatively soon to get it released and
then 3.3 somewhere in the next 3 to 6 months. That would be good for the
project. Hadoop needs more regular releases and fewer big bang releases.

.. Owen


Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-01 Thread Owen O'Malley
I think it would be good to get this in sooner rather than later, but I
have some thoughts.

   1. It is hard to tell what has changed. git rebase -i tells me the
   branch has 722 commits. The rebase failed with a conflict. It would really
   help if you rebased to current trunk.
   2. I think Ozone would be a good Hadoop subproject, but it should be
   outside of HDFS.
   3. CBlock, which is also coming in this merge, would benefit from more
   separation from HDFS.
   4. What are the new transitive dependencies that Ozone, HDSL, and CBlock
   are adding to the clients? The servers matter too, but the client dependencies
   have a huge impact on our users.
   5. Have you checked the new dependencies for compatibility with ASL?


On Thu, Mar 1, 2018 at 2:45 PM, Clay B.  wrote:

> Oops, retrying now subscribed to more than solely yarn-dev.
>
> -Clay
>
>
> On Wed, 28 Feb 2018, Clay B. wrote:
>
> +1 (non-binding)
>>
>> I have walked through the code and find it very compelling as a user; I
>> really look forward to seeing the Ozone code mature and it maturing HDFS
>> features together. The points which excite me as an eight year HDFS user
>> are:
>>
>> * Excitement for making the datanode a storage technology container - this
>>  patch clearly brings fresh thought to HDFS keeping it from growing stale
>>
>> * Ability to build upon a shared storage infrastructure for diverse
>>  loads: I do not want to have "stranded" storage capacity or have to
>>  manage competing storage systems on the same disks (and further I want
>>  the metrics datanodes can provide me today, so I do not have to
>>  instrument two systems or evolve their instrumentation separately).
>>
>> * Looking forward to supporting object-sized files!
>>
>> * Moves HDFS in the right direction to test out new block management
>>  techniques for scaling HDFS. I am really excited to see the Raft
>>  integration; I hope it opens a new era in Hadoop matching modern systems
>>  design with new consistency and replication options in our ever
>>  distributed ecosystem.
>>
>> -Clay
>>
>> On Mon, 26 Feb 2018, Jitendra Pandey wrote:
>>
>>Dear folks,
>>>   We would like to start a vote to merge HDFS-7240 branch into
>>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>>> jiras (See references below).
>>>
>>>HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
>>> is a distributed, replicated block layer.
>>>The old HDFS namespace and NN can be connected to this new block
>>> layer as we have described in HDFS-10419.
>>>We also introduce a key-value namespace called Ozone built on HDSL.
>>>
>>>The code is in a separate module and is turned off by default. In a
>>> secure setup, HDSL and Ozone daemons cannot be started.
>>>
>>>The detailed documentation is available at
>>> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
>>>
>>>
>>>I will start with my vote.
>>>+1 (binding)
>>>
>>>
>>>Discussion Thread:
>>>  https://s.apache.org/7240-merge
>>>  https://s.apache.org/4sfU
>>>
>>>Jiras:
>>>   https://issues.apache.org/jira/browse/HDFS-7240
>>>   https://issues.apache.org/jira/browse/HDFS-10419
>>>   https://issues.apache.org/jira/browse/HDFS-13074
>>>   https://issues.apache.org/jira/browse/HDFS-13180
>>>
>>>
>>>Thanks
>>>jitendra
>>>
>>>
>>>
>>>
>>>
>>>DISCUSSION THREAD SUMMARY :
>>>
>>>On 2/13/18, 6:28 PM, "sanjay Radia" 
>>> wrote:
>>>
>>>Sorry the formatting got messed by my email client.  Here
>>> it is again
>>>
>>>
>>>Dear Hadoop Community Members,
>>>
>>>   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread. We
>>> express our gratitude for participation and valuable comments.
>>>
>>>The key questions raised were the following:
>>>1) How do the new block storage layer and OzoneFS benefit
>>> HDFS? We were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer.
>>>2) We were asked to provide a security design.
>>>3) There were questions around stability, given Ozone
>>> brings in a large body of code.
>>>4) Why can't they be separate projects forever, or be merged
>>> in when production ready?
>>>
>>>We have responded to all the above questions with
>>> detailed explanations and answers on the jira as well as in the
>>> discussions. We believe that should sufficiently address the community's
>>> concerns.
>>>
>>>Please see the summary below:
>>>
>>>1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>>
>>>