Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-16 Thread sanjay Radia

> On Mar 5, 2018, at 4:08 PM, Andrew Wang  wrote:
> 
> - NN on top of HDSL, where the NN uses the new block layer (both Daryn and Owen 
> acknowledge the benefit of the new block layer). We have two choices here:
>  ** a) Evolve NN so that it can interact with both old and new block layer,
>  **  b) Fork and create new NN that works only with new block layer, the old 
> NN will continue to work with old block layer.
> There are trade-offs, but clearly the 2nd option has the least impact on the old 
> HDFS code.
> 
> Are you proposing that we pursue the 2nd option to integrate HDSL with HDFS?


Originally I would have preferred (a), but Owen made a strong case for (b) in my 
discussions with him last week.
Overall we need a broader discussion around the next steps for NN evolution and 
how to chart the course; I am not locked into any particular path or how we 
would do it. 
Let me post a more detailed response in HDFS-10419.

sanjay



Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-09 Thread sanjay Radia
Joep, you raise a number of points:

(1) Ozone vs. other object stores. “Some users would choose Ozone as that layer, 
some might use S3, others GCS, or Azure, or something else”.
(2) How HDSL/Ozone fits into Hadoop and whether it is necessary.
(3) The release issue, to which we will respond in a separate email.

Let me respond to (1) and (2):
***Wrt (1): Ozone vs other object stores***
Neither HDFS nor Ozone has any real role in the cloud except for temp data. The 
cost of local disk or EBS is so high that long-term data storage on HDFS or even 
Ozone is prohibitive.
So why create the KV namespace at all? We need to stabilize HDSL, where the 
data is stored. We are targeting Hive and Spark apps to stabilize HDSL using 
real Hadoop apps over OzoneFS.
But HDSL/Ozone is not feature-compatible with HDFS, so how will users exercise 
it enough to stabilize it? Users can run HDFS and Ozone side by side in the same 
cluster with two namespaces (just like in Federation) and run apps on both: 
some Hive and Spark apps on Ozone, and others that need full HDFS features 
(e.g., encryption) on HDFS. As the new system becomes stable, they can start 
using HDSL/Ozone in production for a portion of their data.
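To make the side-by-side picture concrete, here is a minimal client-side sketch. 
It is illustrative only: the o3fs:// scheme, the bucket.volume authority layout 
and the class name are my assumptions, not the merged branch's exact API; the 
real URI format comes from the OzoneFS plugin configuration.

// A hedged sketch of the side-by-side usage described above: one client
// working against both the existing HDFS namespace and an OzoneFS
// namespace in the same cluster.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SideBySideNamespaces {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem hdfs = FileSystem.get(URI.create("hdfs://nn:8020/"), conf);
    FileSystem ozone = FileSystem.get(URI.create("o3fs://bucket.volume/"), conf);
    // Workloads needing full HDFS features (e.g. encryption) stay on HDFS.
    hdfs.mkdirs(new Path("/secure/warehouse"));
    // Hive/Spark test workloads exercise the new block layer through OzoneFS.
    ozone.mkdirs(new Path("/warehouse"));
  }
}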



***Wrt (2): HDSL/Ozone fitting into Hadoop and why the same repository***
Ozone KV is a temporary step. The real goal is to put the NN on top of HDSL; we 
have shown how to do that in the roadmap that Konstantine and Chris D asked for. 
Milestone 1 is feasible and doesn't require removal of the FSN lock. We have also 
shown several cases of sharing other code in the future (the protocol engine). 
This co-development will be easier if done in the same repo. Over time, HDSL plus 
the ported NN will create a new HDFS and become feature-compatible; some features 
will come for free because they are in the NN and will port over to the new NN, 
while others are in the block layer (erasure coding) and will have to be added 
to HDSL.

--- You compare with Yarn, HDFS and Common. HDFS and Yarn are independent, but 
both depend on Hadoop common (e.g., HBase runs on HDFS without Yarn). HDSL and 
Ozone will depend on Hadoop common; indeed, the new protocol engine of HDSL 
might move to Hadoop common or HDFS. We have made sure that there are currently 
no dependencies of HDFS on HDSL or Ozone.


***The Repo issue and conclusion***
The HDFS community will need to work together as we evolve old HDFS to use HDSL, 
the new protocol engine and Raft, and together evolve to a newer, more powerful 
set of sub-components. It is important that they are in the same repo and that 
we can share code through private interfaces. We are not trying to build a 
competing object store but to improve HDFS; fixing scalability fundamentally is 
hard, and we are asking for an environment in which that can happen easily over 
the next year while heeding the stability concerns of HDFS developers (e.g., we 
removed the compile-time dependency and added a maven profile). This work is not 
being done by members of a foreign project trying to insert code into Hadoop, 
but by Hadoop/HDFS developers with proven track records and active participation 
in Hadoop and HDFS. Our jobs depend on HDFS/Hadoop stability - destabilizing it 
is the last thing we want to do; we have responded to every piece of 
constructive feedback.


sanjay


> On Mar 6, 2018, at 6:50 PM, J. Rottinghuis  wrote:
> 
> Sorry for jumping in late into the fray of this discussion.
> 
> It seems Ozone is a large feature. I appreciate the development effort and
> the desire to get this into the hands of users.
> I understand the need to iterate quickly and to reduce overhead for
> development.
> I also agree that Hadoop can benefit from a quicker release cycle. For our
> part, this is a challenge as we have a large installation with multiple
> clusters and thousands of users. It is a constant balance between jumping
> to the newest release and the cost of this integration and test at our
> scale, especially when things aren't backwards compatible. We try to be
> good citizens and upstream our changes and contribute back.
> 
> The point was made that splitting the projects such as common and Yarn
> didn't work and had to be reverted. That was painful and a lot of work for
> those involved for sure. This project may be slightly different in that
> hadoop-common, Yarn and HDFS made for one consistent whole. One couldn't
> run a project without the other.
> 
> Having a separate block management layer with possibly multiple block
> implementations pluggable under the covers would be a good future
> development for HDFS. Some users would choose Ozone as that layer, some
> might use S3, others GCS, or Azure, or something else.
> If the argument is made that nobody will be able to run Hadoop as a
> consistent stack without Ozone, then that would be a strong case to keep
> things in the same repo.
> 
> Obviously when people do want to use Ozone, then having it in the same repo
> is easier. The flipside is that, separate top-level project in the same
> repo or not, it adds to the Hadoop 

Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-05 Thread sanjay Radia
 hdfs but ignore it due to anticipation of its
>> departure/demise.  I’m not implying that’s currently happening, it’s just
>> what I don’t want to see.
>> 
>> 
>> We as a community and our customers need an evolution, not a revolution,
>> and definitively not a civil war.  Hdfs has too much legacy code rot that
>> is hard to change.  Too many poorly implemented features.   Perhaps I’m
>> overly optimistic that freshly redesigned code can counterbalance
>> performance degradations in the NN.  I’m also reluctant, but realize it is
>> being driven by some hdfs veterans that know/understand historical hdfs
>> design strengths and flaws.
>> 
>> 
>> If the initially cited issues are addressed, I’m +0.5 for the concept of
>> bringing in ozone if it's not going to be a proverbial bull in the china
>> shop.
>> 
>> 
>> Daryn
>> 
>> On Mon, Feb 26, 2018 at 3:18 PM, Jitendra Pandey <jiten...@hortonworks.com>
>> wrote:
>> 
>>>Dear folks,
>>>   We would like to start a vote to merge HDFS-7240 branch into
>>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>>> jiras (See references below).
>>> 
>>>HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which is
>>> a distributed, replicated block layer.
>>>The old HDFS namespace and NN can be connected to this new block layer
>>> as we have described in HDFS-10419.
>>>We also introduce a key-value namespace called Ozone built on HDSL.
>>> 
>>>The code is in a separate module and is turned off by default. In a
>>> secure setup, HDSL and Ozone daemons cannot be started.
>>> 
>>>The detailed documentation is available at
>>> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
>>> 
>>> 
>>>I will start with my vote.
>>>+1 (binding)
>>> 
>>> 
>>>Discussion Thread:
>>>  https://s.apache.org/7240-merge
>>>  https://s.apache.org/4sfU
>>> 
>>>Jiras:
>>>   https://issues.apache.org/jira/browse/HDFS-7240
>>>   https://issues.apache.org/jira/browse/HDFS-10419
>>>   https://issues.apache.org/jira/browse/HDFS-13074
>>>   https://issues.apache.org/jira/browse/HDFS-13180
>>> 
>>> 
>>>Thanks
>>>jitendra
>>> 
>>> 
>>> 
>>> 
>>> 
>>>DISCUSSION THREAD SUMMARY :
>>> 
>>>On 2/13/18, 6:28 PM, "sanjay Radia" <sanjayo...@gmail.com>
>>> wrote:
>>> 
>>>Sorry the formatting got messed up by my email client.  Here
>>> it is again
>>> 
>>> 
>>>Dear Hadoop Community Members,
>>> 
>>>   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread. We
>>> express our gratitude for participation and valuable comments.
>>> 
>>>The key questions raised were the following
>>>1) How the new block storage layer and OzoneFS benefit
>>> HDFS and we were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer
>>>2) We were asked to provide a security design
>>>3) There were questions around stability given ozone brings
>>> in a large body of code.
>>>4) Why can’t they be separate projects forever or merged
>>> in when production ready?
>>> 
>>>We have responded to all the above questions with detailed
>>> explanations and answers on the jira as well as in the discussions. We
>>> believe that should sufficiently address the community's concerns.
>>> 
>>>Please see the summary below:
>>> 
>>>1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>> 
>>>Summary:
>>>  - New block storage layer addresses the scalability of
>>> the block layer. We have shown how the existing NN can be connected to the new
>>> block layer and its benefits. We have shown 2 milestones; the 1st milestone is

Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-02-28 Thread sanjay Radia
equisite for HDFS-on-HDSL to be possible. Finally, I earnestly believe
>> that Ozone/HDSL itself would benefit from being a separate project. Ozone
>> could release faster and iterate more quickly if it wasn't hampered by
>> Hadoop's release schedule and security and compatibility requirements.
>> There are also publicity and community benefits; it's an opportunity to
>> build a community focused on the novel capabilities and architectural
>> choices of Ozone/HDSL. There are examples of other projects that were
>> "incubated" on a branch in the Hadoop repo before being spun off to great
>> success. In conclusion, I'd like to see Ozone succeeding and thriving as a
>> separate project. Meanwhile, we can work on the HDFS refactoring required
>> to separate the FSN and BM and make it pluggable. At that point (likely in
>> the Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.
>> Best,
>> Andrew
>> 
>> On Mon, Feb 26, 2018 at 1:18 PM, Jitendra Pandey <jiten...@hortonworks.com>
>> wrote:
>> 
>>>Dear folks,
>>>   We would like to start a vote to merge HDFS-7240 branch into
>>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>>> jiras (See references below).
>>> 
>>>HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
>>> is a distributed, replicated block layer.
>>>The old HDFS namespace and NN can be connected to this new block
>>> layer as we have described in HDFS-10419.
>>>We also introduce a key-value namespace called Ozone built on HDSL.
>>> 
>>>The code is in a separate module and is turned off by default. In a
>>> secure setup, HDSL and Ozone daemons cannot be started.
>>> 
>>>The detailed documentation is available at
>>> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
>>> 
>>> 
>>>I will start with my vote.
>>>+1 (binding)
>>> 
>>> 
>>>Discussion Thread:
>>>  https://s.apache.org/7240-merge
>>>  https://s.apache.org/4sfU
>>> 
>>>Jiras:
>>>   https://issues.apache.org/jira/browse/HDFS-7240
>>>   https://issues.apache.org/jira/browse/HDFS-10419
>>>   https://issues.apache.org/jira/browse/HDFS-13074
>>>   https://issues.apache.org/jira/browse/HDFS-13180
>>> 
>>> 
>>>Thanks
>>>jitendra
>>> 
>>> 
>>> 
>>> 
>>> 
>>>DISCUSSION THREAD SUMMARY :
>>> 
>>>On 2/13/18, 6:28 PM, "sanjay Radia" <sanjayo...@gmail.com>
>>> wrote:
>>> 
>>>Sorry the formatting got messed up by my email client.  Here
>>> it is again
>>> 
>>> 
>>>Dear Hadoop Community Members,
>>> 
>>>   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread. We
>>> express our gratitude for participation and valuable comments.
>>> 
>>>The key questions raised were the following
>>>1) How the new block storage layer and OzoneFS benefit
>>> HDFS and we were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer
>>>2) We were asked to provide a security design
>>>3) There were questions around stability given ozone
>>> brings in a large body of code.
>>>4) Why can’t they be separate projects forever or merged
>>> in when production ready?
>>> 
>>>We have responded to all the above questions with
>>> detailed explanations and answers on the jira as well as in the
>>> discussions. We believe that should sufficiently address the community's
>>> concerns.
>>> 
>>>Please see the summary below:
>>> 
>>>1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>> 
>>>Summary:
>>>  - New block storage layer addresses the scalability of
>>> the block layer. We have shown how existing NN can be connected to the new
>>> block layer and its benefits. We have shown 2 mileston

Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2018-02-20 Thread sanjay Radia



Konstantine


Thanks for your feedback and comments over the last few months. Have we 
addressed all your issues and concerns?

sanjay


> On Feb 13, 2018, at 6:28 PM, sanjay Radia <sanjayo...@gmail.com> wrote:
> 
> Sorry the formatting got messed up by my email client.  Here it is again
> 
> 
> Dear Hadoop Community Members,
> 
>  We had multiple community discussions, a few meetings in smaller groups and 
> also jira discussions with respect to this thread. We express our gratitude 
> for participation and valuable comments. 
> 
> The key questions raised were the following
> 1) How the new block storage layer and OzoneFS benefit HDFS and we were asked 
> to chalk out a roadmap towards the goal of a scalable namenode working with 
> the new storage layer
> 2) We were asked to provide a security design
> 3) There were questions around stability given ozone brings in a large body of 
> code.
> 4) Why can’t they be separate projects forever or merged in when production 
> ready?
> 
> We have responded to all the above questions with detailed explanations and 
> answers on the jira as well as in the discussions. We believe that should 
> sufficiently address the community's concerns. 
> 
> Please see the summary below:
> 
> 1) The new code base benefits HDFS scaling and a roadmap has been provided. 
> 
> Summary:
> - New block storage layer addresses the scalability of the block layer. We 
> have shown how existing NN can be connected to the new block layer and its 
> benefits. We have shown 2 milestones, 1st milestone is much simpler than 2nd 
> milestone while giving almost the same scaling benefits. Originally we had 
> proposed only milestone 2, and the community felt that removing the FSN/BM 
> lock was a fair amount of work and that a simpler solution would be useful.
> - We provide a new K-V namespace called Ozone FS with FileSystem/FileContext 
> plugins to allow the users to use the new system. BTW Hive and Spark work 
> very well on KV-namespaces on the cloud. This will facilitate stabilizing the 
> new block layer. 
> - The new block layer has a new netty-based protocol engine in the Datanode 
> which, when stabilized, can be used by the old HDFS block layer. See details 
> below on sharing of code.
> 
> 
> 2) Stability impact on the existing HDFS code base and code separation. The 
> new block layer and the OzoneFS are in modules that are separate from old 
> HDFS code - currently there are no calls from HDFS into Ozone except for DN 
> starting the new block  layer module if configured to do so. It does not add 
> instability (the instability argument has been raised many times). Over time 
> as we share code, we will ensure that the old HDFS continues to remain 
> stable. (for example we plan to stabilize the new netty based protocol engine 
> in the new block layer before sharing it with HDFS’s old block layer)
> 
> 
> 3) In the short term and medium term, the new system and HDFS will be used 
> side-by-side by users. Side-by-side usage in the short term for testing and 
> side-by-side in the medium term for actual production use till the new system 
> has feature parity with old HDFS. During this time, sharing the DN daemon and 
> admin functions between the two systems is operationally important:  
> - Sharing DN daemon to avoid additional operational daemon lifecycle 
> management
> - Common decommissioning of the daemon and DN: One place to decommission for 
> a node and its storage.
> - Replacing failed disks and internal balancing capacity across disks - this 
> needs to be done for both the current HDFS blocks and the new block-layer 
> blocks.
> - Balancer: we would like to use the same balancer and provide a common way to 
> balance and common management of the bandwidth used for balancing
> - Security configuration setup - reuse the existing setup for DNs rather than a 
> new one for an independent cluster.
> 
> 
> 4) Need to easily share the block layer code between the two systems when 
> used side-by-side. Areas where sharing code is desired over time: 
> - Sharing the new block layer's new netty-based protocol engine for old HDFS DNs 
> (a long-time sore issue for the HDFS block layer). 
> - Shallow data copy from the old system to the new one is practical only if within 
> the same project and daemon; otherwise one has to deal with security settings and 
> coordination across daemons. Shallow copy is useful as customers migrate from 
> old to new.
> - Shared disk scheduling in the future and in the short term have a single 
> round robin rather than independent round robins.
> While sharing code across projects is technically possible (anything is 
> possible in software), it is significantly harder, typically requiring 
> cleaner public apis e

Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2018-02-13 Thread sanjay Radia
Sorry the formatting got messed up by my email client.  Here it is again


Dear Hadoop Community Members,

   We had multiple community discussions, a few meetings in smaller groups and 
also jira discussions with respect to this thread. We express our gratitude for 
participation and valuable comments. 

The key questions raised were the following
1) How the new block storage layer and OzoneFS benefit HDFS and we were asked 
to chalk out a roadmap towards the goal of a scalable namenode working with the 
new storage layer
2) We were asked to provide a security design
3) There were questions around stability given ozone brings in a large body of 
code.
4) Why can’t they be separate projects forever or merged in when production 
ready?

We have responded to all the above questions with detailed explanations and 
answers on the jira as well as in the discussions. We believe that should 
sufficiently address the community's concerns. 

Please see the summary below:

1) The new code base benefits HDFS scaling and a roadmap has been provided. 

Summary:
  - New block storage layer addresses the scalability of the block layer. We 
have shown how existing NN can be connected to the new block layer and its 
benefits. We have shown 2 milestones, 1st milestone is much simpler than 2nd 
milestone while giving almost the same scaling benefits. Originally we had 
proposed only milestone 2, and the community felt that removing the FSN/BM 
lock was a fair amount of work and that a simpler solution would be useful.
  - We provide a new K-V namespace called Ozone FS with FileSystem/FileContext 
plugins to allow the users to use the new system. BTW Hive and Spark work very 
well on KV-namespaces on the cloud. This will facilitate stabilizing the new 
block layer. 
  - The new block layer has a new netty-based protocol engine in the Datanode 
which, when stabilized, can be used by the old HDFS block layer. See details 
below on sharing of code.


2) Stability impact on the existing HDFS code base and code separation. The new 
block layer and the OzoneFS are in modules that are separate from old HDFS code 
- currently there are no calls from HDFS into Ozone except for DN starting the 
new block  layer module if configured to do so. It does not add instability 
(the instability argument has been raised many times). Over time as we share 
code, we will ensure that the old HDFS continues to remain stable (for 
example we plan to stabilize the new netty based protocol engine in the new 
block layer before sharing it with HDFS’s old block layer)


3) In the short term and medium term, the new system and HDFS will be used 
side-by-side by users. Side-by-side usage in the short term for testing and 
side-by-side in the medium term for actual production use till the new system 
has feature parity with old HDFS. During this time, sharing the DN daemon and 
admin functions between the two systems is operationally important:  
  - Sharing DN daemon to avoid additional operational daemon lifecycle 
management
  - Common decommissioning of the daemon and DN: One place to decommission for 
a node and its storage.
  - Replacing failed disks and internal balancing capacity across disks - this 
needs to be done for both the current HDFS blocks and the new block-layer 
blocks.
  - Balancer: we would like to use the same balancer and provide a common way to 
balance and common management of the bandwidth used for balancing
  - Security configuration setup - reuse the existing setup for DNs rather than a 
new one for an independent cluster.


4) Need to easily share the block layer code between the two systems when used 
side-by-side. Areas where sharing code is desired over time: 
  - Sharing the new block layer's new netty-based protocol engine for old HDFS DNs 
(a long-time sore issue for the HDFS block layer). 
  - Shallow data copy from the old system to the new one is practical only if 
within the same project and daemon; otherwise one has to deal with security 
settings and coordination across daemons. Shallow copy is useful as customers 
migrate from old to new.
  - Shared disk scheduling in the future and in the short term have a single 
round robin rather than independent round robins.
While sharing code across projects is technically possible (anything is 
possible in software), it is significantly harder, typically requiring cleaner 
public APIs etc. Sharing within a project through internal APIs is often simpler 
(such as the protocol engine that we want to share).


5) Security design, including a threat model and the solution, has been 
posted.
6) Temporary Separation and merge later: Several of the comments in the jira 
have argued that we temporarily separate the two code bases for now and then 
later merge them when the new code is stable:

  - If there is agreement to merge later, why bother separating now - there 
need to be good reasons to separate now. We have addressed the stability and 
separation of the new code from the existing code above.
  - Merge 

Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2018-02-13 Thread sanjay Radia
Dear Hadoop Community Members,

We had multiple community discussions, a few meetings in smaller groups and 
also jira discussions with respect to this thread. We express our gratitude for 
participation and valuable comments. The key questions raised were the following:
1) How the new block storage layer and OzoneFS benefit HDFS and we were asked 
to chalk out a roadmap towards the goal of a scalable namenode working with the 
new storage layer
2) We were asked to provide a security design
3) There were questions around stability given ozone brings in a large body of 
code.
4) Why can't they be separate projects forever or merged in when production ready?

We have responded to all the above questions with detailed explanations and 
answers on the jira as well as in the discussions. We believe that should 
sufficiently address the community's concerns. Please see the summary below:

1) The new code base benefits HDFS scaling and a roadmap has been provided. 
Summary:
- New block storage layer addresses the scalability of the block layer. We have 
shown how the existing NN can be connected to the new block layer and its 
benefits. We have shown 2 milestones; the 1st milestone is much simpler than 
the 2nd while giving almost the same scaling benefits. Originally we had 
proposed only milestone 2, and the community felt that removing the FSN/BM 
lock was a fair amount of work and that a simpler solution would be useful.
- We provide a new K-V namespace called Ozone FS with FileSystem/FileContext 
plugins to allow the users to use the new system. BTW Hive and Spark work very 
well on KV-namespaces on the cloud. This will facilitate stabilizing the new 
block layer. 
- The new block layer has a new netty-based protocol engine in the Datanode 
which, when stabilized, can be used by the old HDFS block layer. See details 
below on sharing of code.

2) Stability impact on the existing HDFS code base and code separation. The new 
block layer and the OzoneFS are in modules that are separate from the old HDFS 
code - currently there are no calls from HDFS into Ozone except for the DN 
starting the new block layer module if configured to do so. It does not add 
instability (the instability argument has been raised many times). Over time as 
we share code, we will ensure that the old HDFS continues to remain stable (for 
example, we plan to stabilize the new netty-based protocol engine in the new 
block layer before sharing it with HDFS's old block layer).

3) In the short term and medium term, the new system and HDFS will be used 
side-by-side by users. Side-by-side usage in the short term for testing and 
side-by-side in the medium term for actual production use till the new system 
has feature parity with old HDFS. During this time, sharing the DN daemon and 
admin functions between the two systems is operationally important:
- Sharing the DN daemon to avoid additional operational daemon lifecycle 
management
- Common decommissioning of the daemon and DN: one place to decommission for a 
node and its storage.
- Replacing failed disks and internal balancing of capacity across disks - this 
needs to be done for both the current HDFS blocks and the new block-layer 
blocks.
- Balancer: we would like to use the same balancer and provide a common way to 
balance, and common management of the bandwidth used for balancing.
- Security configuration setup - reuse the existing setup for DNs rather than a 
new one for an independent cluster.

4) Need to easily share the block layer code between the two systems when used 
side-by-side. Areas where sharing code is desired over time: 
- Sharing the new block layer's new netty-based protocol engine for old HDFS 
DNs (a long-time sore issue for the HDFS block layer). 
- Shallow data copy from the old system to the new one is practical only if 
within the same project and daemon; otherwise one has to deal with security 
settings and coordination across daemons. Shallow copy is useful as customers 
migrate from old to new.
- Shared disk scheduling in the future; in the short term, have a single round 
robin rather than independent round robins.
While sharing code across projects is technically possible (anything is 
possible in software), it is significantly harder, typically requiring cleaner 
public APIs etc. Sharing within a project through internal APIs is often 
simpler (such as the protocol engine that we want to share).

5) Security design, including a threat model and the solution, has been posted.

6) Temporary separation and merge later: several of the comments in the jira 
have argued that we temporarily separate the two code bases for now and then 
later merge them when the new code is stable:
- If there is agreement to merge later, why bother separating now - there need 
to be good reasons to separate now. We have addressed the stability and 
separation of the new code from the existing code above.
- Merging the new code back into HDFS later will be harder. 
- The code and goals will diverge further. 
- We will be taking on extra work to split and then take extra work to

Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2017-11-03 Thread sanjay Radia
Konstantine, 
 Thanks for your comments, questions and feedback. I have attached a document 
to the HDFS-7240 jira that explains a design for scaling HDFS and how Ozone 
paves the way towards the full solution.


https://issues.apache.org/jira/secure/attachment/12895963/HDFS%20Scalability%20and%20Ozone.pdf


sanjay




> On Oct 28, 2017, at 2:00 PM, Konstantin Shvachko  wrote:
> 
> Hey guys,
> 
> It is an interesting question whether Ozone should be a part of Hadoop.
> There are two main reasons why I think it should not.
> 
> 1. With close to 500 sub-tasks, with 6 MB of code changes, and with a
> sizable community behind, it looks to me like a whole new project.
> It is essentially a new storage system, with different (than HDFS)
> architecture, separate S3-like APIs. This is really great - the World sure
> needs more distributed file systems. But it is not clear why Ozone should
> co-exist with HDFS under the same roof.
> 
> 2. Ozone is probably just the first step in rebuilding HDFS under a new
> architecture. With the next steps presumably being HDFS-10419 and
> HDFS-8.
> The design doc for the new architecture has never been published. I can
> only assume based on some presentations and personal communications that
> the idea is to use Ozone as a block storage, and re-implement NameNode, so
> that it stores only a partial namespace in memory, while the bulk of it
> (cold data) is persisted to a local storage.
> Such architecture makes me wonder if it solves Hadoop's main problems.
> There are two main limitations in HDFS:
>  a. The throughput of Namespace operations. Which is limited by the number
> of RPCs the NameNode can handle
>  b. The number of objects (files + blocks) the system can maintain. Which
> is limited by the memory size of the NameNode.
> The RPC performance (a) is more important for Hadoop scalability than the
> object count (b). The read RPCs being the main priority.
> The new architecture targets the object count problem, but at the expense
> of the RPC throughput, which seems to be a wrong resolution of the tradeoff.
> Also based on the use patterns on our large clusters we read up to 90% of
> the data we write, so cold data is a small fraction and most of it must be
> cached.
> 
> To summarize:
> - Ozone is a big enough system to deserve its own project.
> - The architecture that Ozone leads to does not seem to solve the intrinsic
> problems of current HDFS.
> 
> I will post my opinion in the Ozone jira. Should be more convenient to
> discuss it there for further reference.
> 
> Thanks,
> --Konstantin
> 
> 
> 
> On Wed, Oct 18, 2017 at 6:54 PM, Yang Weiwei  wrote:
> 
>> Hello everyone,
>> 
>> 
>> I would like to start this thread to discuss merging Ozone (HDFS-7240) to
>> trunk. This feature implements an object store which can co-exist with
>> HDFS. Ozone is disabled by default. We have tested Ozone with cluster sizes
>> varying from 1 to 100 data nodes.
>> 
>> 
>> 
>> The merge payload includes the following:
>> 
>>  1.  All services, management scripts
>>  2.  Object store APIs, exposed via both REST and RPC
>>  3.  Master service UIs, command line interfaces
>>  4.  Pluggable pipeline Integration
>>  5.  Ozone File System (Hadoop compatible file system implementation,
>> passes all FileSystem contract tests)
>>  6.  Corona - a load generator for Ozone.
>>  7.  Essential documentation added to Hadoop site.
>>  8.  Version specific Ozone Documentation, accessible via service UI.
>>  9.  Docker support for ozone, which enables faster development cycles.
>> 
>> 
>> To build Ozone and run ozone using docker, please follow the instructions in
>> this wiki page: https://cwiki.apache.org/confluence/display/HADOOP/Dev+cluster+with+docker.
>> 
>> 
>> We have built a passionate and diverse community to drive this feature
>> development. As a team, we have achieved significant progress in the past 3
>> years since the first JIRA for HDFS-7240 was opened in Oct 2014. So far, we
>> have resolved almost 400 JIRAs by 20+ contributors/committers from
>> different countries and affiliations. We also want to thank the large
>> number of community members who were supportive of our efforts and
>> contributed ideas and participated in the design of ozone.
>> 
>> 
>> Please share your thoughts, thanks!
>> 
>> 
>> -- Weiwei Yang
>> 

Re: [VOTE] Merge HADOOP-13345 (S3Guard feature branch)

2017-08-23 Thread sanjay Radia

+1 (binding)
Thanks to the community for all the hard work that went into this critical piece of work.


sanjay
> 
> 
> On 22 Aug 2017, at 11:17, Steve Loughran <ste...@hortonworks.com> wrote:
> 
> +1 (binding)
> 
> I'm happy with it; it's a great piece of work by (in no particular order): 
> Chris Nauroth, Aaron Fabbri, Sean Mackrory & Mingliang Liu, plus a few bits in 
> the corners where I got to break things while they were all asleep. Also 
> deserving a mention: Thomas Demoor & Ewan Higgs @ WDC for consultancy on the 
> corners of S3, everyone who tested it (including our QA team), Sanjay Radia, 
> & others.
> 
> I've already done a couple of iterations of fixing checkstyles & code reviews, 
> so I think it is ready. I also have a branch-2 patch based on earlier work by 
> Mingliang, for people who want that.
> 
> 
> 
> 
> On 17 Aug 2017, at 23:07, Aaron Fabbri <fab...@cloudera.com> wrote:
> 
> Hello,
> 
> I'd like to open a vote (7 days, ending August 24 at 3:10 PST) to merge the
> HADOOP-13345 feature branch into trunk.
> 
> This branch contains the new S3Guard feature which adds metadata
> consistency features to the S3A client.  Formatted site documentation can
> be found here:
> 
> https://github.com/apache/hadoop/blob/HADOOP-13345/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
> 
> The current patch against trunk is posted here:
> 
> https://issues.apache.org/jira/browse/HADOOP-13998
> 
> The branch modifies the s3a portion of the hadoop-tools/hadoop-aws module:
> 
> - The feature is off by default, and care has been taken to ensure it has
> no impact when disabled.
> - S3Guard can be enabled with the production database which is backed by
> DynamoDB, or with a local, in-memory implementation that facilitates
> integration testing without having to pay for a database.
> - getFileStatus() as well as directory listing consistency has been
> implemented and thoroughly tested, including delete tracking.
> - Convenient Maven profiles for testing with and without S3Guard.
> - New failure injection code and integration tests that exercise it.  We
> use timers and a wrapper around the Amazon SDK client object to force
> consistency delays to occur.  This allows us to assert that S3Guard works
> as advertised.  This will be extended with more types of failure injection
> to continue hardening the S3A client.
> 
> Outside of hadoop-tools/hadoop-aws's s3a directory there are some minor
> changes:
> 
> - core-default.xml defaults and documentation for s3guard parameters.
> - A couple additional FS contract test cases around rename.
> - More goodies in LambdaTestUtils
> - A new CLI tool for inspecting and manipulating S3Guard features,
> including the backing MetadataStore database.
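As a concrete illustration of the enablement described above, a minimal 
client-side sketch. The property and class names follow the branch's 
s3guard.md; treat the exact keys as assumptions and check that document for 
the authoritative configuration.

// A minimal sketch of enabling S3Guard on an S3A client (assumption: key and
// class names as documented in s3guard.md on the HADOOP-13345 branch).
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class S3GuardEnableSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Production: the DynamoDB-backed MetadataStore.
    conf.set("fs.s3a.metadatastore.impl",
        "org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore");
    // Testing: the in-memory store avoids paying for a database.
    // conf.set("fs.s3a.metadatastore.impl",
    //     "org.apache.hadoop.fs.s3a.s3guard.LocalMetadataStore");
    FileSystem fs = FileSystem.get(URI.create("s3a://my-bucket/"), conf);
    // Listings and getFileStatus() now consult the MetadataStore first, so
    // results stay consistent even though S3 itself is eventually consistent.
    System.out.println(fs.getFileStatus(new Path("/")));
  }
}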
> 
> This branch has seen extensive testing as well as use in production.  This
> branch makes significant improvements to S3A's test toolkit as well.
> 
> Performance is typically on par with, and in some cases better than, the
> existing S3A code without S3Guard enabled.
> 
> This feature was developed with contributions and feedback from many
> people.  I'd like to thank everyone who worked on HADOOP-13345 as well as
> all of those who contributed feedback and work on the original design
> document.
> 
> This is the first major Apache Hadoop project I've worked on from start to
> finish, and I've really enjoyed it.  Please shout if I've missed anything
> important here or in the VOTE process.
> 
> Cheers,
> Aaron Fabbri
> 
> 



Re: LinkedIn Dynamometer Tool (was About 2.7.4 Release)

2017-07-21 Thread sanjay Radia
Erik
  Great stuff. 
BTW did you build on top of the “simulated data nodes” in HDFS, which has a way 
of storing only the length of the data (but not the real data)? That work allowed 
supplementing it with a matching editsLog for the NN. Your approach of using a 
real image has the advantage of being able to replay traces from audit logs.
(Ref 
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DataNodeCluster.java)
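For reference, a test-scope sketch of the simulated-storage idea (assumptions: 
the hadoop-hdfs test jar on the classpath, which provides MiniDFSCluster and 
SimulatedFSDataset; the DataNodeCluster tool referenced above wraps the same 
machinery for standalone use):

// DNs backed by SimulatedFSDataset track only block lengths, not real bytes,
// which is what lets many DNs fit on one machine.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset;

public class SimulatedDnSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    SimulatedFSDataset.setFactory(conf); // DNs fake their block storage
    MiniDFSCluster cluster =
        new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
    try {
      FileSystem fs = cluster.getFileSystem();
      fs.create(new Path("/probe")).close(); // lengths tracked, bytes discarded
    } finally {
      cluster.shutdown();
    }
  }
}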

thanks

sanjay
> On Jul 20, 2017, at 10:42 AM, Erik Krogen  
> wrote:
> 
> forking off of the 2.7.4 release thread to answer this question about
> Dynamometer
> 
> Dynamometer is a tool developed at LinkedIn for scale testing HDFS,
> specifically the NameNode. We have been using it for some time now and have
> recently been making some enhancements to ease of use and reproducibility.
> We hope to post a blog post sometime in the not-too-distant future, and
> also to open source it. I can provide some details here given that we have
> been leveraging it as part of our 2.7.4 release / upgrade process (in
> addition to previous upgrades).
> 
> The basic idea is to get full-scale black-box testing of the HDFS NN while
> using significantly less (~10%) hardware than a real cluster of that size
> would require. We use real NN images from our at-scale clusters paired with
> some logic to fake out DNs into thinking they are storing data when they
> are not, allowing us to stuff more DNs onto each machine. Since we use a
> real image, we can replay real traces (collected from audit logs) to
> compare actual production performance vs. performance on this simulated
> cluster (with additional tuning, different version, etc.). We leverage YARN
> to manage setting up this cluster and to replay the traces.
> 
> Happy to answer questions.
> 
> Erik
> 
> On Wed, Jul 19, 2017 at 5:05 PM, Konstantin Shvachko 
> wrote:
> 
>> Hi Tianyi,
>> 
>> Glad you are interested in Dynamometer. Erik (CC-ed) is actively working
>> on this project right now, I'll let him elaborate.
>> Erik, you should probably respond on Apache dev list, as I think it could
>> be interesting for other people as well, since we planned to open source
>> it. You can fork the "About 2.7.4 Release" thread with a new subject and
>> give some details about Dynamometer there.
>> 
>> Thanks,
>> --Konstantin
>> 
>> On Wed, Jul 19, 2017 at 1:40 AM, 何天一  wrote:
>> 
>>> Hi, Shavachko.
>>> 
>>> You mentioned an internal tool called Dynamometer to test NameNode
>>> performance earlier in the 2.7.4 release thread.
>>> I wonder if you could share some ideas behind the tool. Or is there a
>>> plan to bring Dynamometer to open source community?
>>> 
>>> Thanks.
>>> 
>>> BR,
>>> Tianyi
>>> 
>>> On Fri, Jul 14, 2017 at 8:45 AM Konstantin Shvachko 
>>> wrote:
>>> 
 Hi everybody.
 
 We have been doing some internal testing of Hadoop 2.7.4. The testing is
 going well.
 Did not find any major issues on our workloads.
 Used an internal tool called Dynamometer to check NameNode performance on
 real cluster traces. Good.
 Overall test cluster performance looks good.
 Some more testing is still going on.
 
 I plan to build an RC next week. If there are no objection.
 
 Thanks,
 --Konst
 
 On Thu, Jun 15, 2017 at 4:42 PM, Konstantin Shvachko <
 shv.had...@gmail.com>
 wrote:
 
> Hey guys.
> 
> An update on 2.7.4 progress.
> We are down to 4 blockers. There is some work remaining on those.
> https://issues.apache.org/jira/browse/HDFS-11896?filter=12340814
> Would be good if people could follow up on review comments.
> 
> I looked through nightly Jenkins build results for 2.7.4 both on Apache
> Jenkins and internal.
> Some tests fail intermittently, but there are no consistent failures. I
> filed
> HDFS-11985 to track some of them.
> https://issues.apache.org/jira/browse/HDFS-11985
> I do not currently consider these failures as blockers. LMK if some of
> them are.
> 
> We started internal testing of branch-2.7 on one of our smallish (100+
> nodes) test clusters.
> Will update on the results.
> 
> There is a plan to enable BigTop for 2.7.4 testing.
> 
> Akira, Brahma thank you for setting up a wiki page for 2.7.4 release.
> Thank you everybody for contributing to this effort.
> 
> Regards,
> --Konstantin
> 
> 
> On Tue, May 30, 2017 at 12:08 AM, Akira Ajisaka 
> wrote:
> 
>> Sure.
>> If you want to edit the wiki, please tell me your ASF confluence
 account.
>> 
>> -Akira
>> 
>> On 2017/05/30 15:31, Rohith Sharma K S wrote:
>> 
>>> A couple more JIRAs need to be backported for the 2.7.4 release. These will
>>> solve RM HA instability issues.
>>> 

Re: Looking to a Hadoop 3 release

2015-03-09 Thread sanjay Radia

 On Mar 5, 2015, at 3:21 PM, Siddharth Seth ss...@apache.org wrote:
 
 2) Simplification of configs - potentially separating client side configs
 and those used by daemons. This is another source of perpetual confusion
 for users.
+1 on this.

sanjay

Re: Looking to a Hadoop 3 release

2015-03-03 Thread sanjay Radia

 On Mar 3, 2015, at 9:36 AM, Karthik Kambatla ka...@cloudera.com wrote:
 
 If we preserve API compat and try to preserve wire compat, I don't see the
 harm in bumping the major release.

If we preserve compatibility, then there is no need to bump the major number.
 It allows us to include several
 fixes/features in trunk in a release. If we are not actively thinking of a
 way to release items in trunk, why even have it?

What are the fixes and features in trunk that you would like to see get out 
quickly?
Can these be back ported easily to branch 2?

sanjay



Re: Looking to a Hadoop 3 release

2015-03-02 Thread sanjay Radia
Andrew 
  Thanks for bringing up the issue of moving to Java8. Java8 is important.
However, I am not seeing a strong motivation for changing the major number.
We can go to Java8 in the 2.x series. 
The classpath issue for HADOOP-11656 is too minor to force a major number 
change (no pun intended).

Lets separate the issue of Java8 and Hadoop 3.0

sanjay


 On Mar 2, 2015, at 3:19 PM, Andrew Wang andrew.w...@cloudera.com wrote:
 
 Hi devs,
 
 It's been a year and a half since 2.x went GA, and I think we're about due
 for a 3.x release.
 Notably, there are two incompatible changes I'd like to call out, that will
 have a tremendous positive impact for our users.
 
 First, classpath isolation being done at HADOOP-11656, which has been a
 long-standing request from many downstreams and Hadoop users.
 
 Second, bumping the source and target JDK version to JDK8 (related to
 HADOOP-11090), which is important since JDK7 is EOL in April 2015 (two
 months from now). In the past, we've had issues with our dependencies
 discontinuing support for old JDKs, so this will future-proof us.
 
 Between the two, we'll also have quite an opportunity to clean up and
 upgrade our dependencies, another common user and developer request.
 
 I'd like to propose that we start rolling a series of monthly-ish series of
 3.0 alpha releases ASAP, with myself volunteering to take on the RM and
 other cat herding responsibilities. There are already quite a few changes
 slated for 3.0 besides the above (for instance the shell script rewrite) so
 there's already value in a 3.0 alpha, and the more time we give downstreams
 to integrate, the better.
 
 This opens up discussion about inclusion of other changes, but I'm hoping
 to freeze incompatible changes after maybe two alphas, do a beta (with no
 further incompat changes allowed), and then finally a 3.x GA. For those
 keeping track, that means a 3.x GA in about four months.
 
 I would also like to stress though that this is not intended to be a big
 bang release. For instance, it would be great if we could maintain wire
 compatibility between 2.x and 3.x, so rolling upgrades work. Keeping
 branch-2 and branch-3 similar also makes backports easier, since we're
 likely maintaining 2.x for a while yet.
 
 Please let me know any comments / concerns related to the above. If people
 are friendly to the idea, I'd like to cut a branch-3 and start working on
 the first alpha.
 
 Best,
 Andrew



Re: [VOTE] Merge fs-encryption branch to trunk

2014-08-15 Thread sanjay Radia


+1 (binding)
We have made some great progress in the last few days on some of the issues I 
raised.
I have posted a summary of the follow-up items that are needed on the Jira today.
I am +1ing expecting the team will complete items 1 (distcp/cp) and 2 
(webhdfs) promptly. Before we publish transparent encryption in a 2.x release 
for public consumption, let us at least complete item 1 (i.e., distcp and cp) 
and the flag to turn this feature on/off.

This is great work; thanks, team, for contributing this important feature.

sanjay

On Aug 14, 2014, at 1:05 AM, sanjay Radia san...@hortonworks.com wrote:

 While I was originally skeptical of transparent encryption, I like the value 
 proposition of transparent encryption. HDFS has several layers, protocols  
 and tools. While the HDFS core part seems to be well done in the Jira, 
 inserting the matching transparency in the other tools or protocols need to 
 be worked through.
 
 I have the following areas of concern:
  - Common protocols like webhdfs should continue to work (the design doc marks 
  this as a goal). This issue is being discussed in the Jira, but it appears 
  that webhdfs does not currently work with encrypted files: Andrew says that 
  "Regarding webhdfs, it's not a recommended deployment" and that he will 
  modify the documentation to match that. Alejandro says "Both httpfs and 
  webhdfs will work just fine" but then in the same paragraph says this could 
  fail some security audits. We need to resolve this quickly. Webhdfs is 
  heavily used by many Hadoop users.
 
 
  - Common tools like cp, distcp and HAR should continue to work with 
  non-encrypted and encrypted files in an automatic fashion. This issue has 
  been heavily discussed in the Jira and at the meeting. The /.reserved/raw 
  mechanism appears to be a step in the right direction for distcp and cp; 
  however, this work has not reached its conclusion in my opinion. Charles and 
  I are going through the use cases and I think we are close to a clean 
  solution for distcp and cp. HAR still needs a concrete proposal.
 
  - KMS scalability in medium to large clusters. This can perhaps be addressed 
  by getting the keys ahead of time when a job is submitted. Without this, the 
  KMS will need to be as highly available and scalable as the NN. I think this 
  is future implementation work but we need to at least determine if this is 
  indeed possible, in case we need to modify some of the APIs right now to 
  support that.
 
 There are some other minor things under discussion, and I still need to go 
 through the new APIs.
 
  Unfortunately at this stage I cannot give a +1 for this merge; I hope to 
  change this in the next day or two - I am working with the Jira team 
  (Alejandro, Charles, Andrew, Atm, ...) to resolve the above as quickly as 
  possible.
 
 Sanjay (binding)
 
 
 
 On Aug 8, 2014, at 11:45 AM, Andrew Wang andrew.w...@cloudera.com wrote:
 
 Hi all,
 
 I'd like to call a vote to merge the fs-encryption branch to trunk.
 Development of this feature has been ongoing since March on HDFS-6134 and
 HADOOP-10150, totaling approximately 50 commits.
 
 .
 Thanks,
 Andrew
 




Re: [VOTE] Merge fs-encryption branch to trunk

2014-08-14 Thread sanjay Radia
While I was originally skeptical of transparent encryption, I like the value 
proposition of transparent encryption. HDFS has several layers, protocols  and 
tools. While the HDFS core part seems to be well done in the Jira, inserting 
the matching transparency in the other tools or protocols need to be worked 
through.

I have the following areas of concern:
- Common protocols like webhdfs should continue to work (the design doc marks 
this as a goal). This issue is being discussed in the Jira, but it appears that 
webhdfs does not currently work with encrypted files: Andrew says that 
"Regarding webhdfs, it's not a recommended deployment" and that he will modify 
the documentation to match that. Alejandro says "Both httpfs and webhdfs will 
work just fine" but then in the same paragraph says this could fail some 
security audits. We need to resolve this quickly. Webhdfs is heavily used by 
many Hadoop users.


- Common tools like cp, distcp and HAR should continue to work with 
non-encrypted and encrypted files in an automatic fashion. This issue has been 
heavily discussed in the Jira and at the meeting. The /.reserved/raw mechanism 
appears to be a step in the right direction for distcp and cp; however, this 
work has not reached its conclusion in my opinion. Charles and I are going 
through the use cases and I think we are close to a clean solution for distcp 
and cp. HAR still needs a concrete proposal. (See the raw-copy sketch after 
this list.)

- KMS scalability in medium to large clusters. This can perhaps be addressed 
by getting the keys ahead of time when a job is submitted. Without this, the 
KMS will need to be as highly available and scalable as the NN. I think this 
is future implementation work but we need to at least determine if this is 
indeed possible, in case we need to modify some of the APIs right now to 
support that.
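To illustrate the /.reserved/raw mechanism mentioned in the distcp/cp item 
above, a hedged sketch of a raw copy between clusters. The semantics assumed 
here (reads under /.reserved/raw return ciphertext, xattrs carry the encrypted 
data encryption key, superuser access is required) are as discussed in the 
jira; the details are illustrative, not the final distcp behavior.

// Copying through the raw namespace moves the encrypted bytes verbatim
// instead of decrypting and re-encrypting them in transit.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class RawCopySketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem src = FileSystem.get(URI.create("hdfs://nn1:8020/"), conf);
    FileSystem dst = FileSystem.get(URI.create("hdfs://nn2:8020/"), conf);
    // Reading under /.reserved/raw returns raw (encrypted) bytes.
    FileUtil.copy(src, new Path("/.reserved/raw/zone/file"),
                  dst, new Path("/.reserved/raw/zone/file"),
                  false /* keep source */, conf);
  }
}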

There are some other minor things under discussion, and I still need to go 
through the new APIs.

 Unfortunately at this stage I cannot give a +1 for this merge; I hope to 
change this in the next day or two - I am working with the Jira team 
(Alejandro, Charles, Andrew, Atm, ...) to resolve the above as quickly as 
possible.

Sanjay (binding)



On Aug 8, 2014, at 11:45 AM, Andrew Wang andrew.w...@cloudera.com wrote:

 Hi all,
 
 I'd like to call a vote to merge the fs-encryption branch to trunk.
 Development of this feature has been ongoing since March on HDFS-6134 and
 HADOOP-10150, totaling approximately 50 commits.
 
 .
 Thanks,
 Andrew




Re: [VOTE] Migration from subversion to git for version control

2014-08-14 Thread sanjay Radia
+1 
sanjay
 
 On Fri, Aug 8, 2014 at 7:57 PM, Karthik Kambatla ka...@cloudera.com wrote:
 I have put together this proposal based on recent discussion on this topic.
 
 Please vote on the proposal. The vote runs for 7 days.
 
   1. Migrate from subversion to git for version control.
   2. Force-push to be disabled on trunk and branch-* branches. Applying
   changes from any of trunk/branch-* to any of branch-* should be through
   git cherry-pick -x.
   3. Force-push on feature-branches is allowed. Before pulling in a
   feature, the feature-branch should be rebased on latest trunk and the
   changes applied to trunk through git rebase --onto or git cherry-pick
   commit-range.
   4. Every time a feature branch is rebased on trunk, a tag that
   identifies the state before the rebase needs to be created (e.g.
   tag_feature_JIRA-2454_2014-08-07_rebase). These tags can be deleted once
   the feature is pulled into trunk and the tags are no longer useful.
   5. The relevance/use of tags stay the same after the migration.
 
 Thanks
 Karthik
 
  PS: Per Andrew Wang, this should be an "Adoption of New Codebase" kind of
  vote and will be a Lazy 2/3 majority of PMC members.




Re: Moving to JDK7, JDK8 and new major releases

2014-06-24 Thread sanjay Radia
Andrew
  thanks for writing the proposal.

In the proposal you mention:
   Dropping support for a JDK in a minor release is incompatible, so this 
would require a change to our compatibility guidelines.

Why is dropping a JDK incompatible?

sanjay



On Jun 24, 2014, at 11:17 AM, Andrew Wang andrew.w...@cloudera.com wrote:

 Hi all,
 
 Forking this thread as requested by Vinod. To help anyone who's catching up
 with this thread, I've written up a wiki page containing what I think are
 the proposals under discussion. I did my very best to make this as
 fact-based and disinterested as possible; I really appreciate the
 constructive discussion we've had so far. If you believe you have a
 proposal pending, please feel free to edit the wiki.
 
 https://wiki.apache.org/hadoop/MovingToJdk7and8
 
 I think based on our current compatibility guidelines, Proposal A is the
 most attractive. We're pretty hamstrung by the requirement to keep the
 classpath the same, which would be solved by either OSGI or shading our
 deps (but that's a different discussion).
 
 Thanks,
 Andrew




Re: [VOTE] Release Apache Hadoop 2.4.0

2014-04-08 Thread sanjay Radia


+1 binding
Verified binaries, ran from the binary on a single-node cluster. Tested some 
HDFS CLIs and wordcount.

sanjay
On Apr 7, 2014, at 9:52 AM, Suresh Srinivas sur...@hortonworks.com wrote:

 +1 (binding)
 
 Verified the signatures and hashes for both src and binary tars. Built from
 the source, the binary distribution and the documentation. Started a single
 node cluster and tested the following:
 # Started HDFS cluster, verified the hdfs CLI commands such as ls, copying
 data back and forth, verified the namenode webUI etc.
 # Ran some tests such as sleep job, TestDFSIO, NNBench etc.
 
 I agree with Arun's analysis. At this time, the bar for blockers should be
 quite high. We can do a dot release if people want some more bug fixes.
 
 
 On Mon, Mar 31, 2014 at 2:22 AM, Arun C Murthy a...@hortonworks.com wrote:
 
 Folks,
 
 I've created a release candidate (rc0) for hadoop-2.4.0 that I would like
 to get released.
 
 The RC is available at:
 http://people.apache.org/~acmurthy/hadoop-2.4.0-rc0
 The RC tag in svn is here:
 https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0-rc0
 
 The maven artifacts are available via repository.apache.org.
 
 Please try the release and vote; the vote will run for the usual 7 days.
 
 thanks,
 Arun
 
 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/
 
 
 
 
 
 
 




Re: [VOTE] Release Apache Hadoop 2.4.0

2014-04-07 Thread Sanjay Radia
On Thu, Apr 3, 2014 at 4:55 PM, Tsuyoshi OZAWA ozawa.tsuyo...@gmail.com wrote:

 Hi,

 Ran tests and confirmed that some tests (TestSymlinkLocalFSFileSystem)
 fail.

The log of the test failure is as follows:

 https://gist.github.com/oza/9965197

 Should we fix or disable the feature?


Symlinks are still not complete, hence disable them.

sanjay


 Thanks,
 - Tsuyoshi

 On Mon, Mar 31, 2014 at 6:22 PM, Arun C Murthy a...@hortonworks.com
 wrote:
  Folks,
 
  I've created a release candidate (rc0) for hadoop-2.4.0 that I would
 like to get released.
 
  The RC is available at:
 http://people.apache.org/~acmurthy/hadoop-2.4.0-rc0
  The RC tag in svn is here:
 https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0-rc0
 
  The maven artifacts are available via repository.apache.org.
 
  Please try the release and vote; the vote will run for the usual 7 days.
 
  thanks,
  Arun
 
  --
  Arun C. Murthy
  Hortonworks Inc.
  http://hortonworks.com/
 
 
 



 --
 - Tsuyoshi




[jira] [Created] (HADOOP-10044) Improve the javadoc of rpc code

2013-10-11 Thread Sanjay Radia (JIRA)
Sanjay Radia created HADOOP-10044:
-

 Summary: Improve the javadoc of rpc code
 Key: HADOOP-10044
 URL: https://issues.apache.org/jira/browse/HADOOP-10044
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia
Priority: Minor








Re: symlink support in Hadoop 2 GA

2013-10-03 Thread sanjay Radia
There are a number of issues (some minor, some more than minor).
GA is close and we are still in discussion on some of them; while I believe we 
will close on these very shortly, a code change like this so close to GA is 
dangerous.

I suggest we do the following (see the sketch after this list):
1) Disable symlinks in 2.2 GA - throw an unsupported-operation exception on 
createSymlink in both FileSystem and FileContext.
2) Deal with isDir() in 2.2 GA in preparation for item 3 coming after GA:
   a) Deprecate isDir().
   b) Add a new API that returns an enum (see FileContext).
3) Fix symlinks in a future release, hopefully the very next one after 2.2 GA:
   a) Change the stack to use the new API replacing isDir().
   b) Fix isDir() to do something smarter (we can detail this later, but there 
is a solution that has been discussed). This helps customer applications that 
call isDir().
   c) Remove isDir() in a future release when customers have had sufficient 
time to migrate.
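
A minimal sketch of steps 1 and 2, in the spirit of the plan above. The names 
PathType and getPathType are hypothetical placeholders for the enum-returning 
API (the real names were still under discussion); only hadoop-common's Path is 
assumed on the classpath:

    import java.io.IOException;
    import org.apache.hadoop.fs.Path;

    // Illustrative sketch only, not the actual FileSystem/FileContext code.
    public abstract class SymlinkPlanSketch {

      // Step 1: keep the signature, but refuse the operation for 2.2 GA.
      public void createSymlink(Path target, Path link, boolean createParent)
          throws IOException {
        throw new UnsupportedOperationException(
            "Symlinks are not supported in this release");
      }

      // Step 2b: a richer replacement for the boolean isDir();
      // PathType is a hypothetical name.
      public enum PathType { FILE, DIRECTORY, SYMLINK }

      public abstract PathType getPathType(Path p) throws IOException;

      // Step 2a: deprecate isDir() in favor of the enum-returning call.
      @Deprecated
      public boolean isDir(Path p) throws IOException {
        return getPathType(p) == PathType.DIRECTORY;
      }
    }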

sanjay

PS. J Rottinghuis expressed a similar sentiment in a previous email in this 
thread:



On Sep 18, 2013, at 5:11 PM, J. Rottinghuis wrote:

 I like symlink functionality, but in our migration to Hadoop 2.x this is a
 total distraction. If the APIs stay in 2.2 GA we'll have to choose to:
 a) Not uprev until symlink support is figured out up and down the stack,
 and we've been able to migrate all our 1.x (equivalent) clusters to 2.x
 (equivalent). Or
 b) rip out the API altogether. Or
 c) change the implementation to throw an UnsupportedOperationException
 I'm not sure yet which of these I like least.




Re: 2.1.2 (Was: Re: [VOTE] Release Apache Hadoop 2.1.1-beta)

2013-10-01 Thread sanjay Radia
+1 for naming the new branch 2.2.0
sanjay

On Oct 1, 2013, at 4:55 PM, Suresh Srinivas wrote:

 (This time copying all the lists)
 
 I am +1 for naming the new branch 2.2.0.
 
 
 On Tue, Oct 1, 2013 at 4:15 PM, Arun C Murthy a...@hortonworks.com wrote:
 
 Guys,
 
  I took a look at the content in 2.1.2-beta so far; other than the
  critical fixes such as HADOOP-9984 (symlinks) and a few others in YARN/MR,
  there is fairly little content (unit test fixes etc.).
  
  Furthermore, it's standing up well in testing too. Plus, the protocols
  look good for now (I wrote a gohadoop to try to convince myself), so let's
  lock them in.
 
 Given that, I'm thinking we can just go ahead rename it 2.2.0 rather than
 make another 2.1.x release.
 
 This will drop a short-lived release (2.1.2) and help us move forward on
 2.3 which has a fair bunch of content already...
 
 Thoughts?
 
 thanks,
 Arun
 
 
 On Sep 24, 2013, at 4:24 PM, Zhijie Shen zs...@hortonworks.com wrote:
 
 I've added MAPREDUCE-5531 to the blocker list. - Zhijie
 
 
 On Tue, Sep 24, 2013 at 3:41 PM, Arun C Murthy a...@hortonworks.com
 wrote:
 
  With 4 +1s (3 binding) and no -1s the vote passes. I'll push it out…
  I'll make it clear on the release page that there are some known issues
  and that we will follow up very shortly with another release.
  
  Meanwhile, let's fix the remaining blockers (please mark them as such
  with Target Version 2.1.2-beta).
 The current blockers are here:
 http://s.apache.org/hadoop-2.1.2-beta-blockers
 
 thanks,
 Arun
 
 On Sep 16, 2013, at 11:38 PM, Arun C Murthy a...@hortonworks.com
 wrote:
 
 Folks,
 
 I've created a release candidate (rc0) for hadoop-2.1.1-beta that I
 would like to get released - this release fixes a number of bugs on top
 of
 hadoop-2.1.0-beta as a result of significant amounts of testing.
 
 If things go well, this might be the last of the *beta* releases of
 hadoop-2.x.
 
 The RC is available at:
 http://people.apache.org/~acmurthy/hadoop-2.1.1-beta-rc0
 The RC tag in svn is here:
 
 http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.1.1-beta-rc0
 
 The maven artifacts are available via repository.apache.org.
 
 Please try the release and vote; the vote will run for the usual 7
 days.
 
 thanks,
 Arun
 
 
 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/
 
 
 
 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/
 
 
 
 
 
 
 
 --
 Zhijie Shen
 Hortonworks Inc.
 http://hortonworks.com/
 
 
 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/
 
 
 
 
 
 
 



[jira] [Created] (HADOOP-9671) Improve Hadoop security - master jira

2013-06-25 Thread Sanjay Radia (JIRA)
Sanjay Radia created HADOOP-9671:


 Summary: Improve Hadoop security - master jira
 Key: HADOOP-9671
 URL: https://issues.apache.org/jira/browse/HADOOP-9671
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia







Re: [DISCUSS] Ensuring Consistent Behavior for Alternative Hadoop FileSystems + Workshop

2013-06-07 Thread sanjay Radia
I plan to attend.
A 9:30 time is a little better for me.

sanjay

On Jun 5, 2013, at 8:14 PM, Stephen Watt wrote:

 Hi Folks
 
 Per Roman's recommendation I've created a Wiki Page for organizing the work 
 and managing the logistics - https://wiki.apache.org/hadoop/HCFS/Progress
 
 I'd like to propose a Google Hangout at 9am PST on Monday June 10th to get 
 together and discuss the initiative. Please respond back to me if you're 
 interested or would like to propose a different time. I'll update our Wiki 
 page with the logistics.
 
 Regards
 Steve Watt
 
 - Original Message -
 From: Roman Shaposhnik shaposh...@gmail.com
 To: Stephen Watt sw...@redhat.com
 Cc: common-dev@hadoop.apache.org, mbhandar...@gopivotal.com, shv hadoop 
 shv.had...@gmail.com, ste...@hortonworks.com, erlv5...@gmail.com, 
 apurt...@apache.org
 Sent: Friday, May 31, 2013 5:28:58 PM
 Subject: Re: [DISCUSS] Ensuring Consistent Behavior for Alternative Hadoop 
 FileSystems + Workshop
 
 On Fri, May 31, 2013 at 1:00 PM, Stephen Watt sw...@redhat.com wrote:
 What is the protocol for organizing the logistics and collaborating? I am 
 loathe to flood common-dev with does this time work for you? emails from 
 the interested parties. Do we create a high level JIRA ticket and 
 collaborate and post comments and G+ meetup times on that ? Another option 
 might be the Wiki, I'd be happy to be responsible with tracking progress on 
 https://wiki.apache.org/hadoop/HCFS/Progress until we are able to break 
 initiatives down into more granular JIRA tickets.
 
 I'd go with a wiki page and perhaps http://www.doodle.com/
 
 After we've had a few G+ hangouts, for those that would like to meet face to 
 face, I have also made an all day reservation for a meeting room that can 
 hold up to 20 people at our Red Hat Office in Castro Street, Mountain View 
 on Tuesday June 25th (the day before Hadoop Summit and a short drive away). 
 We don't have to use the whole day, but it gives us some flexibility around 
 the availability of interested parties. I was thinking something along the 
 lines of 10am - 3pm. We are happy to cater lunch.
 
 That also would be very much appreciated!
 
 Thanks,
 Roman.



[jira] [Created] (HADOOP-9619) Mark stability of .proto files

2013-06-04 Thread Sanjay Radia (JIRA)
Sanjay Radia created HADOOP-9619:


 Summary: Mark stability of .proto files
 Key: HADOOP-9619
 URL: https://issues.apache.org/jira/browse/HADOOP-9619
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia
Assignee: Sanjay Radia






Re: [VOTE] - Release 2.0.5-beta

2013-05-21 Thread sanjay Radia
+1 on 2.0.5 defined in this thread with the new features.
But I am supportive of an earlier release that has ALL the compatibility 
changes, without the features.


sanjay

On May 15, 2013, at 10:57 AM, Arun C Murthy wrote:

 Folks,
 
 ...
 
 I propose we continue the original plan and make a 2.0.5-beta release by May 
 end with the following content:
 # HDFS-347
 # HDFS Snapshots
 # Windows support
 # Necessary  final API/protocol changes such as:
 * Final YARN API changes: YARN-386
 * MR Binary Compatibility: MAPREDUCE-5108
 * Final RPC cleanup: HADOOP-8990



[jira] [Created] (HADOOP-9425) Add error codes to rpc-response

2013-03-21 Thread Sanjay Radia (JIRA)
Sanjay Radia created HADOOP-9425:


 Summary: Add error codes to rpc-response
 Key: HADOOP-9425
 URL: https://issues.apache.org/jira/browse/HADOOP-9425
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia
Assignee: Sanjay Radia






[jira] [Created] (HADOOP-9421) Add full length to SASL response to allow non-blocking readers

2013-03-20 Thread Sanjay Radia (JIRA)
Sanjay Radia created HADOOP-9421:


 Summary: Add full length to SASL response to allow non-blocking 
readers
 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia






[jira] [Created] (HADOOP-9380) Add totalLength to rpc response

2013-03-07 Thread Sanjay Radia (JIRA)
Sanjay Radia created HADOOP-9380:


 Summary: Add totalLength to rpc response
 Key: HADOOP-9380
 URL: https://issues.apache.org/jira/browse/HADOOP-9380
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia
Assignee: Sanjay Radia






Re: [Vote] Merge branch-trunk-win to trunk

2013-03-01 Thread sanjay Radia

On Mar 1, 2013, at 1:57 PM, Konstantin Shvachko wrote:

 Commitment is a good thing.
 I think the two builds that I proposed are a prerequisite for Win support.
 If we commit the windows patch, people will start breaking it the next day,
 which we won't know without the nightly build and won't be able to fix
 without the on-demand one.

They clearly are a prerequisite for declaring official support for windows, 
but they should not be a prerequisite for the merge.
Currently we enable windows through cygwin. There is no jenkins. Folks have 
been fixing windows issues as they are discovered.
Merging the branch makes the situation no worse than it is today - all tests 
pass on Linux, there is no regression.
Merging now removes the cygwin dependency.

Jenkins is critical to making windows an officially supported platform without 
cygwin.
When Jenkins is enabled, the team that has worked on this branch will have to 
fix any bugs that have arisen in the meantime.

sanjay






Re: [Vote] Merge branch-trunk-win to trunk

2013-02-28 Thread sanjay Radia
+1
Java has done the bulk of the work in making Hadoop multi-platform.
Windows-specific code is a tiny percentage of the code.
Jenkins support for windows is going to help us keep the platform portable 
going forward.
I expect that the vast majority of new commits will have no problems. I propose 
that we start by fixing problems that Jenkins raises but not block new commits 
for too long if the author does not have a windows box or if a volunteer does 
not step up.

sanjay





[jira] [Created] (HADOOP-9163) The rpc msg in ProtobufRpcEngine.proto should be moved out to avoid an extra copy

2012-12-20 Thread Sanjay Radia (JIRA)
Sanjay Radia created HADOOP-9163:


 Summary: The rpc msg in  ProtobufRpcEngine.proto should be moved 
out to avoid an extra copy
 Key: HADOOP-9163
 URL: https://issues.apache.org/jira/browse/HADOOP-9163
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia






[jira] [Created] (HADOOP-9151) Include RPC error info in RpcResponseHeader instead of sending it separately

2012-12-17 Thread Sanjay Radia (JIRA)
Sanjay Radia created HADOOP-9151:


 Summary: Include RPC error info in RpcResponseHeader instead of 
sending it separately
 Key: HADOOP-9151
 URL: https://issues.apache.org/jira/browse/HADOOP-9151
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia
Assignee: Sanjay Radia







[jira] [Created] (HADOOP-9140) Cleanup rpc PB protos

2012-12-13 Thread Sanjay Radia (JIRA)
Sanjay Radia created HADOOP-9140:


 Summary: Cleanup rpc PB protos
 Key: HADOOP-9140
 URL: https://issues.apache.org/jira/browse/HADOOP-9140
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia






[jira] [Resolved] (HADOOP-7347) IPC Wire Compatibility

2012-12-12 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-7347.
--

   Resolution: Fixed
Fix Version/s: 2.0.0-alpha
 Hadoop Flags: Incompatible change

 IPC Wire Compatibility
 --

 Key: HADOOP-7347
 URL: https://issues.apache.org/jira/browse/HADOOP-7347
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Affects Versions: 0.23.0
Reporter: Sanjay Radia
Assignee: Sanjay Radia
 Fix For: 2.0.0-alpha

 Attachments: Wire Compatibility – Separating wire types.pdf






[jira] [Resolved] (HADOOP-8544) Move an assertion location in 'winutils chmod'

2012-07-12 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-8544.
--

Resolution: Fixed

Committed to branch-1 windows; thanks Chuan.

 Move an assertion location in 'winutils chmod'
 --

 Key: HADOOP-8544
 URL: https://issues.apache.org/jira/browse/HADOOP-8544
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 1-win
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Trivial
 Attachments: HADOOP-8544-branch-1-win-2.patch, 
 HADOOP-8544-branch-1-win.patch


 We have an assertion in chmod that will be triggered in case of a permission 
 change without giving a permission or a reference, e.g. 'chmod + [FILE]'. 
 Such operations are valid, and will trigger the assertion in the winutils 
 debug build. [~bikassaha] noticed the bug.





[jira] [Resolved] (HADOOP-8414) Address problems related to localhost resolving to 127.0.0.1 on Windows

2012-07-05 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-8414.
--

Resolution: Fixed

Thanks Ivan, committed to branch-1 windows.

 Address problems related to localhost resolving to 127.0.0.1 on Windows
 ---

 Key: HADOOP-8414
 URL: https://issues.apache.org/jira/browse/HADOOP-8414
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs, test
Affects Versions: 1.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: HADOOP-8414-branch-1-win(2).patch, 
 HADOOP-8414-branch-1-win(3).patch, HADOOP-8414-branch-1-win.patch, 
 HADOOP-8414-branch-1-win.patch


 Localhost resolves to 127.0.0.1 on Windows and that causes the following 
 tests to fail:
  - TestHarFileSystem
  - TestCLI
  - TestSaslRPC
 This Jira tracks fixing these tests and other possible places that have a 
 similar issue.
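
 A small standalone probe (my own sketch, not part of the patch) showing the 
 behavior and one way a test can normalize the comparison; the exact output 
 depends on the machine's hosts file:

    import java.net.InetAddress;

    public class LocalhostProbe {
      public static void main(String[] args) throws Exception {
        InetAddress local = InetAddress.getByName("localhost");
        // On a typical Windows host this prints "localhost/127.0.0.1";
        // tests that string-compare against a canonical hostname then fail.
        System.out.println(local);
        // InetAddress.equals() compares only the resolved address, so
        // resolving both sides is safer than comparing hostname strings.
        System.out.println(local.equals(InetAddress.getByName("127.0.0.1")));
      }
    }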





[jira] [Resolved] (HADOOP-8534) Some tests leave a config file open causing failure on windows

2012-07-02 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-8534.
--

Resolution: Fixed

Committed to Hadoop-1 windows branch. Thanks Ivan.

 Some tests leave a config file open causing failure on windows
 --

 Key: HADOOP-8534
 URL: https://issues.apache.org/jira/browse/HADOOP-8534
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 1.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: HADOOP-8534-branch-1-win_Parser(2).patch, 
 HADOOP-8534-branch-1-win_Parser.patch


 The Java XML parser keeps the file locked after a SAXException, causing the 
 following tests to fail:
  - TestQueueManagerForJobKillAndJobPriority
  - TestQueueManagerForJobKillAndNonDefaultQueue
 {{TestQueueManagerForJobKillAndJobPriority#testQueueAclRefreshWithInvalidConfFile()}}
  is creating a temp config file with incorrect syntax. Later, the test tries 
 to delete/cleanup this file and this operation fails on Windows (as the file 
 is still open). From this point on, all subsequent tests fail because they 
 try to use the incorrect config file.
 Forum references on the problem and the fix:
 http://www.linuxquestions.org/questions/programming-9/java-xml-parser-keeps-file-locked-after-saxexception-768613/
 https://forums.oracle.com/forums/thread.jspa?threadID=2046505start=0tstart=0
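
 A sketch of the general remedy (not the actual patch): parse from a stream 
 the caller owns and close it unconditionally, so the handle is released even 
 when the parser throws and Windows can delete the file afterwards:

    import java.io.FileInputStream;
    import java.io.InputStream;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;

    public class SafeXmlLoad {
      static Document parse(String path) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputStream in = new FileInputStream(path);
        try {
          // A SAXException here no longer leaks the file handle.
          return builder.parse(in);
        } finally {
          in.close();
        }
      }
    }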





[jira] [Resolved] (HADOOP-8454) Fix the ‘chmod =[perm]’ bug in winutils

2012-06-20 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-8454.
--

Resolution: Fixed

Reviewed and committed patch to Hadoop-1 windows branch.
Thanks Chuan Lin.

 Fix the ‘chmod =[perm]’ bug in winutils
 ---

 Key: HADOOP-8454
 URL: https://issues.apache.org/jira/browse/HADOOP-8454
 Project: Hadoop Common
  Issue Type: Bug
  Components: native
Affects Versions: 1.1.0, 0.24.0
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: HADOOP-8454-2-branch-1-win.patch, 
 HADOOP-8454-branch-1-win.patch


 The original patch for 
 [Hadoop-8235|https://issues.apache.org/jira/browse/HADOOP-8235] contained a 
 bug in the ‘chmod’ implementation. The logic to compute the new access mask 
 when the ‘chmod’ mode string has ‘=’ in it is incorrect. For example, 
 ‘winutils chmod o=g foo’ will result in wrong permission settings for the 
 file ‘foo’. This Jira is created to track the bug.
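
 For reference, the intended POSIX semantics of the '=' form, sketched in 
 Java (the actual winutils code is native; this only illustrates the mask 
 computation):

    public class ChmodEquals {
      // "chmod o=g": replace the 'other' permission triplet with a copy
      // of the 'group' triplet, over a 9-bit rwxrwxrwx mode.
      static int applyOtherEqualsGroup(int mode) {
        int groupBits = (mode >> 3) & 07;   // extract group rwx bits
        return (mode & ~07) | groupBits;    // overwrite the 'other' bits
      }

      public static void main(String[] args) {
        int mode = 0754;                                        // rwxr-xr--
        System.out.printf("%o%n", applyOtherEqualsGroup(mode)); // prints 755
      }
    }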





[jira] [Resolved] (HADOOP-8101) Access Control support for Non-secure deployment of Hadoop on Windows

2012-06-04 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-8101.
--

  Resolution: Fixed
Target Version/s: HADOOP-1-Windows  (was: 1.1.0, 0.24.0)

 Access Control support for Non-secure deployment of Hadoop on Windows
 -

 Key: HADOOP-8101
 URL: https://issues.apache.org/jira/browse/HADOOP-8101
 Project: Hadoop Common
  Issue Type: Improvement
  Components: native
Reporter: Sanjay Radia
 Attachments: security.patch, security1.patch








[jira] [Resolved] (HADOOP-8411) TestStorageDirecotyFailure, TestTaskLogsTruncater, TestWebHdfsUrl and TestSecurityUtil fail on Windows

2012-05-31 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-8411.
--

  Resolution: Fixed
Target Version/s: HADOOP-1-Windows  (was: 1.1.0)

Committed patch. Thanks Ivan.

 TestStorageDirecotyFailure, TestTaskLogsTruncater, TestWebHdfsUrl and 
 TestSecurityUtil fail on Windows
 --

 Key: HADOOP-8411
 URL: https://issues.apache.org/jira/browse/HADOOP-8411
 Project: Hadoop Common
  Issue Type: Bug
  Components: util
Affects Versions: 1.1.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: HADOOP-8411-2-branch-1-win.patch, 
 HADOOP-8411-branch-1-win.patch, HADOOP-8411-branch-1-win.patch, 
 HADOOP-8411-branch-1-win.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Jira tracking failures from the summary.





[jira] [Resolved] (HADOOP-8440) HarFileSystem.decodeHarURI fails for URIs whose host contains numbers

2012-05-31 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-8440.
--

  Resolution: Fixed
Target Version/s: HADOOP-1-Windows  (was: 1.1.0)

Committed to the Windows branch. Thanks Ivan.

 HarFileSystem.decodeHarURI fails for URIs whose host contains numbers
 -

 Key: HADOOP-8440
 URL: https://issues.apache.org/jira/browse/HADOOP-8440
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 1.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
Priority: Minor
 Attachments: HADOOP-8440-2-branch-1-win.patch, 
 HADOOP-8440-branch-1-win.patch


 For example, HarFileSystem.decodeHarURI will fail for the following URI:
 har://hdfs-127.0.0.1:51040/user
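
 A rough illustration of the parsing involved (my own sketch, not the actual 
 decodeHarURI code): a har URI's authority embeds the underlying filesystem 
 as '<scheme>-<authority>', so a host that itself contains digits and dots 
 must survive the split on the first '-':

    import java.net.URI;

    public class HarUriProbe {
      public static void main(String[] args) {
        URI uri = URI.create("har://hdfs-127.0.0.1:51040/user");
        String auth = uri.getAuthority();            // "hdfs-127.0.0.1:51040"
        int dash = auth.indexOf('-');
        System.out.println(auth.substring(0, dash));  // hdfs
        System.out.println(auth.substring(dash + 1)); // 127.0.0.1:51040
      }
    }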





[jira] [Resolved] (HADOOP-8374) Improve support for hard link manipulation on Windows

2012-05-31 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-8374.
--

  Resolution: Fixed
Target Version/s: HADOOP-1-Windows  (was: 1.1.0)

Committed to windows branch.

 Improve support for hard link manipulation on Windows
 -

 Key: HADOOP-8374
 URL: https://issues.apache.org/jira/browse/HADOOP-8374
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: HADOOP-8374-1.patch, 
 HADOOP-8374-branch-1-win_hardlinks.patch, 
 HADOOP-8374-branch-1-win_hardlinks.patch


 Hard link support for Windows does not work properly. There is some 
 refactoring needed in the code. Also, the code currently executes the fsutil 
 command to manipulate hard links. fsutil requires admin privileges on recent 
 versions of Windows. The main features needed are the ability to create hard 
 links to a file and count the number of hard links to a file. So we could use 
 mklink to create hard links and write a custom executable to count hard 
 links. Or use a custom executable to do both using Windows APIs.
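
 As an aside: on Java 7+ JVMs (not an option for the Java 6 era branch-1 
 targets), hard links can be created portably via java.nio.file; link 
 counting remains platform-specific, which is why a custom executable is 
 proposed above. A sketch under those assumptions:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class HardLinkSketch {
      public static void main(String[] args) throws IOException {
        Path target = Paths.get("data.txt");
        Files.write(target, "hello".getBytes());
        Path link = Paths.get("data-link.txt");
        Files.deleteIfExists(link);
        // Hard links, unlike symlinks, need no admin privileges on Windows.
        Files.createLink(link, target);
        try {
          // The "unix:nlink" attribute is exposed on POSIX systems only.
          System.out.println(Files.getAttribute(target, "unix:nlink"));
        } catch (UnsupportedOperationException e) {
          System.out.println("link count not exposed on this platform");
        }
      }
    }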





[jira] [Resolved] (HADOOP-8223) Initial patch for branch-1-win

2012-05-31 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-8223.
--

  Resolution: Fixed
Target Version/s: HADOOP-1-Windows

 Initial patch for branch-1-win
 --

 Key: HADOOP-8223
 URL: https://issues.apache.org/jira/browse/HADOOP-8223
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: native
Reporter: Sanjay Radia
 Attachments: hadoop-8223-2.patch, hadoop-8223.patch








[jira] [Created] (HADOOP-8366) Use ProtoBuf for RpcResponseHeader

2012-05-07 Thread Sanjay Radia (JIRA)
Sanjay Radia created HADOOP-8366:


 Summary: Use ProtoBuf for RpcResponseHeader
 Key: HADOOP-8366
 URL: https://issues.apache.org/jira/browse/HADOOP-8366
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: HA Branch (HDFS-1623)
Reporter: Sanjay Radia
Assignee: Sanjay Radia
Priority: Blocker








[jira] [Resolved] (HADOOP-7775) RPC Layer improvements to support protocol compatibility

2012-05-07 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-7775.
--

Resolution: Fixed

All subtasks done.

 RPC Layer improvements to support protocol compatibility
 

 Key: HADOOP-7775
 URL: https://issues.apache.org/jira/browse/HADOOP-7775
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia







[jira] [Created] (HADOOP-8367) ProtoBufRpcEngine's rpc request header does not need declaringClass name

2012-05-07 Thread Sanjay Radia (JIRA)
Sanjay Radia created HADOOP-8367:


 Summary: ProtoBufRpcEngine's rpc request header does not need 
declaringClass name
 Key: HADOOP-8367
 URL: https://issues.apache.org/jira/browse/HADOOP-8367
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Sanjay Radia
Assignee: Sanjay Radia








[jira] [Created] (HADOOP-8285) Use ProtoBuf for RpcPayLoadHeader

2012-04-16 Thread Sanjay Radia (Created) (JIRA)
Use ProtoBuf for RpcPayLoadHeader
-

 Key: HADOOP-8285
 URL: https://issues.apache.org/jira/browse/HADOOP-8285
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia








[jira] [Created] (HADOOP-8184) ProtoBuf RPC engine does not need its own reply packet - it can use the IPC layer reply packet.

2012-03-19 Thread Sanjay Radia (Created) (JIRA)
ProtoBuf RPC engine does not need its own reply packet - it can use the IPC 
layer reply packet.
--

 Key: HADOOP-8184
 URL: https://issues.apache.org/jira/browse/HADOOP-8184
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia








[jira] [Created] (HADOOP-8161) Support for Azure Storage

2012-03-12 Thread Sanjay Radia (Created) (JIRA)
Support for Azure Storage
-

 Key: HADOOP-8161
 URL: https://issues.apache.org/jira/browse/HADOOP-8161
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia








[jira] [Created] (HADOOP-8120) Refactor IPC client and server so shared parts in separate class

2012-02-28 Thread Sanjay Radia (Created) (JIRA)
Refactor IPC client and server so shared parts in separate class


 Key: HADOOP-8120
 URL: https://issues.apache.org/jira/browse/HADOOP-8120
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia
Priority: Minor








[jira] [Created] (HADOOP-8101) Security changes for Hadoop for Windows

2012-02-22 Thread Sanjay Radia (Created) (JIRA)
Security changes for Hadoop for Windows
---

 Key: HADOOP-8101
 URL: https://issues.apache.org/jira/browse/HADOOP-8101
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia








[jira] [Created] (HADOOP-8102) General Util Changes for Hadoop for Windows

2012-02-22 Thread Sanjay Radia (Created) (JIRA)
General Util  Changes for Hadoop for Windows


 Key: HADOOP-8102
 URL: https://issues.apache.org/jira/browse/HADOOP-8102
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia








[jira] [Created] (HADOOP-8103) Hadoop-bin commands for windows

2012-02-22 Thread Sanjay Radia (Created) (JIRA)
Hadoop-bin commands for windows
---

 Key: HADOOP-8103
 URL: https://issues.apache.org/jira/browse/HADOOP-8103
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia








[jira] [Created] (HADOOP-7913) Fix bug in ProtoBufRpcEngine -

2011-12-12 Thread Sanjay Radia (Created) (JIRA)
Fix bug in ProtoBufRpcEngine - 
---

 Key: HADOOP-7913
 URL: https://issues.apache.org/jira/browse/HADOOP-7913
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia
Assignee: Sanjay Radia


The parent Jira moved the multiple-protocol support to a lower layer; it 
introduced a bug: the paramClass parameter to the #server() constructor should 
be null so that it uses the registered rpc request deserializers.





[jira] [Created] (HADOOP-7862) Move the support for multiple protocols to lower layer so that Writable, PB and Avro can all use it

2011-11-25 Thread Sanjay Radia (Created) (JIRA)
Move the support for multiple protocols to lower layer so that Writable, PB and 
Avro can all use it
---

 Key: HADOOP-7862
 URL: https://issues.apache.org/jira/browse/HADOOP-7862
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia
Assignee: Sanjay Radia








[jira] [Created] (HADOOP-7776) Make the Ipc-Header in a RPC-Payload an explicit header

2011-10-26 Thread Sanjay Radia (Created) (JIRA)
Make the Ipc-Header in a RPC-Payload an explicit header
---

 Key: HADOOP-7776
 URL: https://issues.apache.org/jira/browse/HADOOP-7776
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia
Assignee: Sanjay Radia








[jira] [Created] (HADOOP-7716) Multiple protocol does not log the protocol name only the class which may not be the same

2011-10-03 Thread Sanjay Radia (Created) (JIRA)
Multiple protocol does not log the protocol name only the class which may not 
be the same
-

 Key: HADOOP-7716
 URL: https://issues.apache.org/jira/browse/HADOOP-7716
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia
Priority: Minor








[jira] [Created] (HADOOP-7719) On protocol version mismatch, return the list of valid versions for the requested protocol

2011-10-03 Thread Sanjay Radia (Created) (JIRA)
On protocol version mismatch, return the list of valid versions for the 
requested protocol
--

 Key: HADOOP-7719
 URL: https://issues.apache.org/jira/browse/HADOOP-7719
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia








[jira] [Created] (HADOOP-7687) Make getProtocolSignature public

2011-09-27 Thread Sanjay Radia (Created) (JIRA)
Make getProtocolSignature public 
-

 Key: HADOOP-7687
 URL: https://issues.apache.org/jira/browse/HADOOP-7687
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia
Priority: Minor
 Attachments: protSigPublic.patch







[jira] [Resolved] (HADOOP-7119) add Kerberos HTTP SPNEGO authentication support to Hadoop JT/NN/DN/TT web-consoles

2011-09-11 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-7119.
--

Resolution: Fixed

committed to branch-0.20-security to go into 20.205

 add Kerberos HTTP SPNEGO authentication support to Hadoop JT/NN/DN/TT 
 web-consoles
 --

 Key: HADOOP-7119
 URL: https://issues.apache.org/jira/browse/HADOOP-7119
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Affects Versions: 0.23.0
 Environment: all
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 0.20.205.0, 0.23.0

 Attachments: HADOOP-7119v3.patch, HADOOP-7119v4-amendment.patch, 
 HADOOP-7119v4.patch, HADOOP-7119v5.patch, HADOOP-7119v6.patch, 
 ha-common-01.patch, ha-common-02.patch, ha-commons.patch, 
 spnego-20-security.patch, spnego-20-security2.patch, 
 spnego-20-security3.patch, spnego-20-security4.patch


 Currently the JT/NN/DN/TT web-consoles don't support any form of 
 authentication.
 Hadoop RPC API already supports Kerberos authentication.
 Kerberos enables single sign-on.
 Popular browsers (Firefox and Internet Explorer) have support for Kerberos 
 HTTP SPNEGO.
 Adding support for Kerberos HTTP SPNEGO to Hadoop web consoles would provide 
 a unified authentication mechanism and single sign-on for Hadoop web UI and 
 Hadoop RPC.





[jira] [Created] (HADOOP-7557) Make IPC and RPC headers be extensible

2011-08-19 Thread Sanjay Radia (JIRA)
Make  IPC and RPC headers be extensible
---

 Key: HADOOP-7557
 URL: https://issues.apache.org/jira/browse/HADOOP-7557
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia








[jira] [Created] (HADOOP-7524) Change RPC to allow multiple protocols including multiple versions of the same protocol

2011-08-07 Thread Sanjay Radia (JIRA)
Change RPC to allow multiple protocols including multiple versions of the same 
protocol
---

 Key: HADOOP-7524
 URL: https://issues.apache.org/jira/browse/HADOOP-7524
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia








[jira] [Created] (HADOOP-7479) Separate data types

2011-07-20 Thread Sanjay Radia (JIRA)
Separate data types
--

 Key: HADOOP-7479
 URL: https://issues.apache.org/jira/browse/HADOOP-7479
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia
Assignee: Sanjay Radia








[jira] [Resolved] (HADOOP-7479) Separate data types

2011-07-20 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-7479.
--

Resolution: Duplicate

 Separate data types
 ---

 Key: HADOOP-7479
 URL: https://issues.apache.org/jira/browse/HADOOP-7479
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia
Assignee: Sanjay Radia
 Fix For: 0.23.0








[jira] [Created] (HADOOP-7426) User Guide for how to use viewfs with federation

2011-06-24 Thread Sanjay Radia (JIRA)
User Guide for how to use viewfs with federation


 Key: HADOOP-7426
 URL: https://issues.apache.org/jira/browse/HADOOP-7426
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia
Priority: Minor








[jira] [Created] (HADOOP-7391) Copy the interface classification documentation of HADOOP-5073 into javadoc

2011-06-14 Thread Sanjay Radia (JIRA)
Copy the interface classification documentation of HADOOP-5073 into javadoc


 Key: HADOOP-7391
 URL: https://issues.apache.org/jira/browse/HADOOP-7391
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Sanjay Radia
Assignee: Sanjay Radia
 Fix For: 0.22.0


The documentation for interface classification in Jira Hadoop-5073 was not 
copied to the Javadoc
of the classification.





[jira] [Created] (HADOOP-7375) Add resolvePath method to FileContext

2011-06-09 Thread Sanjay Radia (JIRA)
Add resolvePath method to FileContext
-

 Key: HADOOP-7375
 URL: https://issues.apache.org/jira/browse/HADOOP-7375
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia
 Fix For: 0.23.0
 Attachments: resolvePath1.patch





[jira] [Created] (HADOOP-7347) HDFS Wire compatibility

2011-05-31 Thread Sanjay Radia (JIRA)
HDFS Wire compatibility
---

 Key: HADOOP-7347
 URL: https://issues.apache.org/jira/browse/HADOOP-7347
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia
 Fix For: 0.23.0






[jira] [Created] (HADOOP-7284) Trash and shell's rm does not work for viewfs

2011-05-12 Thread Sanjay Radia (JIRA)
Trash and shell's rm does not work for viewfs
-

 Key: HADOOP-7284
 URL: https://issues.apache.org/jira/browse/HADOOP-7284
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Sanjay Radia
Assignee: Sanjay Radia
 Fix For: 0.23.0






[jira] Created: (HADOOP-7054) Change NN LoadGenerator to use the new FileContext api

2010-11-29 Thread Sanjay Radia (JIRA)
Change NN LoadGenerator to use the new FileContext api
--

 Key: HADOOP-7054
 URL: https://issues.apache.org/jira/browse/HADOOP-7054
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia
 Fix For: 0.22.0







[jira] Created: (HADOOP-7018) FileContext's list operation should return local names in the status object rather than the full path.

2010-11-03 Thread Sanjay Radia (JIRA)
FileContext's list operation should return local names in the status object 
rather than the full path.
--

 Key: HADOOP-7018
 URL: https://issues.apache.org/jira/browse/HADOOP-7018
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia







[jira] Created: (HADOOP-6903) Make AbstractFileSystem's methods public to allow filter-Fs like implementations in a different package than fs

2010-08-06 Thread Sanjay Radia (JIRA)
Make AbstractFileSystem's methods public to allow filter-Fs like implementations 
in a different package than fs


 Key: HADOOP-6903
 URL: https://issues.apache.org/jira/browse/HADOOP-6903
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia


Make AbstractFileSystem's methods public to allow filter-Fs like implementations 
in a different package than fs




[jira] Created: (HADOOP-6899) FileSystem#setWorkingDir() does not work for relative names

2010-08-04 Thread Sanjay Radia (JIRA)
FileSystem#setWorkingDir() does not work for relative names
---

 Key: HADOOP-6899
 URL: https://issues.apache.org/jira/browse/HADOOP-6899
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.20.2
Reporter: Sanjay Radia
Assignee: Sanjay Radia
 Fix For: 0.22.0


FileSystem#setWorkingDir() does not work for relative names




[jira] Created: (HADOOP-6775) Update Hadoop Common Site's

2010-05-19 Thread Sanjay Radia (JIRA)
Update Hadoop Common Site's 


 Key: HADOOP-6775
 URL: https://issues.apache.org/jira/browse/HADOOP-6775
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia
 Fix For: site


Add documentation on our interface classification scheme to the common site.




[jira] Resolved: (HADOOP-6421) Symbolic links

2010-02-16 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia resolved HADOOP-6421.
--

  Resolution: Fixed
Release Note: 
 Adds Symbolic links to FileContext, AbstractFileSystem.
 It also adds a limited implementation for the local file system  (RawLocalFs) 
that allows local symlinks. 
Hadoop Flags: [Reviewed]

Thanks Eli.

 Symbolic links
 --

 Key: HADOOP-6421
 URL: https://issues.apache.org/jira/browse/HADOOP-6421
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Eli Collins
Assignee: Eli Collins
 Attachments: symlink-25-common.patch, symlink-26-common.patch, 
 symlink-26-common.patch, symlink24-common.patch, symlink27-common.patch, 
 symlink28-common.patch, symlink29-common.patch, symlink29-common.patch, 
 symlink29-common.patch, symlink30-common.patch, symlink31-common.patch, 
 symlink32-common.patch, symlink33-common.patch, symlink34-common.patch, 
 symlink35-common.patch, symlink36-common.patch, symlink37-common.patch, 
 symlink38-common.patch, symlink39-common.patch


 Here's a jira for the common parts of HDFS-245, mostly changes to FileContext 
 and AbstractFileSystem.




Re: HDFS-758 in Hadoop-21 , Updates to Namenode health page

2009-12-07 Thread Sanjay Radia


On Nov 25, 2009, at 12:46 PM, Allen Wittenauer wrote:



Then you'll have no issues patching other things in 0.21 that are actual 
bug fixes that also meet this criteria, right? Or does this only apply to 
things that Yahoo! is hitting/deemed worthy?



Allen raises a good point that the rest of the community may not need some 
of the features that Yahoo finds useful internally. It may clutter Hadoop 
unnecessarily.
Most of these admin GUI improvements are pluggable.
Perhaps this particular plugin did not even need to go into trunk. It could 
have been made available as a separate downloadable or contrib module. This 
way folks can use it across releases and only if they need it.


Further, there are a few new GUI improvements in the hadoop community
that are proprietary - we should make it easier to create new admin
plugins. This way users can create plugins that are useful to them
internally; it also allows companies to create proprietary plugins for
their customers.
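
A minimal sketch of what such a plugin can look like (assumes the ServicePlugin hook from HADOOP-5257, wired in via a configuration key such as dfs.namenode.plugins; the class name is hypothetical):

    import java.io.IOException;
    import org.apache.hadoop.util.ServicePlugin;

    public class AdminStatusPlugin implements ServicePlugin {
      public void start(Object service) {
        // 'service' is the daemon we are plugged into (e.g. the NameNode);
        // an admin plugin could register its own status page here.
      }
      public void stop() {
        // Tear down whatever start() registered.
      }
      public void close() throws IOException {
        stop();
      }
    }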


sanjay




On 11/25/09 12:03 PM, Tsz Wo (Nicholas), Sze s29752-hadoop...@yahoo.com wrote:

 +1 on committing it to 0.21

 I also agree that it does not impact the 0.21 release since the patch is
 already done. The argument for not committing it to 0.21 would be either
 (1) the patch is not safe, or (2) the patch is not that useful. I don't
 see that either is the case here.

 Nicholas Sze




 - Original Message 
 From: Jakob Homan jho...@yahoo-inc.com
 To: common-dev@hadoop.apache.org
 Sent: Wed, November 25, 2009 11:31:08 AM
 Subject: Re: HDFS-758 in Hadoop-21, Updates to Namenode health page


 +1. Backporting this does not in any way impact the release of 21.
 -Jakob
 Hairong Kuang wrote:
 +1. Although this is a new feature, I'd like to have it committed to 0.21
 since we have had so many issues with delayed decommissioning recently.

 Hairong


 On 11/24/09 6:06 PM, Suresh Srinivas wrote:

 +1. This will also help debug the issues when decommissioning takes a
 long time to complete.


 On 11/23/09 7:36 PM, Jitendra Nath Pandey wrote:



 Hi,
    We will be committing some changes to the Namenode Health page
 (dfshealth.jsp) as part of the fix in HDFS-758. This will enable us to
 monitor the progress of decommissioning of datanodes more effectively.

    Summary of changes:
    1. A new link on the page for Decommissioning nodes.
    2. This link will point to a new page with details about the
       decommissioning status of each node, which include:
       a) Number of under-replicated blocks on the node.
       b) Number of blocks with no live replica (i.e. all its replicas
          are on decommissioning nodes).
       c) Number of under-replicated blocks in open files.
       d) Time since decommissioning started.
    3. The main page will also contain the total number of
       under-replicated blocks in the cluster.

 Thanks
 jitendra








Re: Private, LimitedPrivate and contrib modules

2009-11-05 Thread Sanjay Radia


Sorry for the late reply; I missed it in my inbox.

On Sep 18, 2009, at 1:29 PM, Tom White wrote:


I'm trying to better understand the meaning of the annotations defined
in org.apache.hadoop.classification.InterfaceAudience.

1. Private is documented as being "Intended for use only within Hadoop
itself". Does this mean the whole Hadoop project, or the subproject
that the annotated element is in? (Or another scope?)




Private means private to the subproject (i.e. private to HDFS, to MR, etc).
We should update the doc to clarify this.
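
A minimal sketch of how the tags read in code (the class names here are hypothetical):

    import org.apache.hadoop.classification.InterfaceAudience;
    import org.apache.hadoop.classification.InterfaceStability;

    // Private: usable only within the subproject that declares it.
    @InterfaceAudience.Private
    @InterfaceStability.Unstable
    class BlockPlacementHelper { }

    // LimitedPrivate: usable only by the named subprojects.
    @InterfaceAudience.LimitedPrivate({"MapReduce"})
    @InterfaceStability.Evolving
    class ShuffleHook { }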


2. Is a contrib module considered to be a part of the subproject? For
example, if something is marked as LimitedPrivate in Common, say, with
intended audience MapReduce, can MapReduce contrib modules use it?



I believe the answer should be no.

Or, assuming the second meaning for 1, if something is marked as
Private in MapReduce, can MapReduce contrib modules use it?



Again I believe the answer should be no.

Any concrete examples that we can use to examine/discuss this further?


Thanks,
Tom





Re: Private, LimitedPrivate and contrib modules

2009-11-05 Thread Sanjay Radia


On Nov 5, 2009, at 12:15 PM, Dhruba Borthakur wrote:


Hi sanjay,

Most of the contrib modules are in the same package as their containers.
For example, the fair share scheduler is in contrib but its package name
is org.apache.hadoop.mapred. Doesn't this mean that the fair-share
scheduler code can use LimitedPrivate methods from the
org.apache.hadoop.mapred package?





The java public/private modifier does not imply an API's visibility.
We have often made certain APIs java-public because of Java's limited
visibility rules (e.g. if Java had subpackage-private, then a number of
APIs would change from java-public to java-subpackage-private).
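
A one-class sketch of the distinction (the class name is hypothetical):

    import org.apache.hadoop.classification.InterfaceAudience;

    // java-public only because Java lacks subpackage-private visibility;
    // the audience tag, not the java modifier, defines who may use it.
    @InterfaceAudience.Private
    public class InternalNameNodeOps {
      public void doInternalThing() { }
    }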


Generally, contrib should be using APIs that are audience-public.

The FS/C (fair-share and capacity) schedulers are very good examples. I
assume that the fair-share scheduler is using the scheduler's plugin
interfaces. Is it using any other internal interfaces of MR?


Plugin interfaces or abstract classes that are to be used by implementors
are currently planned to be marked audience-public (see
AbstractFileSystem).
Makes sense?

As a side note: perhaps we need to add a tag to audience-public, such as
applicationUse or implementorsOfInterface or something like that
(like the tag for limited-private).


sanjay


thanks,
dhruba

On Thu, Nov 5, 2009 at 11:05 AM, Sanjay Radia sra...@yahoo-inc.com  
wrote:



 Sorry for the late reply; I missed it in my inbox.

 On Sep 18, 2009, at 1:29 PM, Tom White wrote:

  I'm trying to better understand the meaning of the annotations defined
 in org.apache.hadoop.classification.InterfaceAudience.

 1. Private is documented as being "Intended for use only within Hadoop
 itself". Does this mean the whole Hadoop project, or the subproject
 that the annotated element is in? (Or another scope?)



 Private means private to the subproject (i.e. private to HDFS, to MR, etc).

 We should update the doc to clarify this.


 2. Is a contrib module considered to be a part of the subproject? For
 example, if something is marked as LimitedPrivate in Common, say, with
 intended audience MapReduce, can MapReduce contrib modules use it?


 I believe the answer should be no.

 Or, assuming the second meaning for 1, if something is marked as
 Private in MapReduce, can MapReduce contrib modules use it?


 Again I believe the answer should be no.

 Any concrete examples that we can use to examine/discuss this further?



 Thanks,
 Tom





--
Connect to me at http://www.facebook.com/dhruba





[jira] Created: (HADOOP-6356) Add a Cache for AbstractFileSystem in the new FileContext/AbstractFileSystem framework.

2009-11-02 Thread Sanjay Radia (JIRA)
Add a Cache for AbstractFileSystem in the new FileContext/AbstractFileSystem 
framework.
---

 Key: HADOOP-6356
 URL: https://issues.apache.org/jira/browse/HADOOP-6356
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.22.0
Reporter: Sanjay Radia
Assignee: Sanjay Radia
 Fix For: 0.22.0


The new filesystem framework, FileContext and AbstractFileSystem, does not
implement a cache for AbstractFileSystem.
This Jira proposes to add a cache to the new framework, just like the one
in the old FileSystem.
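
A minimal sketch of the idea only, not the committed design (the real cache key would also need the user/UGI, and eviction is ignored here):

    import java.net.URI;
    import java.util.concurrent.ConcurrentHashMap;
    import org.apache.hadoop.fs.AbstractFileSystem;

    public class AfsCacheSketch {
      // Key instances by scheme + authority, like the old FileSystem cache.
      private final ConcurrentHashMap<String, AbstractFileSystem> cache =
          new ConcurrentHashMap<String, AbstractFileSystem>();

      public AbstractFileSystem getOrCache(URI uri, AbstractFileSystem candidate) {
        String key = uri.getScheme() + "://" + uri.getAuthority();
        AbstractFileSystem prev = cache.putIfAbsent(key, candidate);
        return prev != null ? prev : candidate; // reuse a cached instance if present
      }
    }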

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-6327) Fix build error for one of the FileContext errors

2009-10-22 Thread Sanjay Radia (JIRA)
Fix build error for one of the FileContext errors
-

 Key: HADOOP-6327
 URL: https://issues.apache.org/jira/browse/HADOOP-6327
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Sanjay Radia


The build fails in Hudson:
org.apache.hadoop.fs.TestLocalFSFileContextMainOperations.testWorkingDirectory
(from TestLocalFSFileContextMainOperations)
Failing for the past 5 builds (since Failed #272). Took 88 ms.

Error Message

chmod: changing permissions of `/tmp/existingDir': Operation not permitted 

Stacktrace

org.apache.hadoop.util.Shell$ExitCodeException: chmod: changing permissions of 
`/tmp/existingDir': Operation not permitted

at org.apache.hadoop.util.Shell.runCommand(Shell.java:243)
at org.apache.hadoop.util.Shell.run(Shell.java:170)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:363)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:449)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:432)
at 
org.apache.hadoop.fs.RawLocalFileSystem.execCommand(RawLocalFileSystem.java:545)
at 
org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:537)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:347)
at 
org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:184)
at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:769)
at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:539)
at 
org.apache.hadoop.fs.FileContextMainOperationsBaseTest.testWorkingDirectory(FileContextMainOperationsBaseTest.java:170)
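
The failure is the fixed /tmp path: on a shared build machine, /tmp/existingDir can be owned by another user, so chmod fails. A sketch of the usual remedy (an assumption about the eventual fix, using the standard test.build.data property):

    import org.apache.hadoop.fs.Path;

    public class TestPaths {
      // Root test files under the per-build directory instead of /tmp,
      // so concurrent users on the build host cannot collide.
      static Path existingDir() {
        String base = System.getProperty("test.build.data", "build/test/data");
        return new Path(base, "existingDir");
      }
    }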


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: [VOTE] Should we freeze the public stable APIs after 0.21.0?

2009-09-28 Thread Sanjay Radia


+1

On Sep 25, 2009, at 10:16 AM, Owen O'Malley wrote:


We are getting closer to being able to release a Common/HDFS/MapReduce
1.0. I'd hope that we'll get the last set of things in to 0.22 that
mean that it would be labelled 1.0. Toward that end, I'd like to start
locking down the APIs that we've marked as public stable. What that
would mean is that any interface that is tagged with the
@InterfaceStability.Stable and @InterfaceAudience.Public in the 0.21.0
release should not have any changes committed that require a
recompilation of client code. This will provide a stable basis for our
users' applications and reduce the costs of upgrades.

Clearly, I'm +1.

-- Owen





Re: Towards Hadoop 1.0: Stronger API Compatibility from 0.21 onwards

2009-09-28 Thread Sanjay Radia


On Sep 28, 2009, at 3:15 AM, Steve Loughran wrote:


Dhruba Borthakur wrote:
 It is really nice to have wire-compatibility between clients and servers
 running different versions of hadoop. The reason we would like this is
 because we can allow the same client (Hive, etc) to submit jobs to two
 different clusters running different versions of hadoop. But I am not
 stuck up on the name of the release that supports wire-compatibility;
 it can be either 1.0 or something later than that.

API compatibility  +1
Data compatibility +1
Job Q compatibility -1
Wire compatibility +0


That's stability of the job submission network protocol you are looking
for there.
  * We need a job submission API that is designed to work over long-haul
    links and versions
  * It does not have to be the same as anything used in-cluster
  * It does not actually need to run in the JobTracker. An independent
    service bridging the stable long-haul API to an unstable datacentre
    protocol does work, though authentication and user-rights are a
    troublespot






I think you are misinterpreting what Job Q compatibility means.
It is about jobs already in the queue surviving an upgrade across a
release.


See my initial proposal on Jan 16th:
https://issues.apache.org/jira/browse/HADOOP-5071?focusedCommentId=12664691&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12664691


Doug argued that it is nice to have but not required for 1.0 - it can be
added later.



sanjay


Similarly, it would be good to have a stable long-haul HDFS protocol, such
as FTP or WebDAV. Again, no need to build it into the namenode.

see http://www.slideshare.net/steve_l/long-haul-hadoop
and commentary under http://wiki.apache.org/hadoop/BristolHadoopWorkshop





Re: Towards Hadoop 1.0: Stronger API Compatibility from 0.21 onwards

2009-09-25 Thread Sanjay Radia


On Sep 25, 2009, at 12:03 PM, Allen Wittenauer wrote:


On 9/25/09 10:13 AM, Dhruba Borthakur dhr...@gmail.com wrote:
 It is really nice to have wire-compatibility between clients and servers
 running different versions of hadoop. The reason we would like this is
 because we can allow the same client (Hive, etc) to submit jobs to two
 different clusters running different versions of hadoop. But I am not
 stuck up on the name of the release that supports wire-compatibility;
 it can be either 1.0 or something later than that.

To me, the lack of wire compatibility will make Hadoop 1.0 in name only,
when in reality it is more like 0.80. :(


My sentiments exactly, though I could learn to live with it.








[jira] Created: (HADOOP-6271) Fix FileContext to allow both recursive and non recursive create and mkdir

2009-09-18 Thread Sanjay Radia (JIRA)
Fix FileContext to allow both recursive and non recursive create and mkdir
--

 Key: HADOOP-6271
 URL: https://issues.apache.org/jira/browse/HADOOP-6271
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Sanjay Radia
Assignee: Sanjay Radia


Modify FileContext to allow recursive and non-recursive create and mkdir (see 
HADOOP-4952)
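
A minimal sketch of both modes against FileContext, as the API eventually looked (treat the exact options and paths as assumptions of this sketch):

    import java.util.EnumSet;
    import org.apache.hadoop.fs.CreateFlag;
    import org.apache.hadoop.fs.FileContext;
    import org.apache.hadoop.fs.Options.CreateOpts;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.permission.FsPermission;

    public class CreateMkdirDemo {
      public static void main(String[] args) throws Exception {
        FileContext fc = FileContext.getFileContext();
        // Non-recursive mkdir: fails if the parent /a does not exist.
        fc.mkdir(new Path("/a/b"), FsPermission.getDefault(), false);
        // Recursive create: missing parent directories are created.
        fc.create(new Path("/a/b/c/file"),
                  EnumSet.of(CreateFlag.CREATE),
                  CreateOpts.createParent()).close();
      }
    }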

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-6265) Remove deprecated protected methods added to FileSystem to support FileContext.

2009-09-17 Thread Sanjay Radia (JIRA)
Remove deprecated protected methods added to FileSystem to support FileContext.
---

 Key: HADOOP-6265
 URL: https://issues.apache.org/jira/browse/HADOOP-6265
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Sanjay Radia


These deprecated methods can be removed when FileContext is implemented on
top of the new file system class, as described in HADOOP-6223.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-6266) Cleanup class Path

2009-09-17 Thread Sanjay Radia (JIRA)
Cleanup class Path
--

 Key: HADOOP-6266
 URL: https://issues.apache.org/jira/browse/HADOOP-6266
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Sanjay Radia


Class Path is a key class that needs to be better documented and cleaned up 
(removal of deprecated methods, etc).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-6223) New improved FileSystem interface for those implementing new file systems.

2009-08-30 Thread Sanjay Radia (JIRA)
New improved FileSystem interface for those implementing new file systems.
---

 Key: HADOOP-6223
 URL: https://issues.apache.org/jira/browse/HADOOP-6223
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia


The FileContext API (HADOOP-4952) provides an improved interface for the
application writer.
This lets us simplify the FileSystem API, since it will no longer need to
deal with notions of a default filesystem [ / ], working directory, and
config defaults for block size, replication factor, etc. Further, it will
not need the many overloaded methods for create() and open(), since the
FileContext API provides that convenience.
The FileSystem API can thus be simplified and restricted to those
implementing new file systems.


This jira proposes that we create the new file system API, and deprecate
the FileSystem API after a few releases.
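
A minimal sketch of the application-writer side (the file name and content are illustrative):

    import java.util.EnumSet;
    import org.apache.hadoop.fs.CreateFlag;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileContext;
    import org.apache.hadoop.fs.Path;

    public class FileContextDemo {
      public static void main(String[] args) throws Exception {
        // FileContext carries the default filesystem, the working
        // directory, and config defaults (block size, replication, ...).
        FileContext fc = FileContext.getFileContext();
        FSDataOutputStream out =
            fc.create(new Path("out.txt"),            // relative to the wd
                      EnumSet.of(CreateFlag.CREATE)); // no overload explosion
        out.writeUTF("hello");
        out.close();
      }
    }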

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Towards Hadoop 1.0: Stronger API Compatibility from 0.21 onwards

2009-08-28 Thread Sanjay Radia


Hadoop 1.0's goal was compatibility on several fronts.
(See https://issues.apache.org/jira/browse/HADOOP-5071 for details.)

Due to the amount of work involved, it has been necessary to split this
work across several releases prior to 1.0.


Turns out that release 0.21 has a number of Jiras targeted towards API
and config stability.
Further, in 0.21 we are tagging interfaces with a classification of their
intended audience (scope) and their stability
(see HADOOP-5073 for the classification).
Post-1.0, stable interfaces will remain stable (both syntax and
semantics) according to the proposed 1.0 rules.
Hadoop's pre-1.0 rules allow interfaces to be changed regardless of
stability, as long as one allows 2 releases of deprecation.
(See http://wiki.apache.org/hadoop/Roadmap for the current, i.e.
pre-1.0, rules.)


So how do we ensure that stable interfaces remain stable (both syntax and
semantics) between 0.21 and 1.0?
I propose that we honor the compatibility of stable interfaces from
release 0.21 onwards;

i.e. apply the same post-1.0 rules to pre-1.0 releases.

The actual discussion of what needs to be stable or not belongs inside
Jira HADOOP-5073, not in this email thread;
I would like to use this email thread to discuss the proposal of honoring
compatibility of stable interfaces prior to 1.0.


Feedback?

sanjay




Re: [VOTE] Push back code freeze for 0.21

2009-07-24 Thread Sanjay Radia


On Jul 24, 2009, at 5:11 PM, Eric Baldeschwieler wrote:


AGGHH!

Let's not push out the general freeze even as far as Sept 4th if
possible.  I'd suggest Oct 31st.


You mean Aug 31?
sanjay



If we push out the general freeze
until append is complete, we will delay having a stable release with
append by at least an additional month.

E14



 From: Nigel Daley nda...@yahoo-inc.com
 Date: Fri, 24 Jul 2009 16:30:55 -0700
 To: common-dev@hadoop.apache.org
 Subject: Re: [VOTE] Push back code freeze for 0.21


 There's a 3rd option I'd support: feature freeze on Sept 4, except
 for Append, which gets a 2-week extension. This allows us to start
 stabilizing the rest of the code base while Append is finished up.

 Nige

 On Jul 24, 2009, at 4:25 PM, Konstantin Boudnik wrote:

  I second Konstantin's point: FI tests are likely to take a pretty
  hefty chunk of overall dev. time. Considering that people'd be
  facing a certain learning curve to master this new technology, we
  are likely to need more time for the completion of the development
  of the code & tests.
 
  +1 for farther push back, e.g. until 9/18/09
 
  Konstantin (aka Cos)
 
  On 7/24/09 4:18 PM, Konstantin Shvachko wrote:
  I would like to clarify the append plans.
  Right now according to our planning schedule we need about
  8 weeks to complete the implementation of the design.
  This will include all the functionality and unit testing
  according to the test plan.
 
  September 4 deadline gives us about 6 weeks to commit the features,
  which is 2 weeks short of our schedule.
  This is, of course, if the implementation does not go much faster
  than we predicted :-).
 
  So, we can either move the freeze a couple of weeks further ahead,
  or we can also use some help from the people familiar with the
  design.
  E.g. there is a lot of hours scheduled to unit test writing using
  fault
  injection, etc.
 
  Thanks,
  --Konstantin
 
  Amr Awadallah wrote:
  +1 for stable append.
 
  -- amr
 
  Dhruba Borthakur wrote:
  +1
 
 
 
 
 
  On 7/24/09, Jim Kellerman (POWERSET) jim.keller...@microsoft.com wrote:
 
  +1
 
 
  -Original Message-
  From: Owen O'Malley [mailto:omal...@apache.org]
  Sent: Friday, July 24, 2009 1:11 PM
  To: common-dev@hadoop.apache.org
  Subject: [VOTE] Push back code freeze for 0.21
 
  I'd like to push the date for the code freeze back to 4
  September to
  give the file append more time to be finished well. Clearly,
  I'm +1.
 
  -- Owen
 
 
 
 
  --
  With best regards,
Konstantin Boudnik (aka Cos)
 
 Yahoo! Grid Computing
 +1 (408) 349-4049
 
  2CAC 8312 4870 D885 8616  6115 220F 6980 1F27 E622
  Attention! Streams of consciousness are disallowed
 



 -- End of Forwarded Message






[jira] Created: (HADOOP-6147) Hadoop Filesystem Plugin for Amazon Elastic Block Storage (EBS)

2009-07-13 Thread Sanjay Radia (JIRA)
Hadoop Filesystem Plugin for Amazon Elastic Block Storage (EBS)
---

 Key: HADOOP-6147
 URL: https://issues.apache.org/jira/browse/HADOOP-6147
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Sanjay Radia


Amazon has introduced a new storage mechanism called EBS. Supposedly it is
a lot more efficient than S3 for Hadoop.
A filesystem plugin for Hadoop would be very useful for the community.
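
A sketch of how such a plugin would be wired in (the "ebs" scheme and the EbsFileSystem class are hypothetical; Hadoop resolves fs.<scheme>.impl to the FileSystem implementation class):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class EbsWiring {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical implementation class for the hypothetical scheme.
        conf.set("fs.ebs.impl", "org.example.fs.EbsFileSystem");
        FileSystem fs = new Path("ebs://volume1/data").getFileSystem(conf);
        System.out.println(fs.getUri());
      }
    }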

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.