[jira] [Commented] (HDFS-6994) libhdfs3 - A native C/C++ HDFS client

2014-12-29 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260874#comment-14260874
 ] 

Binglin Chang commented on HDFS-6994:
-

About adding more tests: we should add MiniDFSCluster support. We could reuse 
native_mini_dfs.h from libhdfs, but it has some limitations:
1. It lacks some functionality needed to do all the tests, e.g. starting/stopping 
a datanode or corrupting a file.
2. It adds a dependency on JNI.
3. Adding method support to the native MiniDFSCluster involves a lot of work 
(getting method IDs, type conversion, etc.).
I have another idea for doing this (sketched below):
1. Add a CLI-like interface to MiniDFSCluster on the Java side. Supporting the 
most commonly used MiniDFSCluster methods as CLI commands should be easy using 
reflection and JSON.
2. On the libhdfs3 side, tests can start the MiniDFSCluster CLI process and call 
those methods via a CLI+JSON protocol.
If you think this is OK, I can create a task and work on it.
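
To make the idea concrete, here is a minimal sketch of the Java-side dispatch 
(illustrative only; every name here is hypothetical, and the JSON wiring over 
stdin/stdout is elided):
{code}
import java.lang.reflect.Method;

// Hypothetical sketch of the CLI+JSON bridge: dispatch a command name to a
// MiniDFSCluster-style method via reflection. In the real tool the arguments
// and return value would travel as JSON lines over stdin/stdout.
class MiniClusterCli {
  private final Object cluster; // e.g. a MiniDFSCluster instance

  MiniClusterCli(Object cluster) {
    this.cluster = cluster;
  }

  // Invoke a no-argument method by name and return its result as a string.
  // A real bridge would parse JSON arguments and serialize the result.
  String invoke(String methodName) throws Exception {
    Method m = cluster.getClass().getMethod(methodName);
    Object result = m.invoke(cluster);
    return String.valueOf(result);
  }
}
{code}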

> libhdfs3 - A native C/C++ HDFS client
> -
>
> Key: HDFS-6994
> URL: https://issues.apache.org/jira/browse/HDFS-6994
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-6994-rpc-8.patch, HDFS-6994.patch
>
>
> Hi all,
> I just got permission to open source libhdfs3, a native C/C++ HDFS client 
> based on the Hadoop RPC protocol and the HDFS Data Transfer Protocol.
> libhdfs3 provides a libhdfs-style C interface and a C++ interface. It 
> supports both Hadoop RPC versions 8 and 9, Namenode HA, and Kerberos 
> authentication.
> libhdfs3 is currently used by Pivotal's HAWQ.
> I'd like to integrate libhdfs3 into the HDFS source code to benefit others.
> You can find the libhdfs3 code on GitHub:
> https://github.com/PivotalRD/libhdfs3
> http://pivotalrd.github.io/libhdfs3/





[jira] [Updated] (HDFS-6633) Support reading new data in a being written file until the file is closed

2014-12-29 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-6633:

Attachment: HDFS-6633-003.patch

Rebased again.

Could someone please take a look? Thanks.

> Support reading new data in a being written file until the file is closed
> -
>
> Key: HDFS-6633
> URL: https://issues.apache.org/jira/browse/HDFS-6633
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Vinayakumar B
> Attachments: HDFS-6633-001.patch, HDFS-6633-002.patch, 
> HDFS-6633-003.patch, h6633_20140707.patch, h6633_20140708.patch
>
>
> When a file is being written, the file length keeps increasing.  If the file 
> is opened for read, the reader first gets the file length and then reads only 
> up to that length.  The reader will not be able to read the new data written 
> afterward.
> We propose adding a new feature so that readers will be able to read all the 
> data until the writer closes the file.





[jira] [Work started] (HDFS-7574) Make cmake work in Windows Visual Studio 2010

2014-12-29 Thread Thanh Do (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-7574 started by Thanh Do.
--
> Make cmake work in Windows Visual Studio 2010
> -
>
> Key: HDFS-7574
> URL: https://issues.apache.org/jira/browse/HDFS-7574
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
> Environment: Windows Visual Studio 2010
>Reporter: Thanh Do
>Assignee: Thanh Do
>
> CMake should be able to generate a solution file for Windows Visual Studio 
> 2010. This is the first of a series of steps toward building libhdfs3 
> successfully on Windows.





[jira] [Created] (HDFS-7574) Make cmake work in Windows Visual Studio 2010

2014-12-29 Thread Thanh Do (JIRA)
Thanh Do created HDFS-7574:
--

 Summary: Make cmake work in Windows Visual Studio 2010
 Key: HDFS-7574
 URL: https://issues.apache.org/jira/browse/HDFS-7574
 Project: Hadoop HDFS
  Issue Type: Sub-task
 Environment: Windows Visual Studio 2010
Reporter: Thanh Do
Assignee: Thanh Do


CMake should be able to generate a solution file for Windows Visual Studio 2010. 
This is the first of a series of steps toward building libhdfs3 successfully 
on Windows.





[jira] [Commented] (HDFS-6994) libhdfs3 - A native C/C++ HDFS client

2014-12-29 Thread Zhanwei Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260647#comment-14260647
 ] 

Zhanwei Wang commented on HDFS-6994:


Sure, once HDFS-7018 is committed, I will start to add the test.

> libhdfs3 - A native C/C++ HDFS client
> -
>
> Key: HDFS-6994
> URL: https://issues.apache.org/jira/browse/HDFS-6994
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-6994-rpc-8.patch, HDFS-6994.patch
>
>
> Hi all,
> I just got permission to open source libhdfs3, a native C/C++ HDFS client 
> based on the Hadoop RPC protocol and the HDFS Data Transfer Protocol.
> libhdfs3 provides a libhdfs-style C interface and a C++ interface. It 
> supports both Hadoop RPC versions 8 and 9, Namenode HA, and Kerberos 
> authentication.
> libhdfs3 is currently used by Pivotal's HAWQ.
> I'd like to integrate libhdfs3 into the HDFS source code to benefit others.
> You can find the libhdfs3 code on GitHub:
> https://github.com/PivotalRD/libhdfs3
> http://pivotalrd.github.io/libhdfs3/





[jira] [Commented] (HDFS-7188) support build libhdfs3 on windows

2014-12-29 Thread Thanh Do (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260493#comment-14260493
 ] 

Thanh Do commented on HDFS-7188:


Thanks for your comment, [~cmccabe].

README.txt contains my own notes and was included by mistake; the same goes for 
krb5_32. These changes were not supposed to be in this patch. Sorry about that.

Regarding the mman library, I think the code is MIT-licensed, but it doesn't 
hurt to rewrite it.

Now I am convinced that we should break this into smaller JIRAs. A few I can 
think of:
1. Add the additional header includes needed by Windows.
2. Make CMake work with Windows Visual Studio 2010.
3. Restructure the platform-specific functions.
4. Implement the platform-specific functions for Windows.

Thoughts?

> support build libhdfs3 on windows
> -
>
> Key: HDFS-7188
> URL: https://issues.apache.org/jira/browse/HDFS-7188
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
> Environment: Windows System, Visual Studio 2010
>Reporter: Zhanwei Wang
>Assignee: Thanh Do
> Attachments: HDFS-7188-branch-HDFS-6994-0.patch, 
> HDFS-7188-branch-HDFS-6994-1.patch
>
>
> libhdfs3 should work on windows





[jira] [Commented] (HDFS-7496) Fix FsVolume removal race conditions on the DataNode

2014-12-29 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260446#comment-14260446
 ] 

Colin Patrick McCabe commented on HDFS-7496:


Thanks for looking at this.

{code}
@@ -125,6 +123,7 @@
 
   private boolean syncOnClose;
   private long restartBudget;
+  private boolean hasReference = false;
 
   /**
    * for replaceBlock response
@@ -221,6 +220,11 @@
           " while receiving block " + block + " from " + inAddr);
     }
   }
+  if (replicaInfo instanceof ReplicaInfo) {
+    // Hold a reference to protect IOs on the streams.
+    ((ReplicaInfo) replicaInfo).getVolume().reference();
+    hasReference = true;
+  }
{code}
A few comments:
* Rather than having a {{boolean hasReference}}, let's have an actual pointer 
to the {{FsVolumeSpi}} object.  That makes it clear that we can access the 
volume whenever we want, because we're holding a refcount.
* We should release the reference count in {{close()}}, not in the finally 
block, I think.  This is consistent with how we release the streams and so 
forth.
* We don't need this code any more:

{code}
if (lastPacketInBlock) {
  // Finalize the block and close the block file
  try {
    finalizeBlock(startTime);
  } catch (ReplicaNotFoundException e) {
    // Verify that the exception is due to volume removal.
    FsVolumeSpi volume;
    synchronized (datanode.data) {
      volume = datanode.data.getVolume(block);
    }
    if (volume == null) {
      // The ReplicaInfo has been removed because the corresponding data
      // volume has been removed. No need to check for disk error.
      LOG.info(myString
          + ": BlockReceiver is interrupted because the block pool "
          + block.getBlockPoolId() + " has been removed.", e);
      sendAckUpstream(ack, expected, totalAckTimeNanos, 0,
          Status.OOB_INTERRUPTED);
      running = false;
      receiverThread.interrupt();
      continue;
    }
    throw e;
  }
}
{code}
Because it will no longer be possible for the volume to go away while we're 
using it, we can get rid of that whole code block in the "catch" block, right?

{code}
   @Override
-  public synchronized void removeVolumes(Collection<StorageLocation> volumes) {
-    Set<File> volumeSet = new HashSet<File>();
+  public synchronized void removeVolumes(Collection<StorageLocation> volumes)
+      throws IOException {
+    Set<String> volumeSet = new HashSet<>();
     for (StorageLocation sl : volumes) {
-      volumeSet.add(sl.getFile());
+      volumeSet.add(sl.getFile().getCanonicalPath());
     }
     for (int idx = 0; idx < dataStorage.getNumStorageDirs(); idx++) {
       Storage.StorageDirectory sd = dataStorage.getStorageDir(idx);
-      if (volumeSet.contains(sd.getRoot())) {
+      if (volumeSet.contains(sd.getRoot().getCanonicalPath())) {
{code}
This change seems unrelated to this JIRA... am I missing something?  Also, as 
I've said in the past, I'm strongly against {{removeVolumes}} throwing an 
{{IOException}}.  I don't see how the code is supposed to proceed if removal 
fails with an exception.

{code}
DataNode getDatanode() {
  return datanode;
}
{code}
This isn't necessary, since {{FsDatasetImpl#datanode}} already has 
package-private access and this accessor has the same level of access.

{code}
if (dataset.getDatanode() == null) {
  // FsVolumeImpl is used in test.
  return null;
}
{code}
How about using {{Preconditions.checkNotNull}} here... might look nicer.
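
i.e. something like the following (a sketch; assumes Guava's {{Preconditions}}, 
which Hadoop already uses elsewhere):
{code}
// requires: import com.google.common.base.Preconditions;
// Sketch: fail fast with a clear message instead of silently returning null.
DataNode datanode = Preconditions.checkNotNull(dataset.getDatanode(),
    "dataset has no DataNode (FsVolumeImpl constructed outside a DataNode?)");
{code}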

{code}
File createRbwFile(String bpid, Block b) throws IOException {
  reference();
  try {
    reserveSpaceForRbw(b.getNumBytes());
    return getBlockPoolSlice(bpid).createRbwFile(b);
  } finally {
    unreference();
  }
}
{code}
This is kind of a weird approach, having the volume increment reference counts 
on itself.  What I was envisioning was having {{getNextVolume}} increment the 
reference count when it retrieved the volume, and having the caller decrement 
the reference count after the caller was done with the volume object.
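
To sketch what I mean (illustrative only; the names below are stand-ins for 
{{FsVolumeSpi}} and friends, not the real API):
{code}
// Minimal, self-contained sketch of acquire-on-getNextVolume /
// release-by-caller. All names here are illustrative.
interface RefCountedVolume {
  void reference();    // increment refcount; removal is blocked while held
  void unreference();  // decrement refcount; removal may proceed at zero
  void write(byte[] data) throws java.io.IOException;
}

class VolumeChooser {
  private final java.util.List<RefCountedVolume> volumes;
  private int next = 0;

  VolumeChooser(java.util.List<RefCountedVolume> volumes) {
    this.volumes = volumes;
  }

  // getNextVolume-style selection: take the refcount on behalf of the caller.
  synchronized RefCountedVolume getNextVolume() {
    RefCountedVolume v = volumes.get(next++ % volumes.size());
    v.reference();
    return v;
  }
}

class VolumeUser {
  static void writeBlock(VolumeChooser chooser, byte[] data)
      throws java.io.IOException {
    RefCountedVolume v = chooser.getNextVolume();
    try {
      v.write(data);    // safe: the volume cannot be removed while held
    } finally {
      v.unreference();  // the caller releases when done with the volume
    }
  }
}
{code}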

> Fix FsVolume removal race conditions on the DataNode 
> -
>
> Key: HDFS-7496
> URL: https://issues.apache.org/jira/browse/HDFS-7496
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7496.000.patch
>
>
> We discussed a few FsVolume removal race conditions on the DataNode in 
> HDFS-7489.  We should figure out a way to make r

[jira] [Commented] (HDFS-7337) Configurable and pluggable Erasure Codec and schema

2014-12-29 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260441#comment-14260441
 ] 

Zhe Zhang commented on HDFS-7337:
-

Great work [~drankye]! I went over the design and have the following comments:
# I like the idea of creating an {{ec}} package under 
{{org.apache.hadoop.hdfs}}. It is a good place to host all codec classes.
# I think the {{ec}} package should focus on codec calculation on a per-packet 
unit. Below is how I think the functions should be logically divided (see also 
the sketch after this list):
#* The {{ErasureCodec}} interface simply provides encode and decode functions 
that take a {{byte[][]}} and produce another {{byte[][]}}. It should be 
*unaware* of blocks. For example, I imagine our encode function should look 
similar to Jerasure's 
(https://github.com/tsuraan/Jerasure/blob/master/Manual.pdf): 
{code} void jerasure_matrix_encode(k, m, w, matrix, data_ptrs, coding_ptrs, 
size) {code}
#* {{BlockGroups}} should be formed by {{ECManager}}, which in doing so calls 
the encode and decode functions from {{ErasureCodec}}.
# Logically, {{BlockGroup}} is applicable even without EC, because striping can 
be done without EC. So an alternative is to put it in the {{protocol}} package.
# I don't think we should reference the schema through a name, since that 
wastes space and is fragile. We should look at how other configurable policies 
(e.g., the block placement algorithm) are loaded. IIRC a factory class is 
used.
# It's great that we are considering LRC in advance. However, with LEGAL-211 
pending, I suggest we keep {{BlockGroup}} simpler for now. For example, it can 
contain only {{dataBlocks}} and {{parityBlocks}}. When we implement LRC we can 
subclass or extend it.
# I guess {{ECBlock}} is for testing purposes? An erasure-coded block should 
have all the properties of a regular block. I think we can just add a couple of 
flags to the {{Block}} class.
# It's not quite clear to me why we need {{ErasureCoderCallback}}. Is it for 
async codec calculation? If codec calculations are done on small packets, I 
think synchronous operations are fine.
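
For concreteness, the block-unaware codec interface I am imagining would look 
roughly like this (a sketch only, with assumed shapes; not a proposed final 
API):
{code}
// Illustrative only: the codec sees byte[][] buffers, never blocks.
// k data buffers in, m parity buffers out; all buffers share one cell size.
interface ErasureCodec {
  // Encode k data buffers into m parity buffers.
  byte[][] encode(byte[][] data);

  // Reconstruct the erased buffers from the survivors; a null entry in
  // 'buffers' marks an erasure, and erasedIndexes lists their positions.
  byte[][] decode(byte[][] buffers, int[] erasedIndexes);
}
{code}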

Thanks!

> Configurable and pluggable Erasure Codec and schema
> ---
>
> Key: HDFS-7337
> URL: https://issues.apache.org/jira/browse/HDFS-7337
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Kai Zheng
> Attachments: HDFS-7337-prototype-v1.patch, 
> HDFS-7337-prototype-v2.zip, HDFS-7337-prototype-v3.zip, 
> PluggableErasureCodec.pdf
>
>
> According to HDFS-7285 and its design, this JIRA considers supporting 
> multiple erasure codecs via a pluggable approach. It allows defining and 
> configuring multiple codec schemas with different coding algorithms and 
> parameters. The resulting codec schemas can be utilized and specified via a 
> command tool for different file folders. While designing and implementing 
> such a pluggable framework, a concrete codec (Reed-Solomon) should also be 
> implemented by default to prove the framework is useful and workable. A 
> separate JIRA could be opened for the RS codec implementation.
> Note that HDFS-7353 will focus on the very low-level codec API and 
> implementation to make concrete vendor libraries transparent to the upper 
> layer. This JIRA focuses on the high-level pieces that interact with 
> configuration, schemas, etc.





[jira] [Commented] (HDFS-7188) support build libhdfs3 on windows

2014-12-29 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260439#comment-14260439
 ] 

Colin Patrick McCabe commented on HDFS-7188:


Thanks for sticking with this.  The new patch looks better... good work.

I don't understand why you changed this to depend on krb5_32 instead of krb5; 
can you explain? It seems like we might not want that change for UNIX if it 
makes us use a 32-bit version of the library (is that what it does?).

Also, I would prefer not to edit ENABLE_BOOST in this JIRA if possible.

{{hadoop-hdfs-project/hadoop-hdfs/src/contrib/libhdfs3/README.txt}}: It seems a 
little odd to provide instructions only for Windows.  Perhaps let's do this in 
a follow-on JIRA when we can write a full README.  I also don't think we need 
to explain how to use "git diff"... that's explained in our HowToContribute 
entry in the wiki.

{code}
#define THREAD_LOCAL __thread
{code}
This doesn't work on MacOS X.  It's fine for us to deal with that in a 
follow-on JIRA, but we should be aware of it.  I mention this because I see you 
have some text about MacOS elsewhere in this file.

I would prefer that we rename {{platform.h.in}} to {{build.h.in}}, and have 
platform-specific stuff in normal header files like {{windows/platform.h}} and 
{{posix/platform.h}}.  It seems like there is no need to have cmake generate 
the platform-specific headers... the only purpose of having the .h.in files is 
to pass along CMake settings, but those should operate the same on each 
platform, right?

{code}
// Code is copied from here
// http://code.google.com/p/mman-win32/
// MIT License
{code}
I can see that this person's Google Code landing page says "BSD license," but 
it doesn't explain which BSD license it is (2-clause, 3-clause, etc.)  Can we 
either reimplement this (I would prefer this), or get this guy to add a 
LICENSE.txt with the license to his repo?  Or did I miss somewhere where he put 
the full license text?

{{InputStreamImpl.cc}}: there's still a huge #ifdefed lump of Windows-specific 
code calling {{GetAdaptersAddresses}} here.  Let's move this to a 
platform-specific file.  It seems like we should just move the whole function 
{{unordered_set BuildLocalAddrSet()}} into the platform directory.

{code}
#ifdef _WIN32
// On Windows, vsnprintf(NULL, 0, fmt, ap) returns -1.
// Hence use _vscprintf to get the required size instead.
int size = _vscprintf(fmt, ap);
#else
int size = vsnprintf(NULL, 0, fmt, ap);
#endif
{code}
{code}
Rather than scattering this kind of ifdef around, I would prefer that we 
implement a working version of vsnprintf for Windows in the platform 
directory.  It shouldn't be too hard, right?  Then everyone could call 
{{platform_vsnprintf}} and get the same result on each platform.

One thing that might help you make progress here is to split off some of this 
into smaller patches.  For example, I can see a few of your changes just add 
additional standard header files.  These changes should be easy to get in very 
quickly.  Perhaps create a jira like "add additional header includes needed by 
windows" and just add that part.  Then I can get that in right away.

> support build libhdfs3 on windows
> -
>
> Key: HDFS-7188
> URL: https://issues.apache.org/jira/browse/HDFS-7188
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
> Environment: Windows System, Visual Studio 2010
>Reporter: Zhanwei Wang
>Assignee: Thanh Do
> Attachments: HDFS-7188-branch-HDFS-6994-0.patch, 
> HDFS-7188-branch-HDFS-6994-1.patch
>
>
> libhdfs3 should work on windows





[jira] [Commented] (HDFS-7556) HardLink.java should use the jdk7 createLink method

2014-12-29 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260425#comment-14260425
 ] 

Colin Patrick McCabe commented on HDFS-7556:


+1.  Thanks, Akira.

> HardLink.java should use the jdk7 createLink method
> ---
>
> Key: HDFS-7556
> URL: https://issues.apache.org/jira/browse/HDFS-7556
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
>Assignee: Akira AJISAKA
> Attachments: HDFS-7556-001.patch
>
>
> Now that we are using jdk7, HardLink.java should use the jdk7 createLink 
> method rather than our shell commands or JNI methods.
> Note that we cannot remove all of the JNI / shell commands unless we remove 
> the code which checks the link count, something that jdk7 doesn't provide 
> (at least, I don't think it does).





[jira] [Commented] (HDFS-6994) libhdfs3 - A native C/C++ HDFS client

2014-12-29 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260417#comment-14260417
 ] 

Colin Patrick McCabe commented on HDFS-6994:


bq. Suresh wrote: Coding guidelines - we should adopt one. We could perhaps 
start with the Google C++ guide.

+1 for adopting the Google C++ coding style.

bq. As we have started implementing a large code base in C/C++, we need unit 
tests. We should not be committing code without unit tests. We could adopt 
Google Test.

There are some open JIRAs for adding gtest tests to libhdfs3... HDFS-7019 and 
HDFS-7020.  Perhaps we should prioritize those JIRAs and get them in soon, so 
that we can start attaching tests to all the patches we post, as we normally 
do.  [~wangzw], can you update the patches on those JIRAs so that we can get 
the unit test infrastructure in?  This should address Suresh's comment, I think.

bq. Zhanwei wrote: Yes, it would be better to share the same code, but I think 
that is better done after resolving this JIRA. I do not think it will block 
your work; go ahead with the current code and we can separate it later.

Yeah, let's work on the YARN part once this part is complete.  One step at a 
time.

> libhdfs3 - A native C/C++ HDFS client
> -
>
> Key: HDFS-6994
> URL: https://issues.apache.org/jira/browse/HDFS-6994
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-6994-rpc-8.patch, HDFS-6994.patch
>
>
> Hi all,
> I just got permission to open source libhdfs3, a native C/C++ HDFS client 
> based on the Hadoop RPC protocol and the HDFS Data Transfer Protocol.
> libhdfs3 provides a libhdfs-style C interface and a C++ interface. It 
> supports both Hadoop RPC versions 8 and 9, Namenode HA, and Kerberos 
> authentication.
> libhdfs3 is currently used by Pivotal's HAWQ.
> I'd like to integrate libhdfs3 into the HDFS source code to benefit others.
> You can find the libhdfs3 code on GitHub:
> https://github.com/PivotalRD/libhdfs3
> http://pivotalrd.github.io/libhdfs3/





[jira] [Commented] (HDFS-7573) Consolidate the implementation of delete() into a single class

2014-12-29 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260383#comment-14260383
 ] 

Charles Lamb commented on HDFS-7573:


Hi [~wheat9],

Thanks for working on this.

What happened to the enforcePermission check? It looks like it disappeared.

Please update all of the @param tags in the javadoc.

In general, it would be good to set your IDE to indent 2 spaces for new lines 
and 4 spaces for line continuations.


> Consolidate the implementation of delete() into a single class
> --
>
> Key: HDFS-7573
> URL: https://issues.apache.org/jira/browse/HDFS-7573
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-7573.000.patch
>
>
> This jira proposes to consolidate the implementation of delete() in 
> {{FSNamesystem}} and {{FSDirectory}} into a single class.





[jira] [Updated] (HDFS-7558) NFS gateway should throttle the data dumped on local storage

2014-12-29 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-7558:
-
Affects Version/s: 2.7.0

> NFS gateway should throttle the data dumped on local storage
> 
>
> Key: HDFS-7558
> URL: https://issues.apache.org/jira/browse/HDFS-7558
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.7.0
>Reporter: Brandon Li
>Assignee: Arpit Agarwal
>
> During file uploads, the NFS gateway may dump reordered writes to local 
> storage when the accumulated data size exceeds a limit.
> Currently there is no throttle on this data dumping, which can easily 
> saturate the local disk, especially when the client is on the same host as 
> the gateway.
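
A minimal sketch of the kind of throttle being asked for (illustrative only; 
the class, method names, and the limit are assumptions, not the gateway's 
actual code):
{code}
// Illustrative throttle: block dump requests while too many dumped bytes
// are outstanding on the local disk. Not the actual NFS gateway code.
class DumpThrottle {
  private final long maxOutstandingBytes;
  private long outstandingBytes = 0;

  DumpThrottle(long maxOutstandingBytes) {
    this.maxOutstandingBytes = maxOutstandingBytes;
  }

  // Called before dumping 'len' bytes; waits until we are under the limit.
  synchronized void beforeDump(long len) throws InterruptedException {
    while (outstandingBytes + len > maxOutstandingBytes) {
      wait();
    }
    outstandingBytes += len;
  }

  // Called after dumped bytes have been re-read and deleted from disk.
  synchronized void afterReclaim(long len) {
    outstandingBytes -= len;
    notifyAll();
  }
}
{code}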





[jira] [Commented] (HDFS-7523) Setting a socket receive buffer size in DFSClient

2014-12-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260254#comment-14260254
 ] 

stack commented on HDFS-7523:
-

The Findbugs warning is not related to this patch.

Before I commit: are you running this in production [~xieliang007], by chance?  
If so, any noticeable improvement?  Thanks, boss.

> Setting a socket receive buffer size in DFSClient
> -
>
> Key: HDFS-7523
> URL: https://issues.apache.org/jira/browse/HDFS-7523
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient
>Affects Versions: 2.6.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HDFS-7523-001 (1).txt, HDFS-7523-001 (1).txt, 
> HDFS-7523-001.txt, HDFS-7523-001.txt, HDFS-7523-001.txt
>
>
> It would be nice to set a socket receive buffer size when creating the 
> socket, from the client (HBase) point of view. In older versions this would 
> be in DFSInputStream; in trunk it seems it should be at:
> {code}
>   @Override // RemotePeerFactory
>   public Peer newConnectedPeer(InetSocketAddress addr,
>   Token blockToken, DatanodeID datanodeId)
>   throws IOException {
> Peer peer = null;
> boolean success = false;
> Socket sock = null;
> try {
>   sock = socketFactory.createSocket();
>   NetUtils.connect(sock, addr,
> getRandomLocalInterfaceAddr(),
> dfsClientConf.socketTimeout);
>   peer = TcpPeerServer.peerFromSocketAndKey(saslClient, sock, this,
>   blockToken, datanodeId);
>   peer.setReadTimeout(dfsClientConf.socketTimeout);
> {code}
> e.g.: sock.setReceiveBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE);
> The default socket receive buffer size on Linux+JDK7 seems to be 8k, if I am 
> not wrong. That value is sometimes too small for HBase reading 64k blocks on 
> a 10G network (at least, it causes more system calls).
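
A minimal, self-contained illustration of the suggested tweak (the 64 KB value 
is only an example; the report suggests 
{{HdfsConstants.DEFAULT_DATA_SOCKET_SIZE}}):
{code}
import java.net.InetSocketAddress;
import java.net.Socket;

// Sketch: set the receive buffer before connecting so it can influence
// the TCP window negotiated for the connection.
class ReceiveBufferExample {
  static Socket connect(InetSocketAddress addr, int timeoutMs)
      throws java.io.IOException {
    Socket sock = new Socket();
    sock.setReceiveBufferSize(64 * 1024); // example value only
    sock.connect(addr, timeoutMs);
    return sock;
  }
}
{code}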





[jira] [Assigned] (HDFS-6662) [ UI ] Not able to open file from UI if file path contains "%"

2014-12-29 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned HDFS-6662:
--

Assignee: Gerson Carlos

> [ UI ] Not able to open file from UI if file path contains "%"
> --
>
> Key: HDFS-6662
> URL: https://issues.apache.org/jira/browse/HDFS-6662
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.1
>Reporter: Brahma Reddy Battula
>Assignee: Gerson Carlos
>Priority: Critical
> Attachments: hdfs-6662.001.patch, hdfs-6662.002.patch, hdfs-6662.patch
>
>
> 1. Write a file into HDFS in such a way that the file name is like 1%2%3%4.
> 2. Browse the file using the NameNode UI.
> The following exception is thrown:
> "Path does not exist on HDFS or WebHDFS is disabled. Please check your path 
> or enable WebHDFS"
> HBase writes its WAL file data in HDFS using file names that contain %,
> e.g.: 
> /hbase/WALs/HOST-,60020,1404731504691/HOST-***-130%2C60020%2C1404731504691.1404812663950.meta
>  
> The above file does not open in the UI.
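
To illustrate the mechanics (a sketch, not the patch): a raw '%' in a file 
name must itself be percent-encoded before the name is embedded in a WebHDFS 
URL; otherwise sequences like "%2%" are malformed escapes and the lookup fails:
{code}
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch of the encoding issue. URLEncoder is used here only to show the
// idea; the actual fix may encode differently (e.g. path-segment encoding).
class PercentPathExample {
  public static void main(String[] args) throws UnsupportedEncodingException {
    String name = "1%2%3%4";
    String encoded = URLEncoder.encode(name, "UTF-8"); // "1%252%253%254"
    // Unencoded, "%2%" looks like a malformed percent-escape to the server.
    System.out.println("/webhdfs/v1/" + encoded + "?op=OPEN");
  }
}
{code}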





[jira] [Commented] (HDFS-6994) libhdfs3 - A native C/C++ HDFS client

2014-12-29 Thread Zhanwei Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259932#comment-14259932
 ] 

Zhanwei Wang commented on HDFS-6994:


Hi [~decster],

Yes, it would be better to share the same code, but I think that is better 
done after resolving this JIRA. I do not think it will block your work; go 
ahead with the current code and we can separate it later.

> libhdfs3 - A native C/C++ HDFS client
> -
>
> Key: HDFS-6994
> URL: https://issues.apache.org/jira/browse/HDFS-6994
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-6994-rpc-8.patch, HDFS-6994.patch
>
>
> Hi all,
> I just got permission to open source libhdfs3, a native C/C++ HDFS client 
> based on the Hadoop RPC protocol and the HDFS Data Transfer Protocol.
> libhdfs3 provides a libhdfs-style C interface and a C++ interface. It 
> supports both Hadoop RPC versions 8 and 9, Namenode HA, and Kerberos 
> authentication.
> libhdfs3 is currently used by Pivotal's HAWQ.
> I'd like to integrate libhdfs3 into the HDFS source code to benefit others.
> You can find the libhdfs3 code on GitHub:
> https://github.com/PivotalRD/libhdfs3
> http://pivotalrd.github.io/libhdfs3/





[jira] [Commented] (HDFS-6994) libhdfs3 - A native C/C++ HDFS client

2014-12-29 Thread Zhanwei Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259928#comment-14259928
 ] 

Zhanwei Wang commented on HDFS-6994:


Hi [~sureshms],

Good point. Actually, libhdfs3 uses Google Test as its test framework and has 
implemented both unit tests and functional tests. This JIRA and all its 
sub-JIRAs were created for code review; we have separate JIRAs to review the 
test cases. We could put the code and its test cases together for review, but 
1) it would make the patches larger and harder to review; 2) the test cases 
will be modified as the source code changes.

Before this JIRA is merged to the master branch, all source code and test 
cases should be reviewed, and they should be merged to main together.

If you are interested in the test cases, please check out the code from 
GitHub: https://github.com/PivotalRD/libhdfs3

> libhdfs3 - A native C/C++ HDFS client
> -
>
> Key: HDFS-6994
> URL: https://issues.apache.org/jira/browse/HDFS-6994
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-6994-rpc-8.patch, HDFS-6994.patch
>
>
> Hi all,
> I just got permission to open source libhdfs3, a native C/C++ HDFS client 
> based on the Hadoop RPC protocol and the HDFS Data Transfer Protocol.
> libhdfs3 provides a libhdfs-style C interface and a C++ interface. It 
> supports both Hadoop RPC versions 8 and 9, Namenode HA, and Kerberos 
> authentication.
> libhdfs3 is currently used by Pivotal's HAWQ.
> I'd like to integrate libhdfs3 into the HDFS source code to benefit others.
> You can find the libhdfs3 code on GitHub:
> https://github.com/PivotalRD/libhdfs3
> http://pivotalrd.github.io/libhdfs3/





[jira] [Commented] (HDFS-7018) Implement C interface for libhdfs3

2014-12-29 Thread Zhanwei Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259921#comment-14259921
 ] 

Zhanwei Wang commented on HDFS-7018:


The new patch fixes all the issues pointed out by [~cmccabe] in the comments 
above.

> Implement C interface for libhdfs3
> --
>
> Key: HDFS-7018
> URL: https://issues.apache.org/jira/browse/HDFS-7018
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-7018-pnative.002.patch, 
> HDFS-7018-pnative.003.patch, HDFS-7018.patch
>
>
> Implement C interface for libhdfs3





[jira] [Updated] (HDFS-7018) Implement C interface for libhdfs3

2014-12-29 Thread Zhanwei Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhanwei Wang updated HDFS-7018:
---
Attachment: HDFS-7018-pnative.003.patch

> Implement C interface for libhdfs3
> --
>
> Key: HDFS-7018
> URL: https://issues.apache.org/jira/browse/HDFS-7018
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-7018-pnative.002.patch, 
> HDFS-7018-pnative.003.patch, HDFS-7018.patch
>
>
> Implement C interface for libhdfs3


