[jira] [Commented] (HADOOP-11505) hadoop-mapreduce-client-nativetask fails to use x86 optimizations in some cases

2015-05-15 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545322#comment-14545322
 ] 

Binglin Chang commented on HADOOP-11505:


Yes, that's what HADOOP-11665 did

 hadoop-mapreduce-client-nativetask fails to use x86 optimizations in some 
 cases
 ---

 Key: HADOOP-11505
 URL: https://issues.apache.org/jira/browse/HADOOP-11505
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
  Labels: BB2015-05-TBR
 Attachments: HADOOP-11505.001.patch


 hadoop-mapreduce-client-nativetask fails to use x86 optimizations in some 
 cases.  Also, on some alternate, non-x86, non-ARM architectures the generated 
 code is incorrect.  Thanks to Steve Loughran and Edward Nevill for finding 
 this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11665) Provide and unify cross platform byteorder support in native code

2015-03-12 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359756#comment-14359756
 ] 

Binglin Chang commented on HADOOP-11665:


Looks like [~Ayappan] marked this as a blocker; maybe he requires this as a bugfix?


 Provide and unify cross platform byteorder support in native code
 -

 Key: HADOOP-11665
 URL: https://issues.apache.org/jira/browse/HADOOP-11665
 Project: Hadoop Common
  Issue Type: Bug
  Components: util
Affects Versions: 2.4.1, 2.6.0
 Environment: PowerPC Big Endian & other Big Endian platforms
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Blocker
 Attachments: HADOOP-11665.001.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11665) Provide and unify cross platform byteorder support in native code

2015-03-12 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-11665:
---
Priority: Minor  (was: Blocker)

 Provide and unify cross platform byteorder support in native code
 -

 Key: HADOOP-11665
 URL: https://issues.apache.org/jira/browse/HADOOP-11665
 Project: Hadoop Common
  Issue Type: Bug
  Components: util
Affects Versions: 2.4.1, 2.6.0
 Environment: PowerPC Big Endian & other Big Endian platforms
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HADOOP-11665.001.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-10846) DataChecksum#calculateChunkedSums not working for PPC when buffers not backed by array

2015-03-03 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344759#comment-14344759
 ] 

Binglin Chang commented on HADOOP-10846:


Hi Ayappan, thanks for the patch. When validating it on macosx, I got a compile 
error like:

{code}
 [exec] Building C object 
CMakeFiles/hadoop.dir/main/native/src/org/apache/hadoop/util/bulk_crc32.c.o
 [exec] /usr/bin/cc -Dhadoop_EXPORTS -g -Wall -O2 -D_REENTRANT -D_GNU_SOURCE 
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -isysroot 
/Applications/Xcode.app/Contents/Develo[...]
 [exec] /Volumes/SSD/projects/hadoop-trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/util/bulk_crc32.c:34:10: fatal error: 
'byteswap.h' file not found
 [exec] #include <byteswap.h>
 [exec]          ^
 [exec] 1 error generated.
 [exec] make[2]: *** 
[CMakeFiles/hadoop.dir/main/native/src/org/apache/hadoop/util/bulk_crc32.c.o] 
Error 1
 [exec] make[1]: *** [CMakeFiles/hadoop.dir/all] Error 2
 [exec] make: *** [all] Error 2
{code}

I think a more standard way of handling byte-order stuff is needed (not only in 
this jira), like Google does in many of its open-sourced code bases:

https://github.com/google/flatbuffers/blob/master/include/flatbuffers/flatbuffers.h
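For example, a portable dispatch in the spirit of that flatbuffers header could look roughly like this (a sketch only; the macro guards and fallback are illustrative, not the actual patch):

```c
#include <stdint.h>

/* byteswap.h is glibc-specific, so other platforms need their own swap:
 * macosx ships libkern/OSByteOrder.h, and anything else can fall back to
 * plain bit shifts, which modern compilers turn into a bswap instruction. */
#if defined(__linux__)
#include <byteswap.h>               /* provides bswap_32 */
#elif defined(__APPLE__)
#include <libkern/OSByteOrder.h>
#define bswap_32(x) OSSwapInt32(x)
#else
static inline uint32_t bswap_32(uint32_t x) {
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
}
#endif
```

Whichever branch is taken, the result always swaps all four bytes, so callers see one portable name.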

 

 DataChecksum#calculateChunkedSums not working for PPC when buffers not backed 
 by array
 --

 Key: HADOOP-10846
 URL: https://issues.apache.org/jira/browse/HADOOP-10846
 Project: Hadoop Common
  Issue Type: Bug
  Components: util
Affects Versions: 2.4.1, 2.5.2
 Environment: PowerPC platform
Reporter: Jinghui Wang
Assignee: Ayappan
 Attachments: HADOOP-10846-v1.patch, HADOOP-10846-v2.patch, 
 HADOOP-10846-v3.patch, HADOOP-10846-v4.patch, HADOOP-10846.patch


 Got the following exception when running Hadoop on PowerPC. The 
 implementation for computing the checksum is broken when the data buffer and 
 checksum buffer are not backed by arrays.
 13/09/16 04:06:57 ERROR security.UserGroupInformation: 
 PriviledgedActionException as:biadmin (auth:SIMPLE) 
 cause:org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
 org.apache.hadoop.fs.ChecksumException: Checksum error



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11665) Provide and unify cross platform byteorder support in native code

2015-03-03 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-11665:
---
Assignee: Binglin Chang
  Status: Patch Available  (was: Open)

 Provide and unify cross platform byteorder support in native code
 -

 Key: HADOOP-11665
 URL: https://issues.apache.org/jira/browse/HADOOP-11665
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11665) Provide and unify cross platform byteorder support in native code

2015-03-03 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344983#comment-14344983
 ] 

Binglin Chang commented on HADOOP-11665:


Added links to the related jiras.

 Provide and unify cross platform byteorder support in native code
 -

 Key: HADOOP-11665
 URL: https://issues.apache.org/jira/browse/HADOOP-11665
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang
 Attachments: HADOOP-11665.001.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11665) Provide and unify cross platform byteorder support in native code

2015-03-03 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-11665:
---
Attachment: HADOOP-11665.001.patch

The idea is mostly borrowed from 
https://github.com/google/flatbuffers/blob/master/include/flatbuffers/flatbuffers.h

Compilation passes on macosx and ubuntu; I have no PPC environment to test this on.


 Provide and unify cross platform byteorder support in native code
 -

 Key: HADOOP-11665
 URL: https://issues.apache.org/jira/browse/HADOOP-11665
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang
 Attachments: HADOOP-11665.001.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11665) Provide and unify cross platform byteorder support in native code

2015-03-03 Thread Binglin Chang (JIRA)
Binglin Chang created HADOOP-11665:
--

 Summary: Provide and unify cross platform byteorder support in 
native code
 Key: HADOOP-11665
 URL: https://issues.apache.org/jira/browse/HADOOP-11665
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11505) hadoop-mapreduce-client-nativetask fails to use x86 optimizations in some cases

2015-01-22 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288833#comment-14288833
 ] 

Binglin Chang commented on HADOOP-11505:


Hi Colin, thanks for working on this. More background about this issue:
1. The nativetask code is only optimized for x86_64, so some function names are 
not ideal; e.g. the name bswap is a little confusing, as it is actually used for 
ntoh purposes.
2. So on all big-endian arches, bswap should be a no-op.
3. I guess inline assembly was used because very old compilers (like the gcc3 I 
used back at Baidu) don't optimize a bit-shift-style ntoh function into a bswap 
instruction, but most compilers do that now, so I am not sure the inline 
assembly is needed any more.

So I think a cleaner way to fix this is to define ntoh32 and ntoh64 properly 
(bit shifts rather than assembly should be OK for modern compilers) based on 
the BYTE_ORDER macro, and replace all uses of bswap.
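A sketch of what such shift-based helpers could look like (illustrative only, not the actual patch; the __BYTE_ORDER__ guard is one common compiler-provided way to detect endianness):

```c
#include <stdint.h>

/* On big-endian builds host order already equals network order, so these
 * are no-ops; on little-endian builds the shift form is recognized by
 * modern gcc/clang and compiled down to a single bswap/rev instruction. */
static inline uint32_t ntoh32(uint32_t v) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return v;
#else
    return (v >> 24) | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u) | (v << 24);
#endif
}

static inline uint64_t ntoh64(uint64_t v) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return v;
#else
    /* swap each 32-bit half, then exchange the halves */
    return ((uint64_t)ntoh32((uint32_t)v) << 32)
         | (uint64_t)ntoh32((uint32_t)(v >> 32));
#endif
}
```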

 hadoop-mapreduce-client-nativetask fails to use x86 optimizations in some 
 cases
 ---

 Key: HADOOP-11505
 URL: https://issues.apache.org/jira/browse/HADOOP-11505
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-11505.001.patch


 hadoop-mapreduce-client-nativetask fails to use x86 optimizations in some 
 cases.  Also, on some alternate, non-x86, non-ARM architectures the generated 
 code is incorrect.  Thanks to Steve Loughran and Edward Nevill for finding 
 this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11154) Update BUILDING.txt to state that CMake 3.0 or newer is required on Mac.

2014-09-29 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152747#comment-14152747
 ] 

Binglin Chang commented on HADOOP-11154:


Hi Chris, which problem did you hit when building with cmake 2.6? I am using 
cmake 2.8 on macos 10.9 and can build current trunk successfully.


 Update BUILDING.txt to state that CMake 3.0 or newer is required on Mac.
 

 Key: HADOOP-11154
 URL: https://issues.apache.org/jira/browse/HADOOP-11154
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation, native
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Priority: Trivial
 Attachments: HADOOP-11154.1.patch


 The native code can be built on Mac now, but CMake 3.0 or newer is required.  
 This differs from our minimum stated version of 2.6 in BUILDING.txt.  I'd 
 like to update BUILDING.txt to state that 3.0 or newer is required if 
 building on Mac.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11154) Update BUILDING.txt to state that CMake 3.0 or newer is required on Mac.

2014-09-29 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152814#comment-14152814
 ] 

Binglin Chang commented on HADOOP-11154:


Ooh, java 1.7. I'm currently using java 1.6, so that's why cmake 2.8 works for 
me. Thanks for the explanation; please commit.


 Update BUILDING.txt to state that CMake 3.0 or newer is required on Mac.
 

 Key: HADOOP-11154
 URL: https://issues.apache.org/jira/browse/HADOOP-11154
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation, native
Reporter: Chris Nauroth
Assignee: Chris Nauroth
Priority: Trivial
 Attachments: HADOOP-11154.1.patch


 The native code can be built on Mac now, but CMake 3.0 or newer is required.  
 This differs from our minimum stated version of 2.6 in BUILDING.txt.  I'd 
 like to update BUILDING.txt to state that 3.0 or newer is required if 
 building on Mac.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11132) checkHadoopHome still uses HADOOP_HOME

2014-09-27 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150930#comment-14150930
 ] 

Binglin Chang commented on HADOOP-11132:


Hi Tsuyoshi, thanks for the patch. Grepping through the code, it looks like 
checkHadoopHome is only used in the windows env right now. I am not sure 
windows has changed from HADOOP_HOME to HADOOP_PREFIX yet; at least the .cmd 
files still have plenty of HADOOP_HOME vars. Maybe somebody familiar with the 
hadoop windows version can have a look? Or could it be tested on windows?

 checkHadoopHome still uses HADOOP_HOME
 --

 Key: HADOOP-11132
 URL: https://issues.apache.org/jira/browse/HADOOP-11132
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Allen Wittenauer
 Attachments: HADOOP-11132.1.patch


 It should be using HADOOP_PREFIX.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-10389) Native RPCv9 client

2014-06-17 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033486#comment-14033486
 ] 

Binglin Chang commented on HADOOP-10389:


I'd like to add more input on this:
I wrote a c++ rpc/hdfs/yarn client (https://github.com/decster/libhadoopclient). 
It uses c++11, so it does not need boost (although many people use boost, they 
use it just for header-only libraries, and the public headers do not include 
boost, so there are no version issues).
c++'s main concern is abi compatibility; this can be resolved by using c or 
simple c++ classes in the public headers, hiding the real implementation.
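The opaque-handle approach just described can be sketched in plain C like this (the hc_client names are invented for illustration, not taken from libhadoopclient):

```c
#include <stdlib.h>

/* Public header side: callers see only a forward declaration and functions,
 * so the struct layout can change without breaking the ABI. */
struct hc_client;
struct hc_client *hc_client_new(void);
int  hc_client_rpc_count(const struct hc_client *c);
void hc_client_free(struct hc_client *c);

/* Implementation side: the real layout stays private to this file. */
struct hc_client {
    int rpc_count;   /* ... internal state hidden from callers ... */
};

struct hc_client *hc_client_new(void) {
    return calloc(1, sizeof(struct hc_client));
}

int hc_client_rpc_count(const struct hc_client *c) {
    return c->rpc_count;
}

void hc_client_free(struct hc_client *c) {
    free(c);
}
```

Because callers never see `sizeof(struct hc_client)`, fields can be added in a later release without recompiling client code.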

I think some issues with using c++ / points in favor of c are:
1. centos does not have enough support for c++11; c++11 is not generally 
available yet
2. keeping libhdfs compatibility: since libhdfs is written in c, we might just 
continue using c as well

Also there are some concerns about using c:
1. the protobuf-c library is just not as reliable as the official protobuf 
library, which is maintained and verified by google and many other 
companies/projects. I read some of the protobuf-c code; it uses a 
reflection-style implementation to do serializing/deserializing, so 
performance, security, and compatibility may all be at risk. I see 
https://github.com/protobuf-c/protobuf-c only has 92 stars.
2. malloc/free/memset can easily generate buggy code and need additional care 
and checks; I have seen many bugs of that kind 
recently (HDFS-6534, HADOOP-10640, HADOOP-10706). It is OK to use c, but we may 
need more care and effort.

About JNIFS: why do we need jnifs if we already have nativefs? Using 
dlopen/dlsym to replace the jni apis is not trivial if both the compile-time 
and runtime dependency need to be removed.




 Native RPCv9 client
 ---

 Key: HADOOP-10389
 URL: https://issues.apache.org/jira/browse/HADOOP-10389
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10388.001.patch, 
 HADOOP-10389-alternative.000.patch, HADOOP-10389.002.patch, 
 HADOOP-10389.004.patch, HADOOP-10389.005.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10636) Native Hadoop Client:add unit test case for callclient_id

2014-06-17 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034770#comment-14034770
 ] 

Binglin Chang commented on HADOOP-10636:


lgtm, +1, with minor formatting changes

 Native Hadoop Client:add unit test case for callclient_id
 --

 Key: HADOOP-10636
 URL: https://issues.apache.org/jira/browse/HADOOP-10636
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Wenwu Peng
Assignee: Wenwu Peng
 Attachments: HADOOP-10636-pnative.001.patch, 
 HADOOP-10636-pnative.002.patch, HADOOP-10636-pnative.003.patch, 
 HADOOP-10636-pnative.004.patch, HADOOP-10636-pnative.005.patch, 
 HADOOP-10636-pnative.006.patch, HADOOP-10636-pnative.007.patch, 
 HADOOP-10636-pnative.008.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10636) Native Hadoop Client:add unit test case for callclient_id

2014-06-17 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10636:
---

Attachment: HADOOP-10636-pnative.008-commit.patch

committed. Thanks, Wenwu.

 Native Hadoop Client:add unit test case for callclient_id
 --

 Key: HADOOP-10636
 URL: https://issues.apache.org/jira/browse/HADOOP-10636
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Wenwu Peng
Assignee: Wenwu Peng
 Attachments: HADOOP-10636-pnative.001.patch, 
 HADOOP-10636-pnative.002.patch, HADOOP-10636-pnative.003.patch, 
 HADOOP-10636-pnative.004.patch, HADOOP-10636-pnative.005.patch, 
 HADOOP-10636-pnative.006.patch, HADOOP-10636-pnative.007.patch, 
 HADOOP-10636-pnative.008-commit.patch, HADOOP-10636-pnative.008.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-10636) Native Hadoop Client:add unit test case for callclient_id

2014-06-17 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang resolved HADOOP-10636.


   Resolution: Fixed
Fix Version/s: HADOOP-10388

 Native Hadoop Client:add unit test case for callclient_id
 --

 Key: HADOOP-10636
 URL: https://issues.apache.org/jira/browse/HADOOP-10636
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Wenwu Peng
Assignee: Wenwu Peng
 Fix For: HADOOP-10388

 Attachments: HADOOP-10636-pnative.001.patch, 
 HADOOP-10636-pnative.002.patch, HADOOP-10636-pnative.003.patch, 
 HADOOP-10636-pnative.004.patch, HADOOP-10636-pnative.005.patch, 
 HADOOP-10636-pnative.006.patch, HADOOP-10636-pnative.007.patch, 
 HADOOP-10636-pnative.008-commit.patch, HADOOP-10636-pnative.008.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10699) Fix build native library on mac osx

2014-06-16 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10699:
---

Attachment: HADOOP-10699-common.v3.patch

 Fix build native library on mac osx
 ---

 Key: HADOOP-10699
 URL: https://issues.apache.org/jira/browse/HADOOP-10699
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Kirill A. Korinskiy
Assignee: Binglin Chang
 Attachments: HADOOP-10699-common.v3.patch, 
 HADOOP-9648-native-osx.1.0.4.patch, HADOOP-9648-native-osx.1.1.2.patch, 
 HADOOP-9648-native-osx.1.2.0.patch, 
 HADOOP-9648-native-osx.2.0.5-alpha-rc1.patch, HADOOP-9648.v2.patch


 Some patches for fixing the build of the hadoop native library on os x 10.7/10.8.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10706) fix some bug related to hrpc_sync_ctx

2014-06-16 Thread Binglin Chang (JIRA)
Binglin Chang created HADOOP-10706:
--

 Summary: fix some bug related to hrpc_sync_ctx
 Key: HADOOP-10706
 URL: https://issues.apache.org/jira/browse/HADOOP-10706
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Binglin Chang
Assignee: Binglin Chang


1. 
{code}
memset(ctx, 0, sizeof(ctx));
 return ctx;
{code}

sizeof(ctx) is the size of the pointer, not the struct, so this zeroes only the 
first few bytes; it should be sizeof(*ctx).

2.
hrpc_release_sync_ctx should be renamed to hrpc_proxy_release_sync_ctx; all the 
other functions in this .h/.c file follow that naming rule.
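A minimal demonstration of the sizeof(ctx) vs sizeof(*ctx) difference (the struct layout here is a made-up stand-in for hrpc_sync_ctx):

```c
#include <string.h>

/* Illustrative struct standing in for hrpc_sync_ctx (actual layout differs). */
struct ctx_like { int err; char payload[56]; };

static void fill(struct ctx_like *c) { memset(c, 0xFF, sizeof(*c)); }

/* Buggy: sizeof(ctx) is the size of the POINTER (8 bytes on 64-bit),
 * so only the first few bytes of the struct get zeroed; clang flags
 * this with -Wsizeof-pointer-memaccess. */
static void clear_buggy(struct ctx_like *ctx) {
    memset(ctx, 0, sizeof(ctx));
}

/* Fixed: sizeof(*ctx) is the size of the pointed-to struct. */
static void clear_fixed(struct ctx_like *ctx) {
    memset(ctx, 0, sizeof(*ctx));
}
```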






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10706) fix some bug related to hrpc_sync_ctx

2014-06-16 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10706:
---

Attachment: HADOOP-10706.v1.patch

 fix some bug related to hrpc_sync_ctx
 -

 Key: HADOOP-10706
 URL: https://issues.apache.org/jira/browse/HADOOP-10706
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Binglin Chang
Assignee: Binglin Chang
 Attachments: HADOOP-10706.v1.patch


 1. 
 {code}
 memset(ctx, 0, sizeof(ctx));
  return ctx;
 {code}
 sizeof(ctx) is the size of the pointer, not the struct, so this zeroes only 
 the first few bytes; it should be sizeof(*ctx).
 2.
 hrpc_release_sync_ctx should be renamed to hrpc_proxy_release_sync_ctx; all 
 the other functions in this .h/.c file follow that naming rule.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10668) TestZKFailoverControllerStress#testExpireBackAndForth occasionally fails

2014-06-16 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032269#comment-14032269
 ] 

Binglin Chang commented on HADOOP-10668:


The test failed again in HDFS-5574
https://builds.apache.org/job/PreCommit-HDFS-Build/7127//testReport/org.apache.hadoop.ha/TestZKFailoverControllerStress/testExpireBackAndForth/


 TestZKFailoverControllerStress#testExpireBackAndForth occasionally fails
 

 Key: HADOOP-10668
 URL: https://issues.apache.org/jira/browse/HADOOP-10668
 Project: Hadoop Common
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor
  Labels: test

 From 
 https://builds.apache.org/job/PreCommit-HADOOP-Build/4018//testReport/org.apache.hadoop.ha/TestZKFailoverControllerStress/testExpireBackAndForth/
  :
 {code}
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
   at org.apache.zookeeper.server.DataTree.getData(DataTree.java:648)
   at org.apache.zookeeper.server.ZKDatabase.getData(ZKDatabase.java:371)
   at 
 org.apache.hadoop.ha.MiniZKFCCluster.expireActiveLockHolder(MiniZKFCCluster.java:199)
   at 
 org.apache.hadoop.ha.MiniZKFCCluster.expireAndVerifyFailover(MiniZKFCCluster.java:234)
   at 
 org.apache.hadoop.ha.TestZKFailoverControllerStress.testExpireBackAndForth(TestZKFailoverControllerStress.java:84)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Moved] (HADOOP-10699) Fix build native library on mac osx

2014-06-14 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang moved YARN-2160 to HADOOP-10699:
--

Key: HADOOP-10699  (was: YARN-2160)
Project: Hadoop Common  (was: Hadoop YARN)

 Fix build native library on mac osx
 ---

 Key: HADOOP-10699
 URL: https://issues.apache.org/jira/browse/HADOOP-10699
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Kirill A. Korinskiy
Assignee: Binglin Chang
 Attachments: HADOOP-9648-native-osx.1.0.4.patch, 
 HADOOP-9648-native-osx.1.1.2.patch, HADOOP-9648-native-osx.1.2.0.patch, 
 HADOOP-9648-native-osx.2.0.5-alpha-rc1.patch, HADOOP-9648.v2.patch


 Some patches for fixing the build of the hadoop native library on os x 10.7/10.8.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10700) Fix build on macosx: HDFS parts

2014-06-14 Thread Binglin Chang (JIRA)
Binglin Chang created HADOOP-10700:
--

 Summary: Fix build on macosx: HDFS parts
 Key: HADOOP-10700
 URL: https://issues.apache.org/jira/browse/HADOOP-10700
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor


When compiling native code on macosx using clang, the compiler finds more 
warnings and errors which gcc ignores; those should be fixed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10700) Fix build on macosx: HDFS parts

2014-06-14 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10700:
---

Issue Type: Bug  (was: Sub-task)
Parent: (was: HADOOP-10699)

 Fix build on macosx: HDFS parts
 ---

 Key: HADOOP-10700
 URL: https://issues.apache.org/jira/browse/HADOOP-10700
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor

 When compiling native code on macosx using clang, the compiler finds more 
 warnings and errors which gcc ignores; those should be fixed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10699) Fix build native library on mac osx

2014-06-14 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031628#comment-14031628
 ] 

Binglin Chang commented on HADOOP-10699:


Created HDFS-6534 for the HDFS parts and YARN-2161 for the YARN parts; this 
jira is changed to only cover the Common parts.

 Fix build native library on mac osx
 ---

 Key: HADOOP-10699
 URL: https://issues.apache.org/jira/browse/HADOOP-10699
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Kirill A. Korinskiy
Assignee: Binglin Chang
 Attachments: HADOOP-9648-native-osx.1.0.4.patch, 
 HADOOP-9648-native-osx.1.1.2.patch, HADOOP-9648-native-osx.1.2.0.patch, 
 HADOOP-9648-native-osx.2.0.5-alpha-rc1.patch, HADOOP-9648.v2.patch


 Some patches for fixing the build of the hadoop native library on os x 10.7/10.8.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9648) Fix build native library on mac osx

2014-06-09 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026076#comment-14026076
 ] 

Binglin Chang commented on HADOOP-9648:
---

Hi [~vinodkv] or [~jlowe], I see the latest container-executor changes were 
done by you. Currently the main concern of this jira is the yarn-related native 
code changes, so you seem the right person to ask for help; could you give some 
comments about this?


 Fix build native library on mac osx
 ---

 Key: HADOOP-9648
 URL: https://issues.apache.org/jira/browse/HADOOP-9648
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 1.0.4, 1.2.0, 1.1.2, 2.0.5-alpha
Reporter: Kirill A. Korinskiy
Assignee: Binglin Chang
 Attachments: HADOOP-9648-native-osx.1.0.4.patch, 
 HADOOP-9648-native-osx.1.1.2.patch, HADOOP-9648-native-osx.1.2.0.patch, 
 HADOOP-9648-native-osx.2.0.5-alpha-rc1.patch, HADOOP-9648.v2.patch


 Some patches for fixing the build of the hadoop native library on os x 10.7/10.8.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10640) Implement Namenode RPCs in HDFS native client

2014-06-08 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021649#comment-14021649
 ] 

Binglin Chang commented on HADOOP-10640:


 Isn't sizeof(struct hrpc_proxy) always larger than RPC_PROXY_USERDATA_MAX? 
(The alloc below presumably meant to pass sizeof(struct hrpc_sync_ctx).)
{code}
void *hrpc_proxy_alloc_userdata(struct hrpc_proxy *proxy, size_t size)
{
    if (size > RPC_PROXY_USERDATA_MAX) {
        return NULL;
    }
    return proxy->userdata;
}

struct hrpc_sync_ctx *hrpc_proxy_alloc_sync_ctx(struct hrpc_proxy *proxy)
{
    struct hrpc_sync_ctx *ctx = 
        hrpc_proxy_alloc_userdata(proxy, sizeof(struct hrpc_proxy));
    if (!ctx) {
        return NULL;
    }
    if (uv_sem_init(&ctx->sem, 0)) {
        return NULL;
    }
    memset(ctx, 0, sizeof(ctx));
    return ctx;
}
{code}
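To make the suspected problems concrete, here is a self-contained sketch of the corrected shape (proxy, sync_ctx, and USERDATA_MAX are stand-in types and names, not the real hrpc definitions, and the libuv semaphore is omitted):

```c
#include <stddef.h>
#include <string.h>

#define USERDATA_MAX 64                        /* stand-in for RPC_PROXY_USERDATA_MAX */

struct sync_ctx { int err; char pad[24]; };    /* stand-in for hrpc_sync_ctx */
struct proxy { char userdata[USERDATA_MAX]; }; /* stand-in for hrpc_proxy */

static void *alloc_userdata(struct proxy *p, size_t size) {
    if (size > USERDATA_MAX)
        return NULL;
    return p->userdata;
}

static struct sync_ctx *alloc_sync_ctx(struct proxy *p) {
    /* request sizeof(struct sync_ctx), not sizeof(struct proxy) */
    struct sync_ctx *ctx = alloc_userdata(p, sizeof(struct sync_ctx));
    if (!ctx)
        return NULL;
    memset(ctx, 0, sizeof(*ctx));   /* sizeof(*ctx), not sizeof(ctx) */
    return ctx;
}
```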

 Implement Namenode RPCs in HDFS native client
 -

 Key: HADOOP-10640
 URL: https://issues.apache.org/jira/browse/HADOOP-10640
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: native
Affects Versions: HADOOP-10388
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10640-pnative.001.patch, 
 HADOOP-10640-pnative.002.patch, HADOOP-10640-pnative.003.patch


 Implement the parts of libhdfs that just involve making RPCs to the Namenode, 
 such as mkdir, rename, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10640) Implement Namenode RPCs in HDFS native client

2014-06-05 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018861#comment-14018861
 ] 

Binglin Chang commented on HADOOP-10640:


In hdfs.h:
{code}
#if defined(unix) || defined(__MACH__)
{code}

bq. I don't really like the typedefs. They make it hard to forward-declare 
structures in header files.
I see some methods in ndfs/jnifs use the typedefs and some use the structs; as 
long as they are uniform across the impls it is fine.

 Implement Namenode RPCs in HDFS native client
 -

 Key: HADOOP-10640
 URL: https://issues.apache.org/jira/browse/HADOOP-10640
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: native
Affects Versions: HADOOP-10388
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10640-pnative.001.patch, 
 HADOOP-10640-pnative.002.patch


 Implement the parts of libhdfs that just involve making RPCs to the Namenode, 
 such as mkdir, rename, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10640) Implement Namenode RPCs in HDFS native client

2014-06-03 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016318#comment-14016318
 ] 

Binglin Chang commented on HADOOP-10640:


Thanks for the patch, Colin. I have not finished reviewing; some comments:

CMakeLists.txt: #add_subdirectory(fs): is fs/CMakeLists.txt redundant?
CMakeLists.txt: #include(Libhdfs.cmake): we do not have this file
CMakeLists.txt: -fvisibility=hidden: macosx also supports this (use if 
(${CMAKE_SYSTEM_NAME} MATCHES Darwin) to detect it); you can add this or I can 
add it later.
fs/fs.h:136: use hdfsFile/hdfsFs instead of struct hdfsFile_internal * / struct 
hdfs_internal *?
config.h.cmake HCONF_XML_TEST_PATH: we can set the CLASSPATH env in tests, 
which is better than a static config macro

When compiling (it looks like clang can find more code bugs than gcc...):

should add ${JNI_INCLUDE_DIRS} in include_directories:

In file included from 
/Users/decster/projects/hadoop-trunk/hadoop-native-core/test/native_mini_dfs.c:21:
/Users/decster/projects/hadoop-trunk/hadoop-native-core/jni/exception.h:37:10: 
fatal error: 'jni.h' file not found
#include <jni.h>


should be unified to tTime:

/Users/decster/projects/hadoop-trunk/hadoop-native-core/ndfs/ndfs.c:1055:14: 
warning: incompatible pointer types initializing 'int (*)(struct hdfs_internal 
*, const char *, int64_t, int64_t)' with
  an expression of type 'int (hdfsFS, const char *, tTime, tTime)' 
[-Wincompatible-pointer-types]
.utime = ndfs_utime,
 ^~

wrong memset usage:

/Users/decster/projects/hadoop-trunk/hadoop-native-core/fs/common.c:39:36: 
warning: 'memset' call operates on objects of type 'hdfsFileInfo' (aka 'struct 
file_info') while the size is based on a
  different type 'hdfsFileInfo *' (aka 'struct file_info *') 
[-Wsizeof-pointer-memaccess]
memset(hdfsFileInfo, 0, sizeof(hdfsFileInfo));
   ^~~~
/Users/decster/projects/hadoop-trunk/hadoop-native-core/rpc/proxy.c:102:27: 
warning: 'memset' call operates on objects of type 'struct hrpc_sync_ctx' while 
the size is based on a different type
  'struct hrpc_sync_ctx *' [-Wsizeof-pointer-memaccess]
memset(ctx, 0, sizeof(ctx));
   ~~~^~~




 Implement Namenode RPCs in HDFS native client
 -

 Key: HADOOP-10640
 URL: https://issues.apache.org/jira/browse/HADOOP-10640
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: native
Affects Versions: HADOOP-10388
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10640-pnative.001.patch


 Implement the parts of libhdfs that just involve making RPCs to the Namenode, 
 such as mkdir, rename, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10640) Implement Namenode RPCs in HDFS native client

2014-06-03 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14016319#comment-14016319
 ] 

Binglin Chang commented on HADOOP-10640:


bad format... re submit comments.

{noformat}
CMakeLists.txt: #add_subdirectory(fs) is fs/CMakeLists.txt redundant?
CMakeLists.txt: #include(Libhdfs.cmake) we do not have this file
CMakeLists.txt: -fvisibility=hidden is also supported on Mac OS X (use if 
(${CMAKE_SYSTEM_NAME} MATCHES "Darwin") to detect it); you can add this or I can 
add it later.
fs/fs.h:136 use hdfsFile/hdfsFS instead of struct hdfsFile_internal * / struct 
hdfs_internal *?
config.h.cmake HCONF_XML_TEST_PATH: we can set the CLASSPATH env in tests; it's 
better than a static config macro
{noformat}

when compiling(looks like clang can find more code bugs than gcc...):

{noformat}
should add ${JNI_INCLUDE_DIRS} in include_directories:

In file included from 
/Users/decster/projects/hadoop-trunk/hadoop-native-core/test/native_mini_dfs.c:21:
/Users/decster/projects/hadoop-trunk/hadoop-native-core/jni/exception.h:37:10: 
fatal error: 'jni.h' file not found
#include &lt;jni.h&gt;

{noformat}

should unified to tTime:

{noformat}
/Users/decster/projects/hadoop-trunk/hadoop-native-core/ndfs/ndfs.c:1055:14: 
warning: incompatible pointer types initializing 'int (*)(struct hdfs_internal 
*, const char *, int64_t, int64_t)' with
  an expression of type 'int (hdfsFS, const char *, tTime, tTime)' 
[-Wincompatible-pointer-types]
.utime = ndfs_utime,
 ^~
{noformat}

wrong memset usage:

{noformat}

/Users/decster/projects/hadoop-trunk/hadoop-native-core/fs/common.c:39:36: 
warning: 'memset' call operates on objects of type 'hdfsFileInfo' (aka 'struct 
file_info') while the size is based on a
  different type 'hdfsFileInfo *' (aka 'struct file_info *') 
[-Wsizeof-pointer-memaccess]
memset(hdfsFileInfo, 0, sizeof(hdfsFileInfo));
   ^~~~
/Users/decster/projects/hadoop-trunk/hadoop-native-core/rpc/proxy.c:102:27: 
warning: 'memset' call operates on objects of type 'struct hrpc_sync_ctx' while 
the size is based on a different type
  'struct hrpc_sync_ctx *' [-Wsizeof-pointer-memaccess]
memset(ctx, 0, sizeof(ctx));
   ~~~^~~

{noformat}

 Implement Namenode RPCs in HDFS native client
 -

 Key: HADOOP-10640
 URL: https://issues.apache.org/jira/browse/HADOOP-10640
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: native
Affects Versions: HADOOP-10388
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10640-pnative.001.patch


 Implement the parts of libhdfs that just involve making RPCs to the Namenode, 
 such as mkdir, rename, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10631) Native Hadoop Client: make clean should remove pb-c.h.s files

2014-05-29 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14012125#comment-14012125
 ] 

Binglin Chang commented on HADOOP-10631:


Thanks for the review, Colin.

 Native Hadoop Client: make clean should remove pb-c.h.s files
 -

 Key: HADOOP-10631
 URL: https://issues.apache.org/jira/browse/HADOOP-10631
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Fix For: HADOOP-10388

 Attachments: HADOOP-10631.v1.patch


 In GenerateProtobufs.cmake, pb-c.h.s files are not added to output, so when 
 make clean is called, those files are not cleaned. 
 {code}
  add_custom_command(
 OUTPUT ${PB_C_FILE} ${PB_H_FILE} ${CALL_C_FILE} ${CALL_H_FILE}
 {code}





[jira] [Updated] (HADOOP-10444) add pom.xml infrastructure for hadoop-native-core

2014-05-29 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10444:
---

Attachment: HADOOP-10444.v1.patch

Changes: 
1. change project structure to maven, move all code from hadoop-native-core to 
hadoop-native-core/src/main/native
2. change build dir from hadoop-native-core to hadoop-native-core/target/native
3. add pom artifact hadoop-native-core, make corresponding changes in 
hadoop-dist/pom.xml, hadoop-project/pom.xml and pom.xml
4. add a new assembly hadoop-native-core-dist to copy .h files only.
5. add a new profile, native-core, to activate hadoop-native-core 
compile/test/package. I didn't reuse -Pnative, because currently -Pnative can't 
work on Mac OS X.

When invoked using mvn package -Pdist -DskipTests -Pnative -Pnative-core,
the native client libraries are packaged in the following locations:
{code}
decster@localhost:~/hadoop-trunk$ ll 
hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/{include/common,include/rpc,lib/native}
hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/include/common:
total 76
-rw-rw-r-- 1 decster decster  2380 2014-05-29 00:10 hadoop_err.h
-rw-rw-r-- 1 decster decster  1025 2014-05-29 00:10 net.h
-rw-rw-r-- 1 decster decster 23781 2014-05-29 00:10 queue.h
-rw-rw-r-- 1 decster decster  1349 2014-05-29 00:10 string.h
-rw-rw-r-- 1 decster decster  4936 2014-05-29 00:10 test.h
-rw-rw-r-- 1 decster decster 25776 2014-05-29 00:10 tree.h
-rw-rw-r-- 1 decster decster  1189 2014-05-29 00:10 user.h

hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/include/rpc:
total 36
-rw-rw-r-- 1 decster decster 3479 2014-05-29 00:10 call.h
-rw-rw-r-- 1 decster decster 1561 2014-05-29 00:10 client_id.h
-rw-rw-r-- 1 decster decster 6392 2014-05-29 00:10 conn.h
-rw-rw-r-- 1 decster decster 2939 2014-05-29 00:10 messenger.h
-rw-rw-r-- 1 decster decster 5395 2014-05-29 00:10 proxy.h
-rw-rw-r-- 1 decster decster 3700 2014-05-29 00:10 reactor.h
-rw-rw-r-- 1 decster decster 2647 2014-05-29 00:10 varint.h

hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/lib/native:
total 7464
-rw-rw-r-- 1 decster decster 1239398 2014-05-29 00:10 libhadoop.a
-rw-rw-r-- 1 decster decster 1374292 2014-05-29 00:10 libhadooppipes.a
lrwxrwxrwx 1 decster decster      18 2014-05-29 00:10 libhadoop.so -> libhadoop.so.1.0.0*
-rwxrwxr-x 1 decster decster  721343 2014-05-29 00:10 libhadoop.so.1.0.0*
-rw-rw-r-- 1 decster decster  453186 2014-05-29 00:10 libhadooputils.a
-rw-rw-r-- 1 decster decster  317370 2014-05-29 00:10 libhdfs.a
lrwxrwxrwx 1 decster decster      17 2014-05-29 00:10 libhdfs-core.so -> libhdfs-core.so.1*
lrwxrwxrwx 1 decster decster      21 2014-05-29 00:10 libhdfs-core.so.1 -> libhdfs-core.so.1.0.0*
-rwxrwxr-x 1 decster decster 2747719 2014-05-29 00:10 libhdfs-core.so.1.0.0*
lrwxrwxrwx 1 decster decster      16 2014-05-29 00:10 libhdfs.so -> libhdfs.so.0.0.0*
-rwxrwxr-x 1 decster decster  212967 2014-05-29 00:10 libhdfs.so.0.0.0*
lrwxrwxrwx 1 decster decster      17 2014-05-29 00:10 libyarn-core.so -> libyarn-core.so.1*
lrwxrwxrwx 1 decster decster      21 2014-05-29 00:10 libyarn-core.so.1 -> libyarn-core.so.1.0.0*
-rwxrwxr-x 1 decster decster  564923 2014-05-29 00:10 libyarn-core.so.1.0.0*
{code}

 add pom.xml infrastructure for hadoop-native-core
 -

 Key: HADOOP-10444
 URL: https://issues.apache.org/jira/browse/HADOOP-10444
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Colin Patrick McCabe
Assignee: Binglin Chang
 Attachments: HADOOP-10444.v1.patch


 Add pom.xml infrastructure for hadoop-native-core, so that it builds under 
 Maven.  We can look to how we integrated CMake into hadoop-hdfs-project and 
 hadoop-common-project for inspiration here.  In the long term, it would be 
 nice to use a Maven plugin here (see HADOOP-8887)





[jira] [Commented] (HADOOP-10444) add pom.xml infrastructure for hadoop-native-core

2014-05-29 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14012166#comment-14012166
 ] 

Binglin Chang commented on HADOOP-10444:


Hi Colin, the patch moves all the code, so it would be good to get this reviewed 
and committed soon to avoid conflicts.

 add pom.xml infrastructure for hadoop-native-core
 -

 Key: HADOOP-10444
 URL: https://issues.apache.org/jira/browse/HADOOP-10444
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Colin Patrick McCabe
Assignee: Binglin Chang
 Attachments: HADOOP-10444.v1.patch


 Add pom.xml infrastructure for hadoop-native-core, so that it builds under 
 Maven.  We can look to how we integrated CMake into hadoop-hdfs-project and 
 hadoop-common-project for inspiration here.  In the long term, it would be 
 nice to use a Maven plugin here (see HADOOP-8887)





[jira] [Updated] (HADOOP-10444) add pom.xml infrastructure for hadoop-native-core

2014-05-29 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10444:
---

Attachment: HADOOP-10444.v2.patch

Thanks for the review Colin. I updated the patch addressing your comments. 
Changes:
1. change -Pnative-core to -Pnative
2. remove the assembly; we can add hdfs.h (or just share the old one) and yarn.h 
later when we have them.


 add pom.xml infrastructure for hadoop-native-core
 -

 Key: HADOOP-10444
 URL: https://issues.apache.org/jira/browse/HADOOP-10444
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Colin Patrick McCabe
Assignee: Binglin Chang
 Attachments: HADOOP-10444.v1.patch, HADOOP-10444.v2.patch


 Add pom.xml infrastructure for hadoop-native-core, so that it builds under 
 Maven.  We can look to how we integrated CMake into hadoop-hdfs-project and 
 hadoop-common-project for inspiration here.  In the long term, it would be 
 nice to use a Maven plugin here (see HADOOP-8887)





[jira] [Commented] (HADOOP-10444) add pom.xml infrastructure for hadoop-native-core

2014-05-29 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013311#comment-14013311
 ] 

Binglin Chang commented on HADOOP-10444:


bq. If there is stuff that doesn't work on MacOS, we should just fix that stuff.
This reminds me of HADOOP-9648; as you have done a lot of native library work, 
could you help take a look?


 add pom.xml infrastructure for hadoop-native-core
 -

 Key: HADOOP-10444
 URL: https://issues.apache.org/jira/browse/HADOOP-10444
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Colin Patrick McCabe
Assignee: Binglin Chang
 Attachments: HADOOP-10444.v1.patch, HADOOP-10444.v2.patch


 Add pom.xml infrastructure for hadoop-native-core, so that it builds under 
 Maven.  We can look to how we integrated CMake into hadoop-hdfs-project and 
 hadoop-common-project for inspiration here.  In the long term, it would be 
 nice to use a Maven plugin here (see HADOOP-8887)





[jira] [Created] (HADOOP-10631) Native Hadoop Client: Add missing output in GenerateProtobufs.cmake

2014-05-27 Thread Binglin Chang (JIRA)
Binglin Chang created HADOOP-10631:
--

 Summary: Native Hadoop Client: Add missing output in 
GenerateProtobufs.cmake
 Key: HADOOP-10631
 URL: https://issues.apache.org/jira/browse/HADOOP-10631
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial


In GenerateProtobufs.cmake, pb-c.h.s files are not added to output, so when 
make clean is called, those files are not cleaned. 

{code}
 add_custom_command(
OUTPUT ${PB_C_FILE} ${PB_H_FILE} ${CALL_C_FILE} ${CALL_H_FILE}
{code}






[jira] [Updated] (HADOOP-10631) Native Hadoop Client: Add missing output in GenerateProtobufs.cmake

2014-05-27 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10631:
---

Affects Version/s: HADOOP-10388

 Native Hadoop Client: Add missing output in GenerateProtobufs.cmake
 ---

 Key: HADOOP-10631
 URL: https://issues.apache.org/jira/browse/HADOOP-10631
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: HADOOP-10631.v1.patch


 In GenerateProtobufs.cmake, pb-c.h.s files are not added to output, so when 
 make clean is called, those files are not cleaned. 
 {code}
  add_custom_command(
 OUTPUT ${PB_C_FILE} ${PB_H_FILE} ${CALL_C_FILE} ${CALL_H_FILE}
 {code}





[jira] [Updated] (HADOOP-10631) Native Hadoop Client: Add missing output in GenerateProtobufs.cmake

2014-05-27 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10631:
---

Status: Patch Available  (was: Open)

 Native Hadoop Client: Add missing output in GenerateProtobufs.cmake
 ---

 Key: HADOOP-10631
 URL: https://issues.apache.org/jira/browse/HADOOP-10631
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: HADOOP-10631.v1.patch


 In GenerateProtobufs.cmake, pb-c.h.s files are not added to output, so when 
 make clean is called, those files are not cleaned. 
 {code}
  add_custom_command(
 OUTPUT ${PB_C_FILE} ${PB_H_FILE} ${CALL_C_FILE} ${CALL_H_FILE}
 {code}





[jira] [Updated] (HADOOP-10631) Native Hadoop Client: Add missing output in GenerateProtobufs.cmake

2014-05-27 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10631:
---

Target Version/s: HADOOP-10388

 Native Hadoop Client: Add missing output in GenerateProtobufs.cmake
 ---

 Key: HADOOP-10631
 URL: https://issues.apache.org/jira/browse/HADOOP-10631
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: HADOOP-10631.v1.patch


 In GenerateProtobufs.cmake, pb-c.h.s files are not added to output, so when 
 make clean is called, those files are not cleaned. 
 {code}
  add_custom_command(
 OUTPUT ${PB_C_FILE} ${PB_H_FILE} ${CALL_C_FILE} ${CALL_H_FILE}
 {code}





[jira] [Updated] (HADOOP-10631) Native Hadoop Client: Add missing output in GenerateProtobufs.cmake

2014-05-27 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10631:
---

Attachment: HADOOP-10631.v1.patch

 Native Hadoop Client: Add missing output in GenerateProtobufs.cmake
 ---

 Key: HADOOP-10631
 URL: https://issues.apache.org/jira/browse/HADOOP-10631
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: HADOOP-10631.v1.patch


 In GenerateProtobufs.cmake, pb-c.h.s files are not added to output, so when 
 make clean is called, those files are not cleaned. 
 {code}
  add_custom_command(
 OUTPUT ${PB_C_FILE} ${PB_H_FILE} ${CALL_C_FILE} ${CALL_H_FILE}
 {code}





[jira] [Updated] (HADOOP-10564) Add username to native RPCv9 client

2014-05-16 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10564:
---

Attachment: HADOOP-10564-pnative.006.patch

Patch LGTM, +1.
Updated the patch by one line to match the latest branch HEAD.

 Add username to native RPCv9 client
 ---

 Key: HADOOP-10564
 URL: https://issues.apache.org/jira/browse/HADOOP-10564
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: native
Affects Versions: HADOOP-10388
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Fix For: HADOOP-10388

 Attachments: HADOOP-10564-pnative.002.patch, 
 HADOOP-10564-pnative.003.patch, HADOOP-10564-pnative.004.patch, 
 HADOOP-10564-pnative.005.patch, HADOOP-10564-pnative.006.patch, 
 HADOOP-10564.001.patch


 Add the ability for the native RPCv9 client to set a username when initiating 
 a connection.





[jira] [Commented] (HADOOP-10564) Add username to native RPCv9 client

2014-05-15 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992599#comment-13992599
 ] 

Binglin Chang commented on HADOOP-10564:


Hi Colin, about user.h: we may need a struct to represent a user (like the UGI 
in Hadoop), so that in the future more things can be added to it, like auth 
method and tokens; something like:
struct hadoop_user;
hadoop_user_(alloc|get_login|free)
It is better to add it while the change is small. Thoughts?


 Add username to native RPCv9 client
 ---

 Key: HADOOP-10564
 URL: https://issues.apache.org/jira/browse/HADOOP-10564
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: native
Affects Versions: HADOOP-10388
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10564-pnative.002.patch, HADOOP-10564.001.patch


 Add the ability for the native RPCv9 client to set a username when initiating 
 a connection.





[jira] [Resolved] (HADOOP-10577) Fix some minors error and compile on macosx

2014-05-15 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang resolved HADOOP-10577.


Resolution: Fixed

 Fix some minors error and compile on macosx
 ---

 Key: HADOOP-10577
 URL: https://issues.apache.org/jira/browse/HADOOP-10577
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HADOOP-10577.v1.patch, HADOOP-10577.v2.patch








[jira] [Commented] (HADOOP-10389) Native RPCv9 client

2014-05-15 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998406#comment-13998406
 ] 

Binglin Chang commented on HADOOP-10389:


I think TestRPC.testSlowRpc should only use one socket, based on the socket 
reuse logic, so the responses also go through one socket. The test basically 
does the following:
client call (id=0) -> server
client call (id=1) -> server
server response (id=1) -> client
client call (id=2) -> server
server response (id=2) -> client
server response (id=0) -> client
So the client should recognize the call id in its response handling logic; 
otherwise responses are mismatched (it would return response 1 for call 0).
bq. But last time I investigated it, each TCP socket could only do one request 
at once.
Do you mean the current native code, or the Java code? 



 Native RPCv9 client
 ---

 Key: HADOOP-10389
 URL: https://issues.apache.org/jira/browse/HADOOP-10389
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch, 
 HADOOP-10389.004.patch, HADOOP-10389.005.patch








[jira] [Commented] (HADOOP-10389) Native RPCv9 client

2014-05-15 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997356#comment-13997356
 ] 

Binglin Chang commented on HADOOP-10389:


bq. I think the performance is actually going to be pretty good
I am not worried about performance; it just may cause more redundant code. I'll 
wait to see some code then :)
Speaking of redundant code, there is a lot of repeated code in xxx.call.c; is it 
possible to do this using functions rather than generating repeated code?

bq. The rationale behind call id in general is that in some future version of 
the Java RPC system, we may want to allow multiple calls to be in flight at 
once
I looked closely at the rpc code; it looks like concurrent rpc is supported, and 
the unit test TestRPC.testSlowRpc exercises this.


 Native RPCv9 client
 ---

 Key: HADOOP-10389
 URL: https://issues.apache.org/jira/browse/HADOOP-10389
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch, 
 HADOOP-10389.004.patch, HADOOP-10389.005.patch








[jira] [Commented] (HADOOP-10389) Native RPCv9 client

2014-05-15 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998455#comment-13998455
 ] 

Binglin Chang commented on HADOOP-10389:


bq. The point of the generated functions is to provide type safety, so you 
can't pass the wrong request and response types to the functions. It also makes 
remote procedure calls look like a local function call, which is one of the 
main ideas in RPC.
We can keep functions, but the repeated code in these functions can be 
eliminated using abstraction, so as to reduce the binary code size.


 Native RPCv9 client
 ---

 Key: HADOOP-10389
 URL: https://issues.apache.org/jira/browse/HADOOP-10389
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch, 
 HADOOP-10389.004.patch, HADOOP-10389.005.patch








[jira] [Updated] (HADOOP-10577) Fix some minors error and compile on macosx

2014-05-15 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10577:
---

Affects Version/s: HADOOP-10388

 Fix some minors error and compile on macosx
 ---

 Key: HADOOP-10577
 URL: https://issues.apache.org/jira/browse/HADOOP-10577
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HADOOP-10577.v1.patch, HADOOP-10577.v2.patch








[jira] [Commented] (HADOOP-10389) Native RPCv9 client

2014-05-14 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992528#comment-13992528
 ] 

Binglin Chang commented on HADOOP-10389:


bq. Hmm. That's odd. No matter how many warning and pedantic options I pass to 
gcc
http://stackoverflow.com/questions/11869593/c99-printf-formatters-vs-c11-user-defined-literals
although it is C code, it could be used in C++



 Native RPCv9 client
 ---

 Key: HADOOP-10389
 URL: https://issues.apache.org/jira/browse/HADOOP-10389
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch, 
 HADOOP-10389.004.patch, HADOOP-10389.005.patch








[jira] [Commented] (HADOOP-10389) Native RPCv9 client

2014-05-13 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996017#comment-13996017
 ] 

Binglin Chang commented on HADOOP-10389:


bq. The rationale behind call id in general is that in some future version of 
the Java RPC system, we may want to allow multiple calls to be in flight at 
once
I guess I always thought this was already implemented, because the client can 
already make parallel calls and there are multiple rpc handler threads on the 
server side; adding this should be natural and easy, although I haven't tested 
it. Are you sure about this? If so I can try to add this in Java... 

bq. From the library user's perspective, they are calling hdfsOpen, hdfsClose, 
etc. etc.
So those methods all need to initialize hrpc_proxy again (which needs the server 
address, user, and other configs). What I am trying to say is that maybe the 
proxy and the call can be separated: the proxy can be shared, with a call on the 
stack for each invocation. Maybe it's too late to change that; just my two cents.

bq. You just can't de-allocate the proxy while it is in use.
So there should be a method for the user to cancel an ongoing rpc (which also 
needs to guarantee that, after the cancel completes, there is no further memory 
access to hrpc_proxy and the call); it looks like hrpc_proxy_deactivate can't do 
this yet?



 Native RPCv9 client
 ---

 Key: HADOOP-10389
 URL: https://issues.apache.org/jira/browse/HADOOP-10389
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch, 
 HADOOP-10389.004.patch, HADOOP-10389.005.patch








[jira] [Commented] (HADOOP-10564) Add username to native RPCv9 client

2014-05-12 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994932#comment-13994932
 ] 

Binglin Chang commented on HADOOP-10564:


Hi, Colin, thanks for the patch; some comments:
1. It's hard to remember which fields need to be freed (some are stack-allocated, 
some are on the heap) and which don't; could you add comments on each field's 
memory ownership?
2. In the patch at line 690, the assignment is duplicated:
{code}
690 +proxy->call.remote = *remote;
691 +proxy->call.remote = *remote;
{code}
3. reactor.c:71: RB_NFIND may always find nothing, given the RB tree compare 
method's content (only pointer equality counts as equal). I am not familiar with 
the RB tree's semantics and the header file doesn't provide any documentation. 
Also, hrpc_conn_usable may be redundant, because RB_NFIND already checks those 
fields.

 Add username to native RPCv9 client
 ---

 Key: HADOOP-10564
 URL: https://issues.apache.org/jira/browse/HADOOP-10564
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: native
Affects Versions: HADOOP-10388
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10564-pnative.002.patch, 
 HADOOP-10564-pnative.003.patch, HADOOP-10564.001.patch


 Add the ability for the native RPCv9 client to set a username when initiating 
 a connection.





[jira] [Commented] (HADOOP-10389) Native RPCv9 client

2014-05-12 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994974#comment-13994974
 ] 

Binglin Chang commented on HADOOP-10389:


Hi Colin, I have some difficulty understanding your code and some comments; 
could you please help explain?
1. To my understanding, the rpc client should have a map<callid, call> to record 
all unfinished calls, but I could not find any code assigning call ids (they are 
only set to 0) or managing unfinished calls; could you help me locate that logic?
2. In the demo namenode-rpc-unit, I see each proxy only has one call (the 
current call); does this mean the client can only make one rpc call at a time? 
If so, probably every rpc call will need its own rpc_proxy. From the user's 
standpoint, they may want what Java's interface offers: multiple threads 
concurrently calling one proxy, which is very common in the hdfs client. 
3. hrpc_proxy.call belongs to hrpc_proxy, but in hrpc_proxy_start the call is 
passed to reactor->inbox.pending_calls, which may have a longer life cycle than 
hrpc_proxy, so there may be a potential bug in hrpc_proxy.call?

 Native RPCv9 client
 ---

 Key: HADOOP-10389
 URL: https://issues.apache.org/jira/browse/HADOOP-10389
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch, 
 HADOOP-10389.004.patch, HADOOP-10389.005.patch








[jira] [Commented] (HADOOP-10389) Native RPCv9 client

2014-05-12 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996075#comment-13996075
 ] 

Binglin Chang commented on HADOOP-10389:


bq. So there should be a method for user to cancel an ongoing rpc
I thought more about this; adding a timeout to the call also works and seems 
like a better solution.


 Native RPCv9 client
 ---

 Key: HADOOP-10389
 URL: https://issues.apache.org/jira/browse/HADOOP-10389
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch, 
 HADOOP-10389.004.patch, HADOOP-10389.005.patch








[jira] [Commented] (HADOOP-10577) Fix some minors error and compile on macosx

2014-05-11 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992519#comment-13992519
 ] 

Binglin Chang commented on HADOOP-10577:


Thanks for the review Luke! I have committed this.

 Fix some minors error and compile on macosx
 ---

 Key: HADOOP-10577
 URL: https://issues.apache.org/jira/browse/HADOOP-10577
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HADOOP-10577.v1.patch, HADOOP-10577.v2.patch








[jira] [Commented] (HADOOP-10389) Native RPCv9 client

2014-05-07 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13991599#comment-13991599
 ] 

Binglin Chang commented on HADOOP-10389:


Hi Yongjun, thanks for trying the patch; the issue you mentioned is right.

Actually there are more issues I want to discuss, Colin:
1. a string and a macro need to be separated with a space, e.g. "%" PRId64 " xxx"; 
some compilers are more strict about this
2. I think it is pretty safe to use %d instead of "%" PRId32; it seems 
unnecessary to do more typing


 Native RPCv9 client
 ---

 Key: HADOOP-10389
 URL: https://issues.apache.org/jira/browse/HADOOP-10389
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch, 
 HADOOP-10389.004.patch, HADOOP-10389.005.patch








[jira] [Created] (HADOOP-10577) Fix some minors error and compile on macosx

2014-05-06 Thread Binglin Chang (JIRA)
Binglin Chang created HADOOP-10577:
--

 Summary: Fix some minors error and compile on macosx
 Key: HADOOP-10577
 URL: https://issues.apache.org/jira/browse/HADOOP-10577
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor








[jira] [Updated] (HADOOP-10577) Fix some minors error and compile on macosx

2014-05-06 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10577:
---

Attachment: HADOOP-10577.v1.patch

Changes:
1. find_library should not use the .so suffix, for cross-platform compatibility
2. clang/libc++ does not have tr1/memory, just memory
3. wrong printf usage: %Zd should be %zu

The code on the current branch can now compile on macosx on my laptop with this 
change.


 Fix some minors error and compile on macosx
 ---

 Key: HADOOP-10577
 URL: https://issues.apache.org/jira/browse/HADOOP-10577
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HADOOP-10577.v1.patch








[jira] [Updated] (HADOOP-10577) Fix some minors error and compile on macosx

2014-05-06 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10577:
---

Attachment: HADOOP-10577.v2.patch

Updated a little: found wrong usage of sem_post/sem_wait.

 Fix some minors error and compile on macosx
 ---

 Key: HADOOP-10577
 URL: https://issues.apache.org/jira/browse/HADOOP-10577
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HADOOP-10577.v1.patch, HADOOP-10577.v2.patch








[jira] [Assigned] (HADOOP-10497) Doc NodeGroup-aware(HADOOP Virtualization Extensisons)

2014-04-14 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang reassigned HADOOP-10497:
--

Assignee: Binglin Chang

 Doc NodeGroup-aware(HADOOP Virtualization Extensisons)
 --

 Key: HADOOP-10497
 URL: https://issues.apache.org/jira/browse/HADOOP-10497
 Project: Hadoop Common
  Issue Type: Task
  Components: documentation
Reporter: wenwupeng
Assignee: Binglin Chang
  Labels: documentation
 Fix For: site


 Most of the patches from umbrella JIRA HADOOP-8468 have been committed. 
 However, there is no site documentation introducing NodeGroup awareness 
 (Hadoop Virtualization Extensions) or how to configure it, so we need to 
 document it:
 1.  Doc NodeGroup-aware related content in http://hadoop.apache.org/docs/current 
 2.  Doc NodeGroup-aware properties in core-default.xml.





[jira] [Updated] (HADOOP-10497) Add document for node group related configs

2014-04-14 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10497:
---

Summary: Add document for node group related configs  (was: Doc 
NodeGroup-aware(HADOOP Virtualization Extensisons))

 Add document for node group related configs
 ---

 Key: HADOOP-10497
 URL: https://issues.apache.org/jira/browse/HADOOP-10497
 Project: Hadoop Common
  Issue Type: Task
  Components: documentation
Reporter: wenwupeng
Assignee: Binglin Chang
  Labels: documentation
 Fix For: site


 Most of the patches from umbrella JIRA HADOOP-8468 have been committed. 
 However, there is no site documentation introducing NodeGroup awareness 
 (Hadoop Virtualization Extensions) or how to configure it, so we need to 
 document it:
 1.  Doc NodeGroup-aware related content in http://hadoop.apache.org/docs/current 
 2.  Doc NodeGroup-aware properties in core-default.xml.





[jira] [Commented] (HADOOP-10388) Pure native hadoop client

2014-04-01 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956157#comment-13956157
 ] 

Binglin Chang commented on HADOOP-10388:


bq. We can even make the XML-reading code optional if you want.
Sure, adding XML support for compatibility is fine. But to keep strict 
compatibility we may need to support all javax XML / Hadoop config features; 
I'm afraid libexpat/libxml2 don't cover all of those, so a lot of effort may be 
spent on this. I think it is better to make it optional and do it later.

bq. Thread pools and async I/O, I'm afraid, are something we can't live without.
I also prefer async I/O and threads for performance reasons; the code I 
published on github already has a working HDFS client with read/write, and 
HDFSOutputStream uses an additional thread. 
What I was saying is that the use of extra threads should be limited. In the 
Java client, simply reading/writing an HDFS file uses too many threads (rpc 
socket read/write, data transfer socket read/write, other misc executors, lease 
renewer, etc.). Since we use async I/O, the thread count should be sharply 
reduced.


 Pure native hadoop client
 -

 Key: HADOOP-10388
 URL: https://issues.apache.org/jira/browse/HADOOP-10388
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe

 A pure native hadoop client has the following use cases/advantages:
 1.  writing Yarn applications using c++
 2.  direct access to HDFS, without the extra proxy overhead of the web/nfs 
 interfaces.
 3.  wrapping the native library to support more languages, e.g. python
 4.  a lightweight, small footprint compared to the several hundred MB of JDK 
 and hadoop libraries with various dependencies.





[jira] [Commented] (HADOOP-10388) Pure native hadoop client

2014-03-27 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949062#comment-13949062
 ] 

Binglin Chang commented on HADOOP-10388:


Thanks for posting this Colin, looking into the code right now. [~wenwu] and I 
both got branch committer invitations today. He is interested in providing more 
tests for the feature. 
About the code and the created sub-jiras, here are some initial questions:
# What will the project structure look like? A separate top-level 
hadoop-native-client project? Or separate code files in the existing 
common/hdfs/yarn dirs?
# Why the names libhdfs-core.so and libyarn-core.so? It's a client library, 
which doesn't sound like "core".
# I'm surprised the code turned to pure C; it seems that because of this we are 
introducing unusual libraries and tools (protobuf-c (last release in 2011) and 
the tool shorten). As for the test library, is the C++ library gtest not going 
to be used either? In short, what libraries are planned to be used?
# I would like the library to be lightweight; some people just want a header 
file and a statically linked library (a few MB in size) to be able to 
read/write from hdfs, so the heavier features (xml library for config file 
parsing, uri parsing for cross-FileSystem symlinks, thread pool) had better be 
optional, not required.



 Pure native hadoop client
 -

 Key: HADOOP-10388
 URL: https://issues.apache.org/jira/browse/HADOOP-10388
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe

 A pure native hadoop client has the following use cases/advantages:
 1.  writing Yarn applications using c++
 2.  direct access to HDFS, without the extra proxy overhead of the web/nfs 
 interfaces.
 3.  wrapping the native library to support more languages, e.g. python
 4.  a lightweight, small footprint compared to the several hundred MB of JDK 
 and hadoop libraries with various dependencies.





[jira] [Commented] (HADOOP-10388) Pure native hadoop client

2014-03-26 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1394#comment-1394
 ] 

Binglin Chang commented on HADOOP-10388:


Hi Colin, what's the status of the work now? Could you post some of the work so 
others can cooperate? 

 Pure native hadoop client
 -

 Key: HADOOP-10388
 URL: https://issues.apache.org/jira/browse/HADOOP-10388
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe

 A pure native hadoop client has the following use cases/advantages:
 1.  writing Yarn applications using c++
 2.  direct access to HDFS, without the extra proxy overhead of the web/nfs 
 interfaces.
 3.  wrapping the native library to support more languages, e.g. python
 4.  a lightweight, small footprint compared to the several hundred MB of JDK 
 and hadoop libraries with various dependencies.





[jira] [Created] (HADOOP-10411) TestCacheDirectives.testExceedsCapacity fails occasionally

2014-03-18 Thread Binglin Chang (JIRA)
Binglin Chang created HADOOP-10411:
--

 Summary: TestCacheDirectives.testExceedsCapacity fails occasionally
 Key: HADOOP-10411
 URL: https://issues.apache.org/jira/browse/HADOOP-10411
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Priority: Minor


See this 
[link|https://issues.apache.org/jira/browse/HADOOP-10390?focusedCommentId=13932236&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13932236],
 error message:
{code}
Namenode should not send extra CACHE commands expected:<0> but was:<2>
Stacktrace
java.lang.AssertionError: Namenode should not send extra CACHE commands 
expected:<0> but was:<2>
  at org.junit.Assert.fail(Assert.java:93)
  at org.junit.Assert.failNotEquals(Assert.java:647)
  at org.junit.Assert.assertEquals(Assert.java:128)
  at org.junit.Assert.assertEquals(Assert.java:472)
  at 
org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testExceedsCapacity(TestCacheDirectives.java:1413)
{code}






[jira] [Commented] (HADOOP-10390) DFSCIOTest looks for the wrong version of libhdfs

2014-03-18 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938904#comment-13938904
 ] 

Binglin Chang commented on HADOOP-10390:


After applying the patch, libhdfs.so.0.0.0 and test_libhdfs_read/write are 
correctly copied to the dest location, but the DFSCIOTest job failed because 
the task failed due to a libhdfs internal error. Maybe it is my environment's 
problem; I have not found the root cause yet, but it is unlikely to be caused 
by the issue here.

 DFSCIOTest looks for the wrong version of libhdfs
 -

 Key: HADOOP-10390
 URL: https://issues.apache.org/jira/browse/HADOOP-10390
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: wenwupeng
Assignee: Binglin Chang
 Attachments: HADOOP-10390.v1.patch, HADOOP-10390.v2.patch, 
 HADOOP-10390.v3.patch


 Run benchmark DFSCIOTest failed at libhdfs.so.1
 hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.3.0-tests.jar 
 DFSCIOTest -write -nrFiles 1 -fileSize 100
 DFSCIOTest.0.0.1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: nrFiles = 1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: fileSize (MB) = 100
 14/03/06 02:52:55 INFO fs.DFSCIOTest: bufferSize = 100
 14/03/06 02:52:55 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 File /hadoop/hadoop-smoke/libhdfs/libhdfs.so.1 does not exist
 can get libhdfs.so.0.0.0 under ./lib/native
 [root@namenode hadoop-smoke]# find ./ -name libhdfs*
 ./lib/native/libhdfs.so
 ./lib/native/libhdfs.so.0.0.0
 ./lib/native/libhdfs.a





[jira] [Updated] (HADOOP-10390) DFSCIOTest looks for the wrong version of libhdfs

2014-03-13 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10390:
---

Attachment: HADOOP-10390.v3.patch

update patch, change test_libhdfs_read/write to HADOOP_HOME/bin


 DFSCIOTest looks for the wrong version of libhdfs
 -

 Key: HADOOP-10390
 URL: https://issues.apache.org/jira/browse/HADOOP-10390
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: wenwupeng
Assignee: Binglin Chang
 Attachments: HADOOP-10390.v1.patch, HADOOP-10390.v2.patch, 
 HADOOP-10390.v3.patch


 Run benchmark DFSCIOTest failed at libhdfs.so.1
 hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.3.0-tests.jar 
 DFSCIOTest -write -nrFiles 1 -fileSize 100
 DFSCIOTest.0.0.1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: nrFiles = 1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: fileSize (MB) = 100
 14/03/06 02:52:55 INFO fs.DFSCIOTest: bufferSize = 100
 14/03/06 02:52:55 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 File /hadoop/hadoop-smoke/libhdfs/libhdfs.so.1 does not exist
 can get libhdfs.so.0.0.0 under ./lib/native
 [root@namenode hadoop-smoke]# find ./ -name libhdfs*
 ./lib/native/libhdfs.so
 ./lib/native/libhdfs.so.0.0.0
 ./lib/native/libhdfs.a





[jira] [Commented] (HADOOP-10388) Pure native hadoop client

2014-03-12 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931544#comment-13931544
 ] 

Binglin Chang commented on HADOOP-10388:


Hi Colin, I see you have assigned all the jiras to yourself now. Thanks for 
taking on this effort. 
I created this jira mainly because I want to help with some of the development 
here. Do you have a plan/idea for how to proceed with the work?


 Pure native hadoop client
 -

 Key: HADOOP-10388
 URL: https://issues.apache.org/jira/browse/HADOOP-10388
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe

 A pure native hadoop client has the following use cases/advantages:
 1.  writing Yarn applications using c++
 2.  direct access to HDFS, without the extra proxy overhead of the web/nfs 
 interfaces.
 3.  wrapping the native library to support more languages, e.g. python
 4.  a lightweight, small footprint compared to the several hundred MB of JDK 
 and hadoop libraries with various dependencies.





[jira] [Updated] (HADOOP-10390) DFSCIOTest looks for the wrong version of libhdfs

2014-03-12 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10390:
---

Attachment: HADOOP-10390.v2.patch

Attached a new version of the patch addressing all 3 issues; I put 
test_libhdfs_read/write into HADOOP_HOME/lib/native as suggested.


 DFSCIOTest looks for the wrong version of libhdfs
 -

 Key: HADOOP-10390
 URL: https://issues.apache.org/jira/browse/HADOOP-10390
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: wenwupeng
Assignee: Binglin Chang
 Attachments: HADOOP-10390.v1.patch, HADOOP-10390.v2.patch


 Run benchmark DFSCIOTest failed at libhdfs.so.1
 hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.3.0-tests.jar 
 DFSCIOTest -write -nrFiles 1 -fileSize 100
 DFSCIOTest.0.0.1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: nrFiles = 1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: fileSize (MB) = 100
 14/03/06 02:52:55 INFO fs.DFSCIOTest: bufferSize = 100
 14/03/06 02:52:55 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 File /hadoop/hadoop-smoke/libhdfs/libhdfs.so.1 does not exist
 can get libhdfs.so.0.0.0 under ./lib/native
 [root@namenode hadoop-smoke]# find ./ -name libhdfs*
 ./lib/native/libhdfs.so
 ./lib/native/libhdfs.so.0.0.0
 ./lib/native/libhdfs.a





[jira] [Commented] (HADOOP-10390) DFSCIOTest looks for the wrong version of libhdfs

2014-03-12 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932079#comment-13932079
 ] 

Binglin Chang commented on HADOOP-10390:


Which location would you suggest? Do you mean we should not put those into the 
package distribution?


 DFSCIOTest looks for the wrong version of libhdfs
 -

 Key: HADOOP-10390
 URL: https://issues.apache.org/jira/browse/HADOOP-10390
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: wenwupeng
Assignee: Binglin Chang
 Attachments: HADOOP-10390.v1.patch, HADOOP-10390.v2.patch


 Run benchmark DFSCIOTest failed at libhdfs.so.1
 hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.3.0-tests.jar 
 DFSCIOTest -write -nrFiles 1 -fileSize 100
 DFSCIOTest.0.0.1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: nrFiles = 1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: fileSize (MB) = 100
 14/03/06 02:52:55 INFO fs.DFSCIOTest: bufferSize = 100
 14/03/06 02:52:55 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 File /hadoop/hadoop-smoke/libhdfs/libhdfs.so.1 does not exist
 can get libhdfs.so.0.0.0 under ./lib/native
 [root@namenode hadoop-smoke]# find ./ -name libhdfs*
 ./lib/native/libhdfs.so
 ./lib/native/libhdfs.so.0.0.0
 ./lib/native/libhdfs.a





[jira] [Commented] (HADOOP-10390) DFSCIOTest looks for the wrong version of libhdfs

2014-03-11 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930117#comment-13930117
 ] 

Binglin Chang commented on HADOOP-10390:


bq. hdfs_read/hdfs_write no longer exist (at least not in the hadoop 
distribution), but they are used by DFSCIOTest; should they also go into the 
distribution?
I was trying to update the patch to solve all the issues in the jira, but have 
not found a proper place to put test_libhdfs_read/test_libhdfs_write; 
HADOOP_HOME/bin does not feel like the right place for test programs. 


 DFSCIOTest looks for the wrong version of libhdfs
 -

 Key: HADOOP-10390
 URL: https://issues.apache.org/jira/browse/HADOOP-10390
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: wenwupeng
Assignee: Binglin Chang
 Attachments: HADOOP-10390.v1.patch


 Run benchmark DFSCIOTest failed at libhdfs.so.1
 hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.3.0-tests.jar 
 DFSCIOTest -write -nrFiles 1 -fileSize 100
 DFSCIOTest.0.0.1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: nrFiles = 1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: fileSize (MB) = 100
 14/03/06 02:52:55 INFO fs.DFSCIOTest: bufferSize = 100
 14/03/06 02:52:55 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 File /hadoop/hadoop-smoke/libhdfs/libhdfs.so.1 does not exist
 can get libhdfs.so.0.0.0 under ./lib/native
 [root@namenode hadoop-smoke]# find ./ -name libhdfs*
 ./lib/native/libhdfs.so
 ./lib/native/libhdfs.so.0.0.0
 ./lib/native/libhdfs.a





[jira] [Commented] (HADOOP-10390) DFSCIOTest looks for the wrong version of libhdfs

2014-03-10 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925488#comment-13925488
 ] 

Binglin Chang commented on HADOOP-10390:


Looks like there is no libhdfs.so.0:
{code}
[root@namenode native]# ll
total 2184
-rw-r--r-- 1 67974 users 621326 Feb 11 08:55 libhadoop.a
-rw-r--r-- 1 67974 users 534024 Feb 11 08:55 libhadooppipes.a
lrwxrwxrwx 1 67974 users 18 Feb 17 22:37 libhadoop.so -> libhadoop.so.1.0.0
-rwxr-xr-x 1 67974 users 446741 Feb 11 08:55 libhadoop.so.1.0.0
-rw-r--r-- 1 67974 users 226360 Feb 11 08:55 libhadooputils.a
-rw-r--r-- 1 67974 users 204586 Feb 11 08:55 libhdfs.a
lrwxrwxrwx 1 67974 users 16 Feb 17 22:37 libhdfs.so -> libhdfs.so.0.0.0
-rwxr-xr-x 1 67974 users 167760 Feb 11 08:55 libhdfs.so.0.0.0
{code}

 DFSCIOTest looks for the wrong version of libhdfs
 -

 Key: HADOOP-10390
 URL: https://issues.apache.org/jira/browse/HADOOP-10390
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: wenwupeng
Assignee: Binglin Chang
 Attachments: HADOOP-10390.v1.patch


 Run benchmark DFSCIOTest failed at libhdfs.so.1
 hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.3.0-tests.jar 
 DFSCIOTest -write -nrFiles 1 -fileSize 100
 DFSCIOTest.0.0.1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: nrFiles = 1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: fileSize (MB) = 100
 14/03/06 02:52:55 INFO fs.DFSCIOTest: bufferSize = 100
 14/03/06 02:52:55 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 File /hadoop/hadoop-smoke/libhdfs/libhdfs.so.1 does not exist
 can get libhdfs.so.0.0.0 under ./lib/native
 [root@namenode hadoop-smoke]# find ./ -name libhdfs*
 ./lib/native/libhdfs.so
 ./lib/native/libhdfs.so.0.0.0
 ./lib/native/libhdfs.a





[jira] [Commented] (HADOOP-10390) DFSCIOTest looks for the wrong version of libhdfs

2014-03-10 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925499#comment-13925499
 ] 

Binglin Chang commented on HADOOP-10390:


I reviewed the code carefully. The related code:
{code}
fs.copyFromLocalFile(new Path(hadoopHome + "/libhdfs/libhdfs.so." + 
    HDFS_LIB_VERSION), HDFS_SHLIB);
fs.copyFromLocalFile(new Path(hadoopHome + "/libhdfs/hdfs_read"), HDFS_READ);
fs.copyFromLocalFile(new Path(hadoopHome + "/libhdfs/hdfs_write"), HDFS_WRITE);
{code}
The program tries to copy hadoopHome/libhdfs/libhdfs.so to HDFS and fails 
because the file doesn't exist.
Actually the whole path is wrong: I suspect the paths 
(HADOOP_HOME/libhdfs/libhdfs.so|hdfs_read|hdfs_write) are already out of date 
(maybe they existed in hadoop-v1 test environments), so in theory the test 
should have been failing for a long time.


 DFSCIOTest looks for the wrong version of libhdfs
 -

 Key: HADOOP-10390
 URL: https://issues.apache.org/jira/browse/HADOOP-10390
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: wenwupeng
Assignee: Binglin Chang
 Attachments: HADOOP-10390.v1.patch


 Run benchmark DFSCIOTest failed at libhdfs.so.1
 hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.3.0-tests.jar 
 DFSCIOTest -write -nrFiles 1 -fileSize 100
 DFSCIOTest.0.0.1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: nrFiles = 1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: fileSize (MB) = 100
 14/03/06 02:52:55 INFO fs.DFSCIOTest: bufferSize = 100
 14/03/06 02:52:55 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 File /hadoop/hadoop-smoke/libhdfs/libhdfs.so.1 does not exist
 can get libhdfs.so.0.0.0 under ./lib/native
 [root@namenode hadoop-smoke]# find ./ -name libhdfs*
 ./lib/native/libhdfs.so
 ./lib/native/libhdfs.so.0.0.0
 ./lib/native/libhdfs.a





[jira] [Commented] (HADOOP-10390) DFSCIOTest looks for the wrong version of libhdfs

2014-03-10 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925511#comment-13925511
 ] 

Binglin Chang commented on HADOOP-10390:


Currently I think there are 3 issues we need to fix: 
1. libhdfs has a different path and version now; they should be updated
2. hdfs_read/hdfs_write no longer exist (at least not in the hadoop 
distribution), but they are used by DFSCIOTest; should they also go into the 
distribution?
3. On exception, DFSCIOTest only prints the error message:
  System.err.print(e.getLocalizedMessage());
which is so confusing that it took me a long time to find the root cause



 DFSCIOTest looks for the wrong version of libhdfs
 -

 Key: HADOOP-10390
 URL: https://issues.apache.org/jira/browse/HADOOP-10390
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: wenwupeng
Assignee: Binglin Chang
 Attachments: HADOOP-10390.v1.patch


 Run benchmark DFSCIOTest failed at libhdfs.so.1
 hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.3.0-tests.jar 
 DFSCIOTest -write -nrFiles 1 -fileSize 100
 DFSCIOTest.0.0.1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: nrFiles = 1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: fileSize (MB) = 100
 14/03/06 02:52:55 INFO fs.DFSCIOTest: bufferSize = 100
 14/03/06 02:52:55 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 File /hadoop/hadoop-smoke/libhdfs/libhdfs.so.1 does not exist
 can get libhdfs.so.0.0.0 under ./lib/native
 [root@namenode hadoop-smoke]# find ./ -name libhdfs*
 ./lib/native/libhdfs.so
 ./lib/native/libhdfs.so.0.0.0
 ./lib/native/libhdfs.a





[jira] [Commented] (HADOOP-10388) Pure native hadoop client

2014-03-10 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925585#comment-13925585
 ] 

Binglin Chang commented on HADOOP-10388:


About the coding standard, Google's is mostly fine.
I mention C++11 mostly for the new std libraries (thread, lock/condition, 
random, unique_ptr/shared_ptr, regex), so we can avoid writing a lot of common 
utility code. It's fine if we use boost instead; we can provide typedefs so 
either C++11 or boost can be an option: old compilers can use boost, and new 
compilers can avoid the boost dependency.
Agree with Colin; I tend to avoid using fancy language features such as 
lambdas, templates, and std::function. 
For compatibility the code should be plain and simple, especially the public 
API; C++ does not have good binary compatibility (mainly a virtual method 
issue), so we need to be careful.


 Pure native hadoop client
 -

 Key: HADOOP-10388
 URL: https://issues.apache.org/jira/browse/HADOOP-10388
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Binglin Chang

 A pure native hadoop client has the following use cases/advantages:
 1.  writing Yarn applications using c++
 2.  direct access to HDFS, without the extra proxy overhead of the web/nfs 
 interfaces.
 3.  wrapping the native library to support more languages, e.g. python
 4.  a lightweight, small footprint compared to the several hundred MB of JDK 
 and hadoop libraries with various dependencies.





[jira] [Commented] (HADOOP-10388) Pure native hadoop client

2014-03-10 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925772#comment-13925772
 ] 

Binglin Chang commented on HADOOP-10388:


bq. I'd like it to build on OS/X so that mac builds catch regressions, even if 
it isn't for production.
Agree. MacOSX is more like freebsd. I do most of my coding on a mac, so I can 
help make sure the mac build and tests work.

bq. I'm not up to date with C++ test frameworks
Although I haven't tried other test frameworks, I would recommend gtest; it is 
small and convenient (just a .cc file that can be embedded into the test 
program). If we are using the Google C++ coding standard and protobuf, using 
another Google framework seems natural.


 Pure native hadoop client
 -

 Key: HADOOP-10388
 URL: https://issues.apache.org/jira/browse/HADOOP-10388
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Binglin Chang

 A pure native hadoop client has the following use cases/advantages:
 1.  writing Yarn applications using c++
 2.  direct access to HDFS, without the extra proxy overhead of the web/nfs 
 interfaces.
 3.  wrapping the native library to support more languages, e.g. python
 4.  a lightweight, small footprint compared to the several hundred MB of JDK 
 and hadoop libraries with various dependencies.





[jira] [Commented] (HADOOP-10388) Pure native hadoop client

2014-03-09 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925187#comment-13925187
 ] 

Binglin Chang commented on HADOOP-10388:


Although C is viable, I would suggest using C++11, which will help us get rid 
of a lot of dependencies and make the code smaller.
I was writing a client just for fun; it uses C++11, depends on protobuf, 
json-c, sasl2, gtest and cmake, and is about 8k LOC. It is on github now: 
https://github.com/decster/libhadoopclient.
Hope some of the code can be useful here. 


 Pure native hadoop client
 -

 Key: HADOOP-10388
 URL: https://issues.apache.org/jira/browse/HADOOP-10388
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Binglin Chang

 A pure native hadoop client has the following use cases/advantages:
 1.  writing Yarn applications using c++
 2.  direct access to HDFS, without the extra proxy overhead of the web/nfs 
 interfaces.
 3.  wrapping the native library to support more languages, e.g. python
 4.  a lightweight, small footprint compared to the several hundred MB of JDK 
 and hadoop libraries with various dependencies.





[jira] [Updated] (HADOOP-10390) libhdfs.so.1 does not exist in Hadoop 2.3.0

2014-03-06 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10390:
---

Status: Patch Available  (was: Open)

 libhdfs.so.1 does not exist in Hadoop 2.3.0
 ---

 Key: HADOOP-10390
 URL: https://issues.apache.org/jira/browse/HADOOP-10390
 Project: Hadoop Common
  Issue Type: Bug
  Components: tools
Affects Versions: 2.3.0
Reporter: wenwupeng
Assignee: Binglin Chang

 Run benchmark DFSCIOTest failed at libhdfs.so.1
 hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.3.0-tests.jar 
 DFSCIOTest -write -nrFiles 1 -fileSize 100
 DFSCIOTest.0.0.1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: nrFiles = 1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: fileSize (MB) = 100
 14/03/06 02:52:55 INFO fs.DFSCIOTest: bufferSize = 100
 14/03/06 02:52:55 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 File /hadoop/hadoop-smoke/libhdfs/libhdfs.so.1 does not exist
 can get libhdfs.so.0.0.0 under ./lib/native
 [root@namenode hadoop-smoke]# find ./ -name libhdfs*
 ./lib/native/libhdfs.so
 ./lib/native/libhdfs.so.0.0.0
 ./lib/native/libhdfs.a





[jira] [Updated] (HADOOP-10390) libhdfs.so.1 does not exist in Hadoop 2.3.0

2014-03-06 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10390:
---

Attachment: HADOOP-10390.v1.patch

Looks like the problem is caused by a version mismatch between the libhdfs 
CMake file and DFSCIOTest.HDFS_LIB_VERSION. One of them needs to be changed.
The patch changes DFSCIOTest.HDFS_LIB_VERSION.

 libhdfs.so.1 does not exist in Hadoop 2.3.0
 ---

 Key: HADOOP-10390
 URL: https://issues.apache.org/jira/browse/HADOOP-10390
 Project: Hadoop Common
  Issue Type: Bug
  Components: tools
Affects Versions: 2.3.0
Reporter: wenwupeng
Assignee: Binglin Chang
 Attachments: HADOOP-10390.v1.patch


 Running benchmark DFSCIOTest fails at libhdfs.so.1
 hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.3.0-tests.jar 
 DFSCIOTest -write -nrFiles 1 -fileSize 100
 DFSCIOTest.0.0.1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: nrFiles = 1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: fileSize (MB) = 100
 14/03/06 02:52:55 INFO fs.DFSCIOTest: bufferSize = 100
 14/03/06 02:52:55 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 File /hadoop/hadoop-smoke/libhdfs/libhdfs.so.1 does not exist
 can get libhdfs.so.0.0.0 under ./lib/native
 [root@namenode hadoop-smoke]# find ./ -name libhdfs*
 ./lib/native/libhdfs.so
 ./lib/native/libhdfs.so.0.0.0
 ./lib/native/libhdfs.a



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10388) Pure native hadoop client

2014-03-05 Thread Binglin Chang (JIRA)
Binglin Chang created HADOOP-10388:
--

 Summary: Pure native hadoop client
 Key: HADOOP-10388
 URL: https://issues.apache.org/jira/browse/HADOOP-10388
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Binglin Chang


A pure native hadoop client has the following use cases/advantages:
1.  writing YARN applications in C++
2.  direct access to HDFS, without the extra proxy overhead of the web/NFS 
interfaces
3.  wrapping the native library to support more languages, e.g. Python
4.  lightweight, with a small footprint compared to the several hundred MB of the 
JDK and Hadoop libraries with their various dependencies.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10389) Native RPCv9 client

2014-03-05 Thread Binglin Chang (JIRA)
Binglin Chang created HADOOP-10389:
--

 Summary: Native RPCv9 client
 Key: HADOOP-10389
 URL: https://issues.apache.org/jira/browse/HADOOP-10389
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Binglin Chang






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HADOOP-10390) libhdfs.so.1 does not exist in Hadoop 2.3.0

2014-03-05 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang reassigned HADOOP-10390:
--

Assignee: Binglin Chang

 libhdfs.so.1 does not exist in Hadoop 2.3.0
 ---

 Key: HADOOP-10390
 URL: https://issues.apache.org/jira/browse/HADOOP-10390
 Project: Hadoop Common
  Issue Type: Bug
  Components: tools
Affects Versions: 2.3.0
Reporter: wenwupeng
Assignee: Binglin Chang

 Running benchmark DFSCIOTest fails at libhdfs.so.1
 hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.3.0-tests.jar 
 DFSCIOTest -write -nrFiles 1 -fileSize 100
 DFSCIOTest.0.0.1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: nrFiles = 1
 14/03/06 02:52:55 INFO fs.DFSCIOTest: fileSize (MB) = 100
 14/03/06 02:52:55 INFO fs.DFSCIOTest: bufferSize = 100
 14/03/06 02:52:55 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 File /hadoop/hadoop-smoke/libhdfs/libhdfs.so.1 does not exist
 can get libhdfs.so.0.0.0 under ./lib/native
 [root@namenode hadoop-smoke]# find ./ -name libhdfs*
 ./lib/native/libhdfs.so
 ./lib/native/libhdfs.so.0.0.0
 ./lib/native/libhdfs.a



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-9648) Fix build native library on mac osx

2013-12-06 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-9648:
--

Status: Patch Available  (was: Open)

 Fix build native library on mac osx
 ---

 Key: HADOOP-9648
 URL: https://issues.apache.org/jira/browse/HADOOP-9648
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.0.5-alpha, 1.1.2, 1.2.0, 1.0.4
Reporter: Kirill A. Korinskiy
 Attachments: HADOOP-9648-native-osx.1.0.4.patch, 
 HADOOP-9648-native-osx.1.1.2.patch, HADOOP-9648-native-osx.1.2.0.patch, 
 HADOOP-9648-native-osx.2.0.5-alpha-rc1.patch


 Some patches for fixing build a hadoop native library on os x 10.7/10.8.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-9648) Fix build native library on mac osx

2013-12-06 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-9648:
--

Attachment: HADOOP-9648.v2.patch

Fix some issues in the original patch and fix the related test failures in 
test-container-executor. Changes:
1. Issue: setnetgrent was deleted in the original code, which is not right; we don't 
need the if test, but we still need to call setnetgrent.
2. Issue: mkdirs skips creating a dir if the path exists, but if the path is a file 
it can still succeed. Fix: change the whole implementation; mkdirat and openat are 
not needed anymore.
3. LOGIN_NAME_MAX is not present on Mac OS; changed to use sysconf.
4. fcloseall is not present on Mac OS; changed to close the opened fds (stdin, 
stdout, stderr).
5. Mac OS/FreeBSD do not have cgroups; disable them and print an error message.
test-container-executor issues:
6. Mac OS X does not have a user bin, so skip one test.
7. Mac OS X's /etc/passwd is not a real path (it involves a symlink); changed to /bin/ls.

Now compiling with native and running test-container-executor both succeed on my 
MacBook.

 Fix build native library on mac osx
 ---

 Key: HADOOP-9648
 URL: https://issues.apache.org/jira/browse/HADOOP-9648
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 1.0.4, 1.2.0, 1.1.2, 2.0.5-alpha
Reporter: Kirill A. Korinskiy
 Attachments: HADOOP-9648-native-osx.1.0.4.patch, 
 HADOOP-9648-native-osx.1.1.2.patch, HADOOP-9648-native-osx.1.2.0.patch, 
 HADOOP-9648-native-osx.2.0.5-alpha-rc1.patch, HADOOP-9648.v2.patch


 Some patches for fixing build a hadoop native library on os x 10.7/10.8.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10130) RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics

2013-12-02 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837404#comment-13837404
 ] 

Binglin Chang commented on HADOOP-10130:


Thanks for the review and commit, Colin!

 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics
 --

 Key: HADOOP-10130
 URL: https://issues.apache.org/jira/browse/HADOOP-10130
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Fix For: 2.3.0

 Attachments: HADOOP-10130.v1.patch, HADOOP-10130.v2.patch, 
 HADOOP-10130.v2.patch, HDFS-5575.v1.patch


 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-10130) RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics

2013-11-28 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10130:
---

Attachment: HADOOP-10130.v2.patch

Build crashed somehow, resubmit.

 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics
 --

 Key: HADOOP-10130
 URL: https://issues.apache.org/jira/browse/HADOOP-10130
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HADOOP-10130.v1.patch, HADOOP-10130.v2.patch, 
 HADOOP-10130.v2.patch, HDFS-5575.v1.patch


 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HADOOP-10130) RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics

2013-11-27 Thread Binglin Chang (JIRA)
Binglin Chang created HADOOP-10130:
--

 Summary: RawLocalFS::LocalFSFileInputStream.pread does not track 
FS::Statistics
 Key: HADOOP-10130
 URL: https://issues.apache.org/jira/browse/HADOOP-10130
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor


RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-10130) RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics

2013-11-27 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10130:
---

Attachment: HDFS-5575.v1.patch

Attach patch

 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics
 --

 Key: HADOOP-10130
 URL: https://issues.apache.org/jira/browse/HADOOP-10130
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HDFS-5575.v1.patch


 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-10130) RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics

2013-11-27 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10130:
---

Status: Patch Available  (was: Open)

 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics
 --

 Key: HADOOP-10130
 URL: https://issues.apache.org/jira/browse/HADOOP-10130
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HDFS-5575.v1.patch


 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-10130) RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics

2013-11-27 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10130:
---

Attachment: HADOOP-10130.v1.patch

Wrong patch file name, rename and submit again

 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics
 --

 Key: HADOOP-10130
 URL: https://issues.apache.org/jira/browse/HADOOP-10130
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HADOOP-10130.v1.patch, HDFS-5575.v1.patch


 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-10130) RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics

2013-11-27 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13833987#comment-13833987
 ] 

Binglin Chang commented on HADOOP-10130:


RawLocalFileSystem is a public class, so I was not sure I could remove the class 
inside it, but it looks safe to do so.
Attaching a new patch.
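The gist of the fix can be sketched in isolation: positional reads should update the filesystem statistics counter, just as sequential reads already do. The names below (PreadStats, bytesRead, pread) are simplified stand-ins for illustration, not Hadoop's actual Statistics or stream classes:

```java
import java.util.concurrent.atomic.AtomicLong;

public class PreadStats {
    // Simplified stand-in for FileSystem.Statistics (illustrative)
    static final AtomicLong bytesRead = new AtomicLong();

    static final byte[] data = "hello, statistics".getBytes();

    // Positional read that, like the fixed pread, records the number of
    // bytes it returned in the statistics counter before returning.
    static int pread(long pos, byte[] buf, int off, int len) {
        if (pos >= data.length) return -1;
        int n = Math.min(len, data.length - (int) pos);
        System.arraycopy(data, (int) pos, buf, off, n);
        bytesRead.addAndGet(n);  // the bookkeeping the original pread was missing
        return n;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[5];
        pread(0, buf, 0, 5);
        pread(7, buf, 0, 5);
        System.out.println("bytesRead=" + bytesRead.get());
    }
}
```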

 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics
 --

 Key: HADOOP-10130
 URL: https://issues.apache.org/jira/browse/HADOOP-10130
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HADOOP-10130.v1.patch, HADOOP-10130.v2.patch, 
 HDFS-5575.v1.patch


 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-10130) RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics

2013-11-27 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-10130:
---

Attachment: HADOOP-10130.v2.patch

 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics
 --

 Key: HADOOP-10130
 URL: https://issues.apache.org/jira/browse/HADOOP-10130
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HADOOP-10130.v1.patch, HADOOP-10130.v2.patch, 
 HDFS-5575.v1.patch


 RawLocalFS::LocalFSFileInputStream.pread does not track FS::Statistics



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-9897) Add method to get path start position without drive specifier in o.a.h.fs.Path

2013-10-15 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-9897:
--

Attachment: HADOOP-9897.v3.patch

Thanks for the review Chris. Attaching new patch addressing your comments.

 Add method to get path start position without drive specifier in 
 o.a.h.fs.Path  
 

 Key: HADOOP-9897
 URL: https://issues.apache.org/jira/browse/HADOOP-9897
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: HADOOP-9897.v1.patch, HADOOP-9897.v2.patch, 
 HADOOP-9897.v2.patch, HADOOP-9897.v3.patch


 There is a lot of code in Path to get the start position after skipping the drive 
 specifier, like:
 {code}
 int start = hasWindowsDrive(uri.getPath()) ? 3 : 0;
 {code}
 Also there is a minor bug in mergePaths:
 mergePaths("/", "/foo") will yield Path("//foo"), and "//foo" will be parsed as a URI 
 authority, not a path.
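The double-slash pitfall can be demonstrated directly with java.net.URI, independently of the Path class: a string beginning with "//" is parsed as an authority, so naively concatenating "/" and "/foo" changes the meaning of the result (standalone demonstration, not the mergePaths code itself):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class DoubleSlash {
    public static void main(String[] args) throws URISyntaxException {
        // A naive merge of "/" and "/foo" concatenates into "//foo"
        String merged = "/" + "/foo";
        URI u = new URI(merged);
        // "//foo" is parsed as authority "foo" with an empty path
        System.out.println("authority=" + u.getAuthority()
                + " path=" + u.getPath());
    }
}
```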



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-9897) Add method to get path start position without drive specifier in o.a.h.fs.Path

2013-10-15 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-9897:
--

Attachment: HADOOP-9897.v4.patch

 Add method to get path start position without drive specifier in 
 o.a.h.fs.Path  
 

 Key: HADOOP-9897
 URL: https://issues.apache.org/jira/browse/HADOOP-9897
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: HADOOP-9897.v1.patch, HADOOP-9897.v2.patch, 
 HADOOP-9897.v2.patch, HADOOP-9897.v3.patch, HADOOP-9897.v4.patch


 There is a lot of code in Path to get the start position after skipping the drive 
 specifier, like:
 {code}
 int start = hasWindowsDrive(uri.getPath()) ? 3 : 0;
 {code}
 Also there is a minor bug in mergePaths:
 mergePaths("/", "/foo") will yield Path("//foo"), and "//foo" will be parsed as a URI 
 authority, not a path.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-9897) Add method to get path start position without drive specifier in o.a.h.fs.Path

2013-10-15 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-9897:
--

Attachment: HADOOP-9897.v5.patch

Thanks for the review Chris! Attach new version of the patch.

 Add method to get path start position without drive specifier in 
 o.a.h.fs.Path  
 

 Key: HADOOP-9897
 URL: https://issues.apache.org/jira/browse/HADOOP-9897
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: HADOOP-9897.v1.patch, HADOOP-9897.v2.patch, 
 HADOOP-9897.v2.patch, HADOOP-9897.v3.patch, HADOOP-9897.v4.patch, 
 HADOOP-9897.v5.patch


 There is a lot of code in Path to get the start position after skipping the drive 
 specifier, like:
 {code}
 int start = hasWindowsDrive(uri.getPath()) ? 3 : 0;
 {code}
 Also there is a minor bug in mergePaths:
 mergePaths("/", "/foo") will yield Path("//foo"), and "//foo" will be parsed as a URI 
 authority, not a path.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HADOOP-9897) Add method to get path start position without drive specifier in o.a.h.fs.Path

2013-10-15 Thread Binglin Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Binglin Chang updated HADOOP-9897:
--

Attachment: HADOOP-9897.v6.patch

Really sorry for that; attaching a new patch.

 Add method to get path start position without drive specifier in 
 o.a.h.fs.Path  
 

 Key: HADOOP-9897
 URL: https://issues.apache.org/jira/browse/HADOOP-9897
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 3.0.0, 2.2.0
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: HADOOP-9897.v1.patch, HADOOP-9897.v2.patch, 
 HADOOP-9897.v2.patch, HADOOP-9897.v3.patch, HADOOP-9897.v4.patch, 
 HADOOP-9897.v5.patch, HADOOP-9897.v6.patch


 There is a lot of code in Path to get the start position after skipping the drive 
 specifier, like:
 {code}
 int start = hasWindowsDrive(uri.getPath()) ? 3 : 0;
 {code}
 Also there is a minor bug in mergePaths:
 mergePaths("/", "/foo") will yield Path("//foo"), and "//foo" will be parsed as a URI 
 authority, not a path.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-9897) Add method to get path start position without drive specifier in o.a.h.fs.Path

2013-10-10 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13791328#comment-13791328
 ] 

Binglin Chang commented on HADOOP-9897:
---

Hi [~cnauroth],
Could you help review the patch again and get this committed?  Thanks.

 Add method to get path start position without drive specifier in 
 o.a.h.fs.Path  
 

 Key: HADOOP-9897
 URL: https://issues.apache.org/jira/browse/HADOOP-9897
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Trivial
 Attachments: HADOOP-9897.v1.patch, HADOOP-9897.v2.patch, 
 HADOOP-9897.v2.patch


 There is a lot of code in Path to get the start position after skipping the drive 
 specifier, like:
 {code}
 int start = hasWindowsDrive(uri.getPath()) ? 3 : 0;
 {code}
 Also there is a minor bug in mergePaths:
 mergePaths("/", "/foo") will yield Path("//foo"), and "//foo" will be parsed as a URI 
 authority, not a path.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-9972) new APIs for listStatus and globStatus to deal with symlinks

2013-09-24 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776870#comment-13776870
 ] 

Binglin Chang commented on HADOOP-9972:
---

bq. Also, if we want to add more options in the future, we don't want to create 
listLinkStatusWithFoo and listLinkStatusWithFooAndBar. Just listStatus(Path, 
PathOption).
That is exactly why I propose that listStatus(Path, PathOption) be implemented in 
FileSystem using the more primitive listLinkStatus(Path): if we add an option, we 
don't end up modifying the code of every FileSystem subclass. 

bq. we don't want to create listLinkStatusWithFoo and 
listLinkStatusWithFooAndBar. Just listStatus(Path, PathOption).
I am not against the listStatus(Path, PathOption) API, just its implementation 
detail; this issue can be solved by listStatus(Path, PathOption). 

bq. Hadoop and HDFS exist in an environment where there are unreliable networks.
I don't think we should ignore all errors, including network issues. It is like 
disk failures or temporarily unreadable files on Linux: globbing can't ignore 
those either. In that case the error should just be passed all the way up to the 
user; most users don't want to handle this error in an ErrorHandler either.

bq. So if globStatus swallows unresolved symlink errors.
Are you saying a network issue can cause an unresolved symlink error? If dead-link 
errors are already mixed up with network errors, then, together with the 
compatibility reasons, I agree with you: we can't follow the Linux practice.



 new APIs for listStatus and globStatus to deal with symlinks
 

 Key: HADOOP-9972
 URL: https://issues.apache.org/jira/browse/HADOOP-9972
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 2.1.1-beta
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe

 Based on the discussion in HADOOP-9912, we need new APIs for FileSystem to 
 deal with symlinks.  The issue is that code has been written which is 
 incompatible with the existence of things which are not files or directories. 
  For example,
 there is a lot of code out there that looks at FileStatus#isFile, and
 if it returns false, assumes that what it is looking at is a
 directory.  In the case of a symlink, this assumption is incorrect.
 It seems reasonable to make the default behavior of {{FileSystem#listStatus}} 
 and {{FileSystem#globStatus}} be fully resolving symlinks, and ignoring 
 dangling ones.  This will prevent incompatibility with existing MR jobs and 
 other HDFS users.  We should also add new versions of listStatus and 
 globStatus that allow new, symlink-aware code to deal with symlinks as 
 symlinks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9972) new APIs for listStatus and globStatus to deal with symlinks

2013-09-20 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773584#comment-13773584
 ] 

Binglin Chang commented on HADOOP-9972:
---

bq. Hmm. We could have a convenience method called listLinkStatus which just 
called into listStatus with the correct PathOptions. I sort of lean towards 
fewer APIs rather than more, but maybe it makes sense.
I mean that listStatus(Path, PathOption) should call into listLinkStatus (it is 
HDFS::listStatus, which is a primitive RPC call), not the other way around. I 
wonder how we can implement listStatus(Path, PathOption) without the 
listLinkStatus(Path) primitive?

bq. Shell globbing doesn't ignore all errors
What I mean by globbing is just shell wildcard substitution, which indeed ignores 
all errors: glob just substitutes a string containing wildcards with the matching 
strings.
http://www.linuxjournal.com/content/bash-extended-globbing
http://tldp.org/LDP/abs/html/globbingref.html
{code}
drwxr-xr-x  2 decster  staff  68 Sep 19 17:09 aa
drwxr-xr-x  2 decster  staff  68 Sep 19 17:12 bb
decster:~/projects/test$ echo *
aa bb
decster:~/projects/test$ echo */cc
*/cc
{code}

In your example:

{code}
cmccabe@keter:~/mydir$ ls b/c
ls: cannot access b/c: Permission denied
# this error is thrown by ls, not by globbing

cmccabe@keter:~/mydir$ ls *
a:
c
ls: cannot open directory b: Permission denied
# ls * first becomes ls a b
# then ls throws the error when processing b
{code}
 

 new APIs for listStatus and globStatus to deal with symlinks
 

 Key: HADOOP-9972
 URL: https://issues.apache.org/jira/browse/HADOOP-9972
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 2.1.1-beta
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe

 Based on the discussion in HADOOP-9912, we need new APIs for FileSystem to 
 deal with symlinks.  The issue is that code has been written which is 
 incompatible with the existence of things which are not files or directories. 
  For example,
 there is a lot of code out there that looks at FileStatus#isFile, and
 if it returns false, assumes that what it is looking at is a
 directory.  In the case of a symlink, this assumption is incorrect.
 It seems reasonable to make the default behavior of {{FileSystem#listStatus}} 
 and {{FileSystem#globStatus}} be fully resolving symlinks, and ignoring 
 dangling ones.  This will prevent incompatibility with existing MR jobs and 
 other HDFS users.  We should also add new versions of listStatus and 
 globStatus that allow new, symlink-aware code to deal with symlinks as 
 symlinks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9972) new APIs for listStatus and globStatus to deal with symlinks

2013-09-19 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13772566#comment-13772566
 ] 

Binglin Chang commented on HADOOP-9972:
---

There are two issues we are talking about. One is the new API:

bq. The discussion about whether HDFS should replace listStatus with something 
more like POSIX readdir seems like a tangent.
I think there is some confusion here: I didn't propose using POSIX readdir. The 
API name readdir is probably causing the confusion, so I changed it to 
listLinkStatus instead; its semantics are the same as the current HDFS listStatus, 
which doesn't resolve links.

bq. To prevent this scenario, we want to change FileStatus#listStatus and 
FileStatus#globStatus to resolve all symlinks
I'm fully aware of this, and my proposal does not break it.

Frankly, I don't see any conflict between the two proposals. In order to implement 
listStatus(Path, PathOption), a listLinkStatus (or something with the same 
semantics) primitive/core API is required, and it is mostly there (in HDFS; other 
filesystems don't support symlinks, except LocalFS). Since there is no conflict 
from my side, I think you can just submit the patch or give the implementation 
detail of listStatus(Path, PathOption) first. 

The other issue is that globbing doesn't follow the Linux practice:
It is probably a tangent; it was brought up just because of the example about the 
usage of PathErrorHandler. I said that Linux shell globbing ignores all errors, 
and the example can be solved by following the Linux practice. If we decide not to 
follow the Linux practice and solve it another way, that is OK, although I prefer 
the Linux practice.



 new APIs for listStatus and globStatus to deal with symlinks
 

 Key: HADOOP-9972
 URL: https://issues.apache.org/jira/browse/HADOOP-9972
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 2.1.1-beta
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe

 Based on the discussion in HADOOP-9912, we need new APIs for FileSystem to 
 deal with symlinks.  The issue is that code has been written which is 
 incompatible with the existence of things which are not files or directories. 
  For example,
 there is a lot of code out there that looks at FileStatus#isFile, and
 if it returns false, assumes that what it is looking at is a
 directory.  In the case of a symlink, this assumption is incorrect.
 It seems reasonable to make the default behavior of {{FileSystem#listStatus}} 
 and {{FileSystem#globStatus}} be fully resolving symlinks, and ignoring 
 dangling ones.  This will prevent incompatibility with existing MR jobs and 
 other HDFS users.  We should also add new versions of listStatus and 
 globStatus that allow new, symlink-aware code to deal with symlinks as 
 symlinks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9972) new APIs for listStatus and globStatus to deal with symlinks

2013-09-19 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13772570#comment-13772570
 ] 

Binglin Chang commented on HADOOP-9972:
---

You are probably confused by my earlier comments. I did not mean that 
listLinkStatus only returns the filename and type. 

bq. Most linux/bsd system, readdir return filename and type.
I meant the Linux readdir in my comments, not the core API listLinkStatus. 


 new APIs for listStatus and globStatus to deal with symlinks
 

 Key: HADOOP-9972
 URL: https://issues.apache.org/jira/browse/HADOOP-9972
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 2.1.1-beta
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe

 Based on the discussion in HADOOP-9912, we need new APIs for FileSystem to 
 deal with symlinks.  The issue is that code has been written which is 
 incompatible with the existence of things which are not files or directories. 
  For example,
 there is a lot of code out there that looks at FileStatus#isFile, and
 if it returns false, assumes that what it is looking at is a
 directory.  In the case of a symlink, this assumption is incorrect.
 It seems reasonable to make the default behavior of {{FileSystem#listStatus}} 
 and {{FileSystem#globStatus}} be fully resolving symlinks, and ignoring 
 dangling ones.  This will prevent incompatibility with existing MR jobs and 
 other HDFS users.  We should also add new versions of listStatus and 
 globStatus that allow new, symlink-aware code to deal with symlinks as 
 symlinks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9972) new APIs for listStatus and globStatus to deal with symlinks

2013-09-18 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13770471#comment-13770471
 ] 

Binglin Chang commented on HADOOP-9972:
---

Hi Colin, 
About the globStatus example: if we follow the Linux practice, globStatus(p) = 
glob(pattern).map(path => getFileStatus(path))
String[] glob(pattern):
  if the pattern matches nothing, return the pattern
  else return the matched paths
  ignore all exceptions

I did some experiments; you can see that ls * indeed shows an error message, but 
ls */stuff does not show an error message.
{code}
[root@master01 test]# mkdir -p aa/cc/foo
[root@master01 test]# mkdir -p bb/cc/foo
[root@master01 test]# chmod 700 bb
[root@master01 test]# ll /home/serengeti/.bash
[root@master01 test]# su serengeti
[serengeti@master01 test]$ ll
total 8
drwxr-xr-x 3 root root 4096 Sep 18 08:30 aa
drwx------ 3 root root 4096 Sep 18 08:31 bb
[serengeti@master01 test]$ ls *
aa:
cc
ls: bb: Permission denied
[serengeti@master01 test]$ ls */cc
foo
{code}

Separating globStatus into glob and getFileStatus seems a more proper way of doing 
globStatus than adding new classes/interfaces and a callback handler, and it is 
the Linux practice, so it should be more robust.







 new APIs for listStatus and globStatus to deal with symlinks
 

 Key: HADOOP-9972
 URL: https://issues.apache.org/jira/browse/HADOOP-9972
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 2.1.1-beta
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe

 Based on the discussion in HADOOP-9912, we need new APIs for FileSystem to 
 deal with symlinks.  The issue is that code has been written which is 
 incompatible with the existence of things which are not files or directories. 
  For example,
 there is a lot of code out there that looks at FileStatus#isFile, and
 if it returns false, assumes that what it is looking at is a
 directory.  In the case of a symlink, this assumption is incorrect.
 It seems reasonable to make the default behavior of {{FileSystem#listStatus}} 
 and {{FileSystem#globStatus}} be fully resolving symlinks, and ignoring 
 dangling ones.  This will prevent incompatibility with existing MR jobs and 
 other HDFS users.  We should also add new versions of listStatus and 
 globStatus that allow new, symlink-aware code to deal with symlinks as 
 symlinks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9972) new APIs for listStatus and globStatus to deal with symlinks

2013-09-18 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13770496#comment-13770496
 ] 

Binglin Chang commented on HADOOP-9972:
---

Regarding the API, I think we should differentiate core APIs from extended/legacy APIs. IMO, there should be 3 core APIs:

getFileStatus     - resolves symlinks
getFileLinkStatus - does not resolve symlinks
readdir           - does not resolve symlinks, just like the current HDFS listStatus

These core API should be implemented in each FS

All other related APIs can be build based on core API and implemented in 
FSContext/FileSystem once for all:
{code}
FS.listStatus(path):
  readdir(path).map(s => if (s.isSymlink) getFileStatus(s) ignoring exceptions else s)

FS.listStatus(path, PathOptions):
  readdir(path).map(process PathOptions)

glob(pattern):
  if pattern matches none, return pattern
  else return matched paths
  ignore all exceptions

globStatus(pattern):
  glob(pattern).map(getFileStatus)
{code}
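A minimal illustration of building listStatus once on top of the two core operations (java.nio stands in for the filesystem here; the {{readdir}}/{{listStatus}} names follow the pseudocode above, not the real FileSystem API). A follow-links stat resolves each symlink, and a dangling link is silently dropped:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Sketch only: listStatus composed from readdir + a per-entry follow-links
// stat, per the pseudocode above. Not the real Hadoop FileSystem API.
public class ListStatusSketch {
    // readdir: list entries without following symlinks
    static List<Path> readdir(Path dir) throws IOException {
        List<Path> out = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir)) {
            for (Path p : ds) {
                out.add(p);
            }
        }
        return out;
    }

    // listStatus: resolve each entry with a follow-links stat; a dangling
    // symlink throws NoSuchFileException and is skipped, mirroring
    // "map(s => if (s.isSymlink) getFileStatus(s) ignoring exceptions else s)"
    static List<Path> listStatus(Path dir) throws IOException {
        List<Path> out = new ArrayList<>();
        for (Path p : readdir(dir)) {
            try {
                Files.readAttributes(p, BasicFileAttributes.class); // follows links
                out.add(p);
            } catch (IOException dangling) {
                // dangling symlink: drop it, per the proposed default behavior
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("lss");
        Files.createFile(tmp.resolve("f"));
        try {
            // symlink to a target that does not exist
            Files.createSymbolicLink(tmp.resolve("dead"), tmp.resolve("missing"));
        } catch (IOException | UnsupportedOperationException e) {
            System.out.println("symlinks unsupported on this platform");
            return;
        }
        // readdir sees both entries; listStatus drops the dangling link
        System.out.println(readdir(tmp).size() + " " + listStatus(tmp).size());
    }
}
```

With this split, each concrete FS only has to implement the core operations, and the symlink-resolving behavior lives in one place.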

