[jira] [Updated] (HDFS-8707) Implement an async pure c++ HDFS client

2016-10-06 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-8707:
-
Assignee: James Clampffer  (was: Bob Hansen)

> Implement an async pure c++ HDFS client
> ---
>
> Key: HDFS-8707
> URL: https://issues.apache.org/jira/browse/HDFS-8707
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client
>Reporter: Owen O'Malley
>Assignee: James Clampffer
>
> As part of working on the C++ ORC reader at ORC-3, we need a pure C++ HDFS 
> client that lets us do async I/O to HDFS. We want to start from the code that 
> Haohui's been working on at https://github.com/haohui/libhdfspp .



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10898) libhdfs++: Make log levels consistent

2016-10-05 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15549483#comment-15549483
 ] 

Bob Hansen commented on HDFS-10898:
---

+1

> libhdfs++: Make log levels consistent
> -
>
> Key: HDFS-10898
> URL: https://issues.apache.org/jira/browse/HDFS-10898
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Trivial
> Attachments: HDFS-10898.HDFS-8707.000.patch, 
> HDFS-10898.HDFS-8707.001.patch
>
>
> Most of the public C++ FileHandle/FileSystem operations have a LOG_TRACE 
> level message about parameters passed in etc.  However many methods use 
> LOG_DEBUG and a couple use LOG_INFO.
> We most likely want FS operations that happen a lot (read/open/seek/stat) to 
> stick to LOG_DEBUG consistently and only use LOG_INFO for things like 
> FileSystem::Connect or RpcConnection:: that don't get called often and are 
> important enough to warrant showing up in the log.  LOG_TRACE can be reserved 
> for things happening deeper inside public methods and methods that aren't 
> part of the public API.
> Related improvements that could be brought into this to avoid opening a ton 
> of small Jiras:
> -Print the "this" pointer address in the log message to make it easier to 
> correlate objects when there's concurrent work being done.  This has been 
> very helpful in the past but often got stripped out before patches went in.  
> People just need to be aware that operator new may eventually place an object 
> of the same type at the same address sometime in the future.
> -For objects owned by other objects, but created on the fly, include a 
> pointer back to the parent/creator object if that pointer is already being 
> tracked (see the nested structs in BlockReaderImpl).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10898) libhdfs++: Make logs more informative and consistent

2016-10-03 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543417#comment-15543417
 ] 

Bob Hansen commented on HDFS-10898:
---

What would you think of keeping SEEK and READ at the TRACE level, since 
consumers will tend to do those gazillions of times?
We should move the async FileHandleImpl::PositionRead to DEBUG also.

While we're here, can we make the INFO-level message for Connect more 
human-readable ("Connecting to <host>:<port>"), perhaps with a DEBUG-level 
message that includes debugging-appropriate data (this pointer, etc.)?  Have 
that one only in the async version, so the consumer gets one INFO-level message 
per connect.  Not critical, but since we're here...

> libhdfs++: Make logs more informative and consistent
> 
>
> Key: HDFS-10898
> URL: https://issues.apache.org/jira/browse/HDFS-10898
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Trivial
> Attachments: HDFS-10898.HDFS-8707.000.patch
>
>
> Most of the public C++ FileHandle/FileSystem operations have a LOG_TRACE 
> level message about parameters passed in etc.  However many methods use 
> LOG_DEBUG and a couple use LOG_INFO.
> We most likely want FS operations that happen a lot (read/open/seek/stat) to 
> stick to LOG_DEBUG consistently and only use LOG_INFO for things like 
> FileSystem::Connect or RpcConnection:: that don't get called often and are 
> important enough to warrant showing up in the log.  LOG_TRACE can be reserved 
> for things happening deeper inside public methods and methods that aren't 
> part of the public API.
> Related improvements that could be brought into this to avoid opening a ton 
> of small Jiras:
> -Print the "this" pointer address in the log message to make it easier to 
> correlate objects when there's concurrent work being done.  This has been 
> very helpful in the past but often got stripped out before patches went in.  
> People just need to be aware that operator new may eventually place an object 
> of the same type at the same address sometime in the future.
> -For objects owned by other objects, but created on the fly, include a 
> pointer back to the parent/creator object if that pointer is already being 
> tracked (see the nested structs in BlockReaderImpl).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10931) libhdfs++: Fix object lifecycle issues in the BlockReader

2016-10-03 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543401#comment-15543401
 ] 

Bob Hansen commented on HDFS-10931:
---

Code is a definite improvement.  I am very uncomfortable, though, with 
accepting errors in the mini stress test.  Can we increase the retry count 
to ensure that the false-negative rate is very low, and file another bug to 
track down where it started regressing?

If we can mitigate that, +1

> libhdfs++: Fix object lifecycle issues in the BlockReader
> -
>
> Key: HDFS-10931
> URL: https://issues.apache.org/jira/browse/HDFS-10931
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>Priority: Critical
> Attachments: HDFS-10931.HDFS-8707.000.patch, 
> HDFS-10931.HDFS-8707.001.patch
>
>
> The BlockReader can work itself into a state during AckRead (possibly other 
> stages as well) where the pipeline posts a task for asio with a pointer back 
> into itself, then promptly calls "delete this" without canceling the asio 
> request.  The asio task finishes and tries to acquire the lock at the address 
> where the DataNodeConnection used to live - but the DN connection is no 
> longer valid so it's scribbling on some arbitrary bit of memory.  On some 
> platforms the underlying address used by the mutex state will be handed out 
> to future mutexes so the scribble breaks that state and all the locks in that 
> process start misbehaving.
> This can be reproduced by using the patch from HDFS-8790 and adding more 
> worker threads + a lot more reader threads.
> I'm going to fix this in two parts:
> 1) Duct tape + superglue patch to make sure that all top level continuations 
> in the block reader pipeline hold a shared_ptr to the DataNodeConnection.  
> Nested continuations also get a copy of the shared_ptr to make sure the 
> connection is alive.  This at least keeps the connection alive so that it can 
> keep returning asio::operation_aborted.
> 2) The continuation stuff needs a lot of work to make sure this type of bug 
> doesn't keep popping up.  We've already fixed these issues in the RPC code.  
> This will most likely need to be split into a few jiras.
> - Continuation "framework" can be slimmed down quite a bit, perhaps even 
> removed.  Near zero documentation + many implied contracts = constant bug 
> chasing.
> - Add comments to actually describe what's going on in the networking code.  
> This bug took significantly longer than it should have to track down because 
> I hadn't worked on the BlockReader in a while.
> - No more "delete this".
> - Flatten out nested continuations e.g. the guts of BlockReaderImpl::AckRead. 
>  It's unclear why they were implemented like this in the first place, and 
> there are no comments to indicate that this was intentional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10937) libhdfs++: hdfsRead return -1 at eof

2016-09-30 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10937:
-

 Summary: libhdfs++: hdfsRead return -1 at eof
 Key: HDFS-10937
 URL: https://issues.apache.org/jira/browse/HDFS-10937
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


The libhdfs++ implementation of hdfsRead appears to be out-of-spec.  The header 
says it will return 0 at EOF, but the current implementation returns -1 with an 
errno of 261 (invalid offset).

The basic POSIX-style read loop of
while ((bytesRead = hdfsRead(...)) != 0) {...}
won't work with libhdfs++'s hdfsRead method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10595) libhdfs++: Client Name Protobuf Error

2016-09-30 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10595:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Landed with feba09b

> libhdfs++: Client Name Protobuf Error
> -
>
> Key: HDFS-10595
> URL: https://issues.apache.org/jira/browse/HDFS-10595
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Bob Hansen
> Attachments: HDFS-10595.HDFS-8707.patch.000
>
>
> When running a cat tool 
> (/hadoop-hdfs-native-client/src/main/native/libhdfspp/examples/cat/c/cat.c) I 
> get the following error:
> [libprotobuf ERROR google/protobuf/wire_format.cc:1053] String field contains 
> invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type 
> if you intend to send raw bytes.
> However, it executes correctly. It looks like this error happens when trying to 
> serialize Client name in ClientOperationHeaderProto::SerializeWithCachedSizes 
> (/hadoop-hdfs-native-client/target/main/native/libhdfspp/lib/proto/datatransfer.pb.cc)
> Possibly the problem is caused by generating client name as a UUID in 
> GetRandomClientName 
> (/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/common/util.cc)
> In Java client it looks like there are two different unique client 
> identifiers: ClientName and ClientId:
> Client name is generated as:
> clientName = "DFSClient_" + dfsClientConf.getTaskId() + "_" + 
> ThreadLocalRandom.current().nextInt()  + "_" + 
> Thread.currentThread().getId(); 
> (/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java)
> ClientId is generated as a UUID in 
> (/hadoop-common/src/main/java/org/apache/hadoop/ipc/ClientId.java)
> In libhdfs++ we possibly also need two unique client identifiers, or we need 
> to fix the current client name to work without protobuf warnings/errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10595) libhdfs++: Client Name Protobuf Error

2016-09-29 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10595:
--
Status: Patch Available  (was: Reopened)

> libhdfs++: Client Name Protobuf Error
> -
>
> Key: HDFS-10595
> URL: https://issues.apache.org/jira/browse/HDFS-10595
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Bob Hansen
> Attachments: HDFS-10595.HDFS-8707.patch.000
>
>
> When running a cat tool 
> (/hadoop-hdfs-native-client/src/main/native/libhdfspp/examples/cat/c/cat.c) I 
> get the following error:
> [libprotobuf ERROR google/protobuf/wire_format.cc:1053] String field contains 
> invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type 
> if you intend to send raw bytes.
> However, it executes correctly. It looks like this error happens when trying to 
> serialize Client name in ClientOperationHeaderProto::SerializeWithCachedSizes 
> (/hadoop-hdfs-native-client/target/main/native/libhdfspp/lib/proto/datatransfer.pb.cc)
> Possibly the problem is caused by generating client name as a UUID in 
> GetRandomClientName 
> (/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/common/util.cc)
> In Java client it looks like there are two different unique client 
> identifiers: ClientName and ClientId:
> Client name is generated as:
> clientName = "DFSClient_" + dfsClientConf.getTaskId() + "_" + 
> ThreadLocalRandom.current().nextInt()  + "_" + 
> Thread.currentThread().getId(); 
> (/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java)
> ClientId is generated as a UUID in 
> (/hadoop-common/src/main/java/org/apache/hadoop/ipc/ClientId.java)
> In libhdfs++ we possibly also need two unique client identifiers, or we need 
> to fix the current client name to work without protobuf warnings/errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10595) libhdfs++: Client Name Protobuf Error

2016-09-29 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10595:
--
Attachment: HDFS-10595.HDFS-8707.patch.000

> libhdfs++: Client Name Protobuf Error
> -
>
> Key: HDFS-10595
> URL: https://issues.apache.org/jira/browse/HDFS-10595
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Bob Hansen
> Attachments: HDFS-10595.HDFS-8707.patch.000
>
>
> When running a cat tool 
> (/hadoop-hdfs-native-client/src/main/native/libhdfspp/examples/cat/c/cat.c) I 
> get the following error:
> [libprotobuf ERROR google/protobuf/wire_format.cc:1053] String field contains 
> invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type 
> if you intend to send raw bytes.
> However, it executes correctly. It looks like this error happens when trying to 
> serialize Client name in ClientOperationHeaderProto::SerializeWithCachedSizes 
> (/hadoop-hdfs-native-client/target/main/native/libhdfspp/lib/proto/datatransfer.pb.cc)
> Possibly the problem is caused by generating client name as a UUID in 
> GetRandomClientName 
> (/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/common/util.cc)
> In Java client it looks like there are two different unique client 
> identifiers: ClientName and ClientId:
> Client name is generated as:
> clientName = "DFSClient_" + dfsClientConf.getTaskId() + "_" + 
> ThreadLocalRandom.current().nextInt()  + "_" + 
> Thread.currentThread().getId(); 
> (/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java)
> ClientId is generated as a UUID in 
> (/hadoop-common/src/main/java/org/apache/hadoop/ipc/ClientId.java)
> In libhdfs++ we possibly also need two unique client identifiers, or we need 
> to fix the current client name to work without protobuf warnings/errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Reopened] (HDFS-10595) libhdfs++: Client Name Protobuf Error

2016-09-29 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen reopened HDFS-10595:
---
  Assignee: Bob Hansen

On second thought, let me fix this issue here and see if the rest of the errors 
go away in HDFS-9453.

> libhdfs++: Client Name Protobuf Error
> -
>
> Key: HDFS-10595
> URL: https://issues.apache.org/jira/browse/HDFS-10595
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Bob Hansen
>
> When running a cat tool 
> (/hadoop-hdfs-native-client/src/main/native/libhdfspp/examples/cat/c/cat.c) I 
> get the following error:
> [libprotobuf ERROR google/protobuf/wire_format.cc:1053] String field contains 
> invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type 
> if you intend to send raw bytes.
> However, it executes correctly. It looks like this error happens when trying to 
> serialize Client name in ClientOperationHeaderProto::SerializeWithCachedSizes 
> (/hadoop-hdfs-native-client/target/main/native/libhdfspp/lib/proto/datatransfer.pb.cc)
> Possibly the problem is caused by generating client name as a UUID in 
> GetRandomClientName 
> (/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/common/util.cc)
> In Java client it looks like there are two different unique client 
> identifiers: ClientName and ClientId:
> Client name is generated as:
> clientName = "DFSClient_" + dfsClientConf.getTaskId() + "_" + 
> ThreadLocalRandom.current().nextInt()  + "_" + 
> Thread.currentThread().getId(); 
> (/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java)
> ClientId is generated as a UUID in 
> (/hadoop-common/src/main/java/org/apache/hadoop/ipc/ClientId.java)
> In libhdfs++ we possibly also need two unique client identifiers, or we need 
> to fix the current client name to work without protobuf warnings/errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-10595) libhdfs++: Client Name Protobuf Error

2016-09-29 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen resolved HDFS-10595.
---
Resolution: Duplicate

Dupe of HDFS-9453

> libhdfs++: Client Name Protobuf Error
> -
>
> Key: HDFS-10595
> URL: https://issues.apache.org/jira/browse/HDFS-10595
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>
> When running a cat tool 
> (/hadoop-hdfs-native-client/src/main/native/libhdfspp/examples/cat/c/cat.c) I 
> get the following error:
> [libprotobuf ERROR google/protobuf/wire_format.cc:1053] String field contains 
> invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type 
> if you intend to send raw bytes.
> However, it executes correctly. It looks like this error happens when trying to 
> serialize Client name in ClientOperationHeaderProto::SerializeWithCachedSizes 
> (/hadoop-hdfs-native-client/target/main/native/libhdfspp/lib/proto/datatransfer.pb.cc)
> Possibly the problem is caused by generating client name as a UUID in 
> GetRandomClientName 
> (/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/common/util.cc)
> In Java client it looks like there are two different unique client 
> identifiers: ClientName and ClientId:
> Client name is generated as:
> clientName = "DFSClient_" + dfsClientConf.getTaskId() + "_" + 
> ThreadLocalRandom.current().nextInt()  + "_" + 
> Thread.currentThread().getId(); 
> (/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java)
> ClientId is generated as a UUID in 
> (/hadoop-common/src/main/java/org/apache/hadoop/ipc/ClientId.java)
> In libhdfs++ we possibly also need two unique client identifiers, or we need 
> to fix the current client name to work without protobuf warnings/errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10874) libhdfs++: Public API headers should not depend on internal implementation

2016-09-21 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509886#comment-15509886
 ] 

Bob Hansen commented on HDFS-10874:
---

+1

> libhdfs++: Public API headers should not depend on internal implementation
> --
>
> Key: HDFS-10874
> URL: https://issues.apache.org/jira/browse/HDFS-10874
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10874.HDFS-8707.000.patch
>
>
> Public headers need to do some combination of the following: stop including 
> parts of the implementation, forward declare bits of the implementation where 
> absolutely needed, or pull the implementation into include/hdfspp if it's 
> inseparable.
> Example:
> If you want to use the C++ API and only stick include/hdfspp in the include 
> path you'll get an error when you include include/hdfspp/options.h because 
> that goes and includes common/uri.h.
> Related to the work described in HDFS-10787.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10874) libhdfs++: Public API headers should not depend on internal implementation

2016-09-20 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506847#comment-15506847
 ] 

Bob Hansen commented on HDFS-10874:
---

Perhaps as another task, we should ensure that the tools and examples build 
using just the public headers.

> libhdfs++: Public API headers should not depend on internal implementation
> --
>
> Key: HDFS-10874
> URL: https://issues.apache.org/jira/browse/HDFS-10874
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10874.HDFS-8707.000.patch
>
>
> Public headers need to do some combination of the following: stop including 
> parts of the implementation, forward declare bits of the implementation where 
> absolutely needed, or pull the implementation into include/hdfspp if it's 
> inseparable.
> Example:
> If you want to use the C++ API and only stick include/hdfspp in the include 
> path you'll get an error when you include include/hdfspp/options.h because 
> that goes and includes common/uri.h.
> Related to the work described in HDFS-10787.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10860) Switch HttpFS to use Jetty

2016-09-13 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15487626#comment-15487626
 ] 

Bob Hansen commented on HDFS-10860:
---

[~wheat9] converted the DN side of webhdfs to use Netty for performance and 
stability.  He may have some experience to share.

> Switch HttpFS to use Jetty
> --
>
> Key: HDFS-10860
> URL: https://issues.apache.org/jira/browse/HDFS-10860
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>
> The Tomcat 6 we are using will reach EOL at the end of 2017. While there are 
> other good options, I would propose switching to {{Jetty 9}} for the 
> following reasons:
> * Easier migration. Both Tomcat and Jetty are {{Servlet Containers}}, so we 
> don't have to change client code that much. It would require more work to 
> switch to {{JAX-RS}}.
> * Well established.
> * Good performance and scalability.
> Other alternatives:
> * Jersey + Grizzly
> * Tomcat 8
> Your opinions will be greatly appreciated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10685) libhdfs++: return explicit error when non-secured client connects to secured server

2016-09-10 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15480699#comment-15480699
 ] 

Bob Hansen commented on HDFS-10685:
---

Thanks for putting that together, [~vectorijk]!

I'll have to check when I get back to the office, but I think we may have a 
specific Status instance for an Authentication error which we should use in 
this case.  If we don't, we should add one for this.

Since Apache has the Hadoop Jira locked down at the moment, feel free to put 
the patch up on GitHub on a fork of 
https://github.com/apache/hadoop/tree/HDFS-8707.  Also, I think the patch you 
posted may be reversed.  :-)

> libhdfs++: return explicit error when non-secured client connects to secured 
> server
> ---
>
> Key: HDFS-10685
> URL: https://issues.apache.org/jira/browse/HDFS-10685
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>
> When a non-secured client tries to connect to a secured server, the first 
> indication is an error from RpcConnection::HandleRpcRespose complaining about 
> "RPC response with Unknown call id -33".
> We should insert code in HandleRpcResponse to detect if the unknown call id 
> == RpcEngine::kCallIdSasl and return an informative error that you have an 
> unsecured client connecting to a secured server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-09 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10450:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch, 
> HDFS-10450.HDFS-8707.001.patch, HDFS-10450.HDFS-8707.002.patch, 
> HDFS-10450.HDFS-8707.003.patch, HDFS-10450.HDFS-8707.004.patch, 
> HDFS-10450.HDFS-8707.004.patch, HDFS-10450.HDFS-8707.005.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-08 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475099#comment-15475099
 ] 

Bob Hansen commented on HDFS-10450:
---

Error comes from contrib/webhdfs, not from anywhere in HDFS-8707.

> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch, 
> HDFS-10450.HDFS-8707.001.patch, HDFS-10450.HDFS-8707.002.patch, 
> HDFS-10450.HDFS-8707.003.patch, HDFS-10450.HDFS-8707.004.patch, 
> HDFS-10450.HDFS-8707.004.patch, HDFS-10450.HDFS-8707.005.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-08 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10450:
--
Attachment: HDFS-10450.HDFS-8707.005.patch

> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch, 
> HDFS-10450.HDFS-8707.001.patch, HDFS-10450.HDFS-8707.002.patch, 
> HDFS-10450.HDFS-8707.003.patch, HDFS-10450.HDFS-8707.004.patch, 
> HDFS-10450.HDFS-8707.004.patch, HDFS-10450.HDFS-8707.005.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-08 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10450:
--
Attachment: HDFS-10450.HDFS-8707.004.patch

> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch, 
> HDFS-10450.HDFS-8707.001.patch, HDFS-10450.HDFS-8707.002.patch, 
> HDFS-10450.HDFS-8707.003.patch, HDFS-10450.HDFS-8707.004.patch, 
> HDFS-10450.HDFS-8707.004.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-08 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474643#comment-15474643
 ] 

Bob Hansen commented on HDFS-10450:
---

bq. Still a bunch of places with "using namespace hdfs".
Fixed.

bq. Still places with the "// end " at the back of fairly small 
blocks e.g. srrStr in cyrus_sasl_engine.cc. This isn't really a blocker for me 
though.
Left most of them in there.  While I don't think they're necessary, the original 
author found them useful.

bq. Any reason not to keep the status in the output here? Same thing in 
SaslProtocol::OnServerResponse:
Nope.  Removed all cases where we reduced the information in the DEBUG and 
TRACE messages.  Fixed.

bq. In general I wonder if it's worth keeping a file around as a staging area 
for holding things like getsecret in cyrus_sasl_engine.cc (it's currently ~40 
lines commented out). 
No, I don't think it is.  I wanted to keep them for when we did the token work 
in the future, but they're captured in these patches.  They shouldn't hit the 
main codebase.  Fixed.



> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch, 
> HDFS-10450.HDFS-8707.001.patch, HDFS-10450.HDFS-8707.002.patch, 
> HDFS-10450.HDFS-8707.003.patch, HDFS-10450.HDFS-8707.004.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.






[jira] [Updated] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-08 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10450:
--
Attachment: HDFS-10450.HDFS-8707.004.patch

> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch, 
> HDFS-10450.HDFS-8707.001.patch, HDFS-10450.HDFS-8707.002.patch, 
> HDFS-10450.HDFS-8707.003.patch, HDFS-10450.HDFS-8707.004.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.






[jira] [Updated] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-07 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10450:
--
Attachment: HDFS-10450.HDFS-8707.003.patch

> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch, 
> HDFS-10450.HDFS-8707.001.patch, HDFS-10450.HDFS-8707.002.patch, 
> HDFS-10450.HDFS-8707.003.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.






[jira] [Commented] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-07 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15471315#comment-15471315
 ] 

Bob Hansen commented on HDFS-10450:
---

Thanks, [~James C].  Lots of good feedback.

{quote}
  chosen_mech_.mechanism = std::string("", 0);
  chosen_mech_.protocol  = std::string("", 0);
  chosen_mech_.serverid  = std::string("", 0);
  chosen_mech_.challenge = std::string("", 0);
Any reason to do explicit string instantiation with count here rather than just 
assign ""?
{quote}
Replaced with chosen_mech_ = SaslMethod();

{quote}
I'm guessing the "// for()" was for your use while debugging or something and 
could be removed.
{quote}
Copypasta dirt.  Fixed.

{quote}
Are we adding authors to files now? Or was this something added by an IDE? 
{quote}
The latter.  Fixed.

{quote}
  using namespace hdfs;
Let's get rid of the "using namespace hdfs" in the header. Doesn't look like 
it's required anymore anyway.
{quote}
More copypasta that slipped by.  Fixed.

bq. Nit: make the assignment to challenge line up with the rest of the member 
assignments.
Done

{quote}
+  for (int i = 0; i < pb_auths.size(); ++i) {
+  auto  pb_auth = pb_auths.Get(i);
+  AuthInfo::AuthMethod method = ParseMethod(pb_auth.method())
Could you use the full type name for pb_auth or typedef it here? 
{quote}
Fixed.  It's RpcSaslProto_SaslAuth, which starts getting a bit wordy in the 
source.

{quote}
friend int getrealm(void *, int, const char **availrealms 
__attribute__((unused)),
const char **);
Why do we need the attribute unused here?
{quote}
More dirt left over from the originator.  I fixed it in the implementation, but 
not the declaration.

{quote}
Make CyCaslEngine::per_connection_callbacks_ a struct of function pointers 
rather than an vector so the compiler can statically check member access and we 
can avoid tracing down issues where we key in with the wrong thing. Was the 
plan to use this as a jump table in a DFA? If that's the case a vector (or 
preferably an array) is fine I can't find anything to indicate that.
{quote}
per_connection_callbacks is required by the Cyrus SASL library to be a 
specially-terminated array of structs.  We're (without heroic effort) limited 
by the public interface of the library we're using.

bq. Nit: CySaslEngine::InitCyrusSasl is only indented by 1 space so it's a bit 
harder to read.
Fixed

{quote}
   // Initialize the sasl library with per-connection configuration:
   const char * fqdn  = chosen_mech_.serverid.c_str();
   const char * proto = chosen_mech_.protocol.c_str();
  
   rc = sasl_client_new( proto, fqdn, NULL, NULL, &per_connection_callbacks_[0], 
0, &conn_);
   if (rc != SASL_OK) return SaslError(rc);
Is CySaslEngine::InitCyrusSasl guarded by a lock at a higher level? My concern 
is holding onto the result of c_str() calls and if one of those strings was to 
change in another 
InitCyrusSasl call.
{quote} 
The strings are copied by the engine in the call to sasl_client_new, so I think 
we're good on that one.


> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch, 
> HDFS-10450.HDFS-8707.001.patch, HDFS-10450.HDFS-8707.002.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.






[jira] [Commented] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-06 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468056#comment-15468056
 ] 

Bob Hansen commented on HDFS-10450:
---

Cleaned up whitespace

> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch, 
> HDFS-10450.HDFS-8707.001.patch, HDFS-10450.HDFS-8707.002.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.






[jira] [Updated] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-06 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10450:
--
Attachment: HDFS-10450.HDFS-8707.002.patch

> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch, 
> HDFS-10450.HDFS-8707.001.patch, HDFS-10450.HDFS-8707.002.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.






[jira] [Updated] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-06 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10450:
--
Attachment: HDFS-10450.HDFS-8707.001.patch

Added CyrusSASL to the docker image

> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch, 
> HDFS-10450.HDFS-8707.001.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.






[jira] [Updated] (HDFS-8707) Implement an async pure c++ HDFS client

2016-09-03 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-8707:
-
Status: Patch Available  (was: Reopened)

> Implement an async pure c++ HDFS client
> ---
>
> Key: HDFS-8707
> URL: https://issues.apache.org/jira/browse/HDFS-8707
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client
>Reporter: Owen O'Malley
>Assignee: Bob Hansen
>
> As part of working on the C++ ORC reader at ORC-3, we need an HDFS pure C++ 
> client that lets us do async io to HDFS. We want to start from the code that 
> Haohui's been working on at https://github.com/haohui/libhdfspp .






[jira] [Assigned] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-03 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen reassigned HDFS-10450:
-

Assignee: Bob Hansen

> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.






[jira] [Updated] (HDFS-8707) Implement an async pure c++ HDFS client

2016-09-03 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-8707:
-
Status: Open  (was: Patch Available)

> Implement an async pure c++ HDFS client
> ---
>
> Key: HDFS-8707
> URL: https://issues.apache.org/jira/browse/HDFS-8707
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client
>Reporter: Owen O'Malley
>Assignee: Bob Hansen
>
> As part of working on the C++ ORC reader at ORC-3, we need an HDFS pure C++ 
> client that lets us do async io to HDFS. We want to start from the code that 
> Haohui's been working on at https://github.com/haohui/libhdfspp .






[jira] [Updated] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-03 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10450:
--
Status: Patch Available  (was: Open)

> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.






[jira] [Created] (HDFS-10836) libhdfs++: Add a NO_TOOLS and NO_EXAMPLES flag to cmake

2016-09-02 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10836:
-

 Summary: libhdfs++: Add a NO_TOOLS and NO_EXAMPLES flag to cmake
 Key: HDFS-10836
 URL: https://issues.apache.org/jira/browse/HDFS-10836
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


For some instances, our consumers will want just the library, and not want to 
compile (and figure out linking) for the tools and examples.  Let's add a 
CMake flag to turn those off.
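A sketch of what such a switch could look like (the flag names here are 
hypothetical, not from any eventual patch):

```cmake
# Hypothetical flag names; the actual names would be decided in the patch.
option(HDFSPP_BUILD_TOOLS    "Build the libhdfs++ command-line tools" ON)
option(HDFSPP_BUILD_EXAMPLES "Build the libhdfs++ examples"           ON)

if(HDFSPP_BUILD_TOOLS)
  add_subdirectory(tools)
endif()
if(HDFSPP_BUILD_EXAMPLES)
  add_subdirectory(examples)
endif()
```

Consumers who want just the library would then configure with 
-DHDFSPP_BUILD_TOOLS=OFF -DHDFSPP_BUILD_EXAMPLES=OFF.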






[jira] [Commented] (HDFS-10787) libhdfs++: hdfs_configuration and configuration_loader should be accessible from our public API

2016-09-02 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15459611#comment-15459611
 ] 

Bob Hansen commented on HDFS-10787:
---

For the sake of simplicity, I think we should introduce a public facade for 
them that only exposes two functions: LoadConfigs() and 
LoadConfigsFromDirectory(const char * dir), each returning an Options object.

Do we have a compelling use case for more than that?  Anything that wants to 
poke values in can poke values into the returned Options rather than the XML, I 
think.

> libhdfs++: hdfs_configuration and configuration_loader should be accessible 
> from our public API
> ---
>
> Key: HDFS-10787
> URL: https://issues.apache.org/jira/browse/HDFS-10787
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>
> Currently, libhdfspp examples and tools all have this:
> #include "hdfspp/hdfspp.h"
> #include "common/hdfs_configuration.h"
> #include "common/configuration_loader.h"
> This is done in order to read configs and connect. We want 
> hdfs_configuration and configuration_loader to be accessible just by 
> including our hdfspp.h. One way to achieve that would be to create a builder 
> that would include the above libraries.






[jira] [Updated] (HDFS-10450) libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc

2016-09-02 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10450:
--
Attachment: HDFS-10450.HDFS-8707.000.patch

> libhdfs++: Implement Cyrus SASL implementation in sasl_enigne.cc
> 
>
> Key: HDFS-10450
> URL: https://issues.apache.org/jira/browse/HDFS-10450
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
> Attachments: HDFS-10450.HDFS-8707.000.patch
>
>
> The current sasl_engine implementation was proven out using GSASL, which is 
> does not have an ASF-approved license.  It included a framework to use Cyrus 
> SASL (libsasl2.so) instead; we should complete that implementation.






[jira] [Commented] (HDFS-10705) libhdfs++: FileSystem should have a convenience no-args ctor

2016-09-02 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15459400#comment-15459400
 ] 

Bob Hansen commented on HDFS-10705:
---

+1

> libhdfs++: FileSystem should have a convenience no-args ctor
> 
>
> Key: HDFS-10705
> URL: https://issues.apache.org/jira/browse/HDFS-10705
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: James Clampffer
> Attachments: HDFS-10705.HDFS-8707.000.patch, 
> HDFS-10705.HDFS-8707.001.patch
>
>
> Our examples demonstrate that the common use case is "use default options, 
> default username, and default IOService."  Let's make a FileSystem::New() 
> that helps users with that.






[jira] [Commented] (HDFS-10705) libhdfs++: FileSystem should have a convenience no-args ctor

2016-09-02 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15459023#comment-15459023
 ] 

Bob Hansen commented on HDFS-10705:
---

One change, if we could: remove "convenience factory" from the comment on 
New().  Just make it "Returns a new instance with default user and option, with 
the default IOService" or somesuch.

> libhdfs++: FileSystem should have a convenience no-args ctor
> 
>
> Key: HDFS-10705
> URL: https://issues.apache.org/jira/browse/HDFS-10705
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: James Clampffer
> Attachments: HDFS-10705.HDFS-8707.000.patch
>
>
> Our examples demonstrate that the common use case is "use default options, 
> default username, and default IOService."  Let's make a FileSystem::New() 
> that helps users with that.






[jira] [Created] (HDFS-10796) libhdfs++: rationalize ioservice interactions

2016-08-25 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10796:
-

 Summary: libhdfs++: rationalize ioservice interactions
 Key: HDFS-10796
 URL: https://issues.apache.org/jira/browse/HDFS-10796
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


Firstly, we should be pulling the number of threads from options.io_threads 
(which should default to std::thread::hardware_concurrency()).  The library 
should pass all tests always with io_threads set to 1 or to 

Secondly, we should have _a_ constructor where the consumer doesn't need to 
manage the IOService explicitly, and the FileSystemImpl should create its own 
internally.

Since the FileSystem is defined as being for a particular user/identity, there 
is a valid use case for the consumer to be constructing many FileSystem 
instances to represent many authenticated users in the same process, but want 
to share resources (notably have a single io_service shared amongst them all).  
In this case, the consumer would want to own the IOService and pass the same 
instance to multiple FileSystem instances.








[jira] [Commented] (HDFS-10754) libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, hdfs_chmod and hdfs_find.

2016-08-24 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435817#comment-15435817
 ] 

Bob Hansen commented on HDFS-10754:
---

Looks like there was a compilation failure in tools: could not find 
hdfs_find.cpp.

> libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, 
> hdfs_chown, hdfs_chmod and hdfs_find.
> ---
>
> Key: HDFS-10754
> URL: https://issues.apache.org/jira/browse/HDFS-10754
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10754.HDFS-8707.000.patch, 
> HDFS-10754.HDFS-8707.001.patch, HDFS-10754.HDFS-8707.002.patch, 
> HDFS-10754.HDFS-8707.003.patch, HDFS-10754.HDFS-8707.004.patch, 
> HDFS-10754.HDFS-8707.005.patch, HDFS-10754.HDFS-8707.006.patch, 
> HDFS-10754.HDFS-8707.007.patch, HDFS-10754.HDFS-8707.008.patch, 
> HDFS-10754.HDFS-8707.009.patch
>
>







[jira] [Comment Edited] (HDFS-10754) libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, hdfs_chmod and hdfs_find.

2016-08-24 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435809#comment-15435809
 ] 

Bob Hansen edited comment on HDFS-10754 at 8/24/16 10:10 PM:
-

Thanks for your hard work, [~anatoli.shein].  I'm sorry to have to keep 
dragging you back, but...

The new recursive methods should not use future/promises internally.  That 
blocks one of the asio threads waiting for more data; if a consumer tried to do 
one of these with a single thread in the threadpool, it would deadlock waiting 
for the subtasks to complete, but they'd all get queued up behind the initial 
handler.

Instead, whenever they're all done (request_count == 0), the last one out the 
door (the handler that dropped the request_count to 0) should call into the 
consumer's handler directly with the final status.  If any of the other threads 
has received an error, all of the subsequent deliveries to the handler should 
return false, telling find "I'm going to report an error anyway, so don't 
bother recursing any more."  It's good to wait until request_count==0, even in 
an error state, so the consumer doesn't have any lame duck requests queued up 
to take care of?

Also because all of this is asynchronous, you can't allocate the lock and state 
variables on the stack.  When the consumer calls SetOwner('/', true, handler), 
the function is going to return as soon as the find operation is kicked off, 
destroying all of the elements on the stack.  We'll need to create a little 
struct for SetOwner that is maintained with a shared_ptr and cleaned up when 
the last request is done.



Minor points:
In find, perhaps recursion_counter is a bit of a misnomer at this point.  It's 
more outstanding_requests, since for big directories, we'll have more requests 
without recursing.

Perhaps FindOperationState is a better name than CurrentState, and 
SharedFindState is better than just SharedState (since we might have many 
shared states in the FileSystem class).

In CurrentState, perhaps "depth" is more accurate than position?

Do we support a globbing find without recursion?  Can I find "/dir?/path*/" 
"*.db", and not have it recurse to the sub-directories of path*?

Can we push the shims and state into the .cpp file and keep them out of the 
interface (even if private)?

We're very close, now.




> libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, 
> hdfs_chown, hdfs_chmod and hdfs_find.
> ---
>
> Key: HDFS-10754
> URL: https://issues.apache.org/jira/browse/HDFS-10754
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  

[jira] [Commented] (HDFS-10754) libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, hdfs_chmod and hdfs_find.

2016-08-24 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435809#comment-15435809
 ] 

Bob Hansen commented on HDFS-10754:
---

Thanks for your hard work, [~anatoli.shein].  I'm sorry to have to keep 
dragging you back, but...

The new recursive methods should not use future/promises internally.  That 
blocks one of the asio threads waiting for more data; if a consumer tried to do 
one of these with a single thread in the threadpool, it would deadlock waiting 
for the subtasks to complete, but they'd all get queued up behind the initial 
handler.

Instead, whenever they're all done (request_count == 0), the last one out the 
door (the handler that dropped the request_count to 0) should call into the 
consumer's handler directly with the final status.  If any of the other threads 
has received an error, all of the subsequent deliveries to the handler should 
return false, telling find "I'm going to report an error anyway, so don't 
bother recursing any more."  It's good to wait until request_count==0, even in 
an error state, so the consumer doesn't have any lame duck requests queued up 
to take care of?

Also because all of this is asynchronous, you can't allocate the lock and state 
variables on the stack.  When the consumer calls SetOwner('/', true, handler), 
the function is going to return as soon as the find operation is kicked off, 
destroying all of the elements on the stack.  We'll need to create a little 
struct for SetOwner that is maintained with a shared_ptr and cleaned up when 
the last request is done.



Minor points:
In find, perhaps recursion_counter is a bit of a misnomer at this point.  It's 
more outstanding_requests, since for big directories, we'll have more requests 
without recursing.

Perhaps FindOperationState is a better name than CurrentState, and 
SharedFindState is better than just SharedState (since we might have many 
shared states in the FileSystem class).

In CurrentState, perhaps "depth" is more accurate than position?

Do we support a globbing find without recursion?  Can I find "/dir?/path*/" 
"*.db", and not have it recurse to the sub-directories of path*?

Can we push the shims and state into the .cpp file and keep them out of the 
interface (even if private)?

We're very close, now.


> libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, 
> hdfs_chown, hdfs_chmod and hdfs_find.
> ---
>
> Key: HDFS-10754
> URL: https://issues.apache.org/jira/browse/HDFS-10754
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10754.HDFS-8707.000.patch, 
> HDFS-10754.HDFS-8707.001.patch, HDFS-10754.HDFS-8707.002.patch, 
> HDFS-10754.HDFS-8707.003.patch, HDFS-10754.HDFS-8707.004.patch, 
> HDFS-10754.HDFS-8707.005.patch, HDFS-10754.HDFS-8707.006.patch, 
> HDFS-10754.HDFS-8707.007.patch, HDFS-10754.HDFS-8707.008.patch, 
> HDFS-10754.HDFS-8707.009.patch
>
>







[jira] [Created] (HDFS-10790) libhdfs++: split recursive versions of SetPermission and SetOwner to SetAllPermissions and SetAllOwner

2016-08-24 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10790:
-

 Summary: libhdfs++: split recursive versions of SetPermission and 
SetOwner to SetAllPermissions and SetAllOwner
 Key: HDFS-10790
 URL: https://issues.apache.org/jira/browse/HDFS-10790
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


We currently have a flag that we pass in to SetPermission and SetOwner that 
change the semantics of the call.  We should split it into two functions, one 
that does an efficient, direct version, and the other that does globbing and 
optionally recursion.
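A sketch of what the split might look like.  The SetAllOwner name comes from the summary above; everything else here (the Status/Handler stand-ins, the signatures) is hypothetical:

```cpp
#include <cassert>
#include <functional>
#include <string>

// Simplified stand-ins for the libhdfs++ Status/handler types.
using Status = int;                           // 0 == OK
using Handler = std::function<void(Status)>;

class FileSystemSketch {
 public:
  // Direct version: one RPC for one explicit path, no globbing or recursion.
  void SetOwner(const std::string &path, const std::string &user,
                const std::string &group, Handler handler) {
    last_call = "SetOwner(" + path + ")";
    handler(0);
  }

  // Globbing/recursive version: expands wildcards and optionally walks
  // subdirectories, applying the direct call to each match.
  void SetAllOwner(const std::string &pattern, const std::string &user,
                   const std::string &group, bool recursive, Handler handler) {
    last_call = "SetAllOwner(" + pattern + ")";
    handler(0);
  }

  std::string last_call;  // for illustration/testing only
};
```

Separating the two keeps the common single-path case fast and makes the expensive glob/recursion semantics opt-in and obvious at the call site.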






[jira] [Commented] (HDFS-10679) libhdfs++: Implement parallel find with wildcards tool

2016-08-23 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432974#comment-15432974
 ] 

Bob Hansen commented on HDFS-10679:
---

[~anatoli.shein]: this looks very useful.  I like demonstrating both sync and 
async find.  Nice.

FS::Find:
* Async - the callback should deliver results with a const std::vector &, not 
a shared_ptr.  This signals to the consumer to use the data delivered during 
the callback, but not to hold on to the passed-in container.
* Likewise, the synchronous call should take a non-const std::vector * 
as an output parameter, signaling to the consumer that we are going to 
mutate their input vector
* We need a very clear threading model.  Will the handler be called 
concurrently from multiple threads?  (Currently, yes.  If we ever get onto 
asio fibers, we should make it a no, because we love our consumers.)
* We're doing a _lot_ of dynamic memory allocation during recursion.  Could we 
restructure things a little to not copy the entirety of the FindState and 
RecursionState on each call?  It appears that they each have one element that 
is being updated for each recursive call
* We need to hold the lock while incrementing the recursion_counter also
* If the handler returns false (don't want more) at the end of the function, do 
we do anything to prevent more from being delivered?  Should we push that into 
the shared find_state and bail out for any subsequent NN responses?
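For that last point, one minimal approach (illustrative; not the actual find code) is a shared cancellation flag that every delivery checks under the lock:

```cpp
#include <cassert>
#include <mutex>

// Illustrative shared cancellation flag for find: once the consumer's handler
// returns false (or an error is seen), subsequent NN responses are dropped
// instead of being delivered to the handler or recursed on.
class FindCancelState {
 public:
  // Check before delivering a batch of results to the consumer.
  bool DeliveryAllowed() {
    std::lock_guard<std::mutex> guard(lock_);
    return !cancelled_;
  }

  // Call when the handler returns false or a request fails.
  void Cancel() {
    std::lock_guard<std::mutex> guard(lock_);
    cancelled_ = true;
  }

 private:
  std::mutex lock_;
  bool cancelled_ = false;
};
```

Putting this in the shared find_state means in-flight GetListing responses that arrive after cancellation can bail out early instead of fanning out more requests.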


find.cpp: 
* Like the cat examples, simplify as much as possible.  Nuke URI parsing, etc.
* Expand smth_found to something_found to prevent confusion (especially in an 
example)
* We have race conditions if one thread is outputting the previous block while 
another thread gets a final block (or error).  

FS::GetFileInfo should populate the full_path member also

> libhdfs++: Implement parallel find with wildcards tool
> --
>
> Key: HDFS-10679
> URL: https://issues.apache.org/jira/browse/HDFS-10679
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10679.HDFS-8707.000.patch, 
> HDFS-10679.HDFS-8707.001.patch, HDFS-10679.HDFS-8707.002.patch, 
> HDFS-10679.HDFS-8707.003.patch, HDFS-10679.HDFS-8707.004.patch, 
> HDFS-10679.HDFS-8707.005.patch, HDFS-10679.HDFS-8707.006.patch, 
> HDFS-10679.HDFS-8707.007.patch, HDFS-10679.HDFS-8707.008.patch, 
> HDFS-10679.HDFS-8707.009.patch, HDFS-10679.HDFS-8707.010.patch, 
> HDFS-10679.HDFS-8707.011.patch, HDFS-10679.HDFS-8707.012.patch, 
> HDFS-10679.HDFS-8707.013.patch
>
>
> The find tool will issue the GetListing namenode operation on a given 
> directory, and filter the results using posix globbing library.
> If the recursive option is selected, for each returned entry that is a 
> directory the tool will issue another asynchronous call GetListing and repeat 
> the result processing in a recursive fashion.
> One implementation issue that needs to be addressed is the way how results 
> are returned back to the user: we can either buffer the results and return 
> them to the user in bulk, or we can return results continuously as they 
> arrive. While buffering would be an easier solution, returning results as 
> they arrive would be more beneficial to the user in terms of performance, 
> since the result processing can start as soon as the first results arrive 
> without any delay. In order to do that we need the user to use a loop to 
> process arriving results, and we need to send a special message back to the 
> user when the search is over.






[jira] [Commented] (HDFS-10754) libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, and hdfs_chmod

2016-08-23 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432872#comment-15432872
 ] 

Bob Hansen commented on HDFS-10754:
---

One more thing...

Should we add a logger to the tools?  If we get a NN fault and issue a warning, 
will it get logged to stderr?  Is that the desired behavior?

> libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, 
> hdfs_chown, and hdfs_chmod
> 
>
> Key: HDFS-10754
> URL: https://issues.apache.org/jira/browse/HDFS-10754
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10754.HDFS-8707.000.patch, 
> HDFS-10754.HDFS-8707.001.patch, HDFS-10754.HDFS-8707.002.patch, 
> HDFS-10754.HDFS-8707.003.patch, HDFS-10754.HDFS-8707.004.patch, 
> HDFS-10754.HDFS-8707.005.patch, HDFS-10754.HDFS-8707.006.patch, 
> HDFS-10754.HDFS-8707.007.patch, HDFS-10754.HDFS-8707.008.patch
>
>







[jira] [Commented] (HDFS-10754) libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, and hdfs_chmod

2016-08-23 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432868#comment-15432868
 ] 

Bob Hansen commented on HDFS-10754:
---

This is lots of good work, [~anatoli.shein].  I have a handful of mostly minor 
things I might propose:

cat:
* Why are we including "google/protobuf/stubs/common.h"?
* Comment says "wrapping fs in unique_ptr", but then we wrap it in a shared_ptr.
* We shouldn't check the port on defaultFS before reporting error.  The 
defaultFS might point to an HA name.  Just check that defaultFS isn't empty, 
and report it if it is.
* For simplicity, cat should just use file->Read rather than PositionRead

gendirs:
* As an example, gendirs should _definitely_ not need to use 
fs/namenode_operations
* It should not need to use common/* either; we should push the configuration 
stuff into hdfspp/ (but perhaps that can be a different bug)
* Again, why include the protobuf header?
* We generally don't make values on the stack const (e.g. path, depth, fanout). 
 It's not wrong, just generally redundant (unless it's important they be const 
for some reason)
* Put a comment on setting the timeout referring to HDFS-10781 (and vice-versa)

configuration_loader.cc:
* Add a comment on why we're calling setDefaultSearchPath in the ctor

configuration_loader.h:
* I might make the comment something along the lines of "Creates a 
configuration loader with the default search path ().  If you want to 
explicitly set the entire search path, call ClearSearchPath() first"
* Filesystem.cc: can SetOwner/SetPermission be written as a call to 
::Find(recursive=true) with the SetOwner/SetPermission implemented in the 
callback?  Then we wouldn't need three separate implementations of the 
recursion logic
* Does recursive SetOwner/SetPermissions accept globs both for the recursive 
and non-recursive versions?  We should be consistent.  Perhaps 
SetOwner(explicit filename, fast) and SetOwner(glob/recursive, slower) should 
be different methods

Tools impl:
* Make a usage() function that prints out the usage of the tool.  Call it if 
"--help", "-h", or a parameter problem occurs.
* Keep gendirs in the examples.  I don't think we need a tool version of it.

Include a comment on HDFS-9539 to fix up the tools and examples as part of the 
scope.

hdfsNewBuilderFromDirectory (in hdfs.cc) should call ClearSearchPath rather 
than inheriting the default.  Are there any other instances in our codebase 
where we're currently constructing loaders whose behavior we need to 
double-check?


> libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, 
> hdfs_chown, and hdfs_chmod
> 
>
> Key: HDFS-10754
> URL: https://issues.apache.org/jira/browse/HDFS-10754
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10754.HDFS-8707.000.patch, 
> HDFS-10754.HDFS-8707.001.patch, HDFS-10754.HDFS-8707.002.patch, 
> HDFS-10754.HDFS-8707.003.patch, HDFS-10754.HDFS-8707.004.patch, 
> HDFS-10754.HDFS-8707.005.patch, HDFS-10754.HDFS-8707.006.patch, 
> HDFS-10754.HDFS-8707.007.patch, HDFS-10754.HDFS-8707.008.patch
>
>







[jira] [Created] (HDFS-10781) libhdfs++: redefine NN timeout to be "time without a response"

2016-08-19 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10781:
-

 Summary: libhdfs++: redefine NN timeout to be "time without a 
response"
 Key: HDFS-10781
 URL: https://issues.apache.org/jira/browse/HDFS-10781
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


In the find tool, we submit a zillion requests to the NameNode asynchronously.  
As the queue on the NameNode grows, the time to respond to each individual 
message will increase.  We were eventually getting timeouts on requests, even 
though the NN was responding as fast as its little feet could carry it.

I propose that we should redefine timeouts to be on a per-connection basis 
rather than per-request.  If a client has an outstanding request to the NN but 
hasn't gotten a response back within n msec, it should declare the connection 
dead and retry.  As long as the NameNode is being responsive to the best of its 
ability and providing data, we will not declare the link dead.

One potential for Failure of Least Astonishment here is that it will mean any 
particular request from a client cannot be depended on to get a positive or 
negative response within a fixed amount of time, but I think that may be a good 
trade to make.
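The proposed semantics can be sketched as a per-connection watchdog (illustrative; not the actual RpcConnection code):

```cpp
#include <cassert>
#include <chrono>

// Illustrative per-connection timeout: the deadline tracks "time since the
// last response on the connection", not the age of any individual request.
class ConnectionWatchdog {
 public:
  using Clock = std::chrono::steady_clock;

  explicit ConnectionWatchdog(std::chrono::milliseconds timeout)
      : timeout_(timeout), last_response_(Clock::now()) {}

  // Any response, even for an unrelated request, proves the NN is alive.
  void OnResponse() { last_response_ = Clock::now(); }

  // Declare the connection dead (and retry) only if nothing at all has
  // arrived within the timeout period.
  bool IsDead(Clock::time_point now) const {
    return now - last_response_ > timeout_;
  }

 private:
  std::chrono::milliseconds timeout_;
  Clock::time_point last_response_;
};
```

With this shape, a heavily loaded but responsive NameNode keeps resetting the deadline, so queued requests never time out merely for being deep in the queue.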






[jira] [Commented] (HDFS-10739) libhdfs++: In RPC engine replace vector with deque for pending requests

2016-08-09 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413929#comment-15413929
 ] 

Bob Hansen commented on HDFS-10739:
---

Looks good.  +1 if it passes tests.

> libhdfs++: In RPC engine replace vector with deque for pending requests
> ---
>
> Key: HDFS-10739
> URL: https://issues.apache.org/jira/browse/HDFS-10739
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10739.HDFS-8707.000.patch
>
>
> Needs to be added in order to improve performance






[jira] [Commented] (HDFS-10672) libhdfs++: reorder directories in src/main/libhdfspp/examples, and add C++ version of cat tool

2016-07-29 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399164#comment-15399164
 ] 

Bob Hansen commented on HDFS-10672:
---

This is a better arrangement than what was in before, [~anatoli.shein].  Thanks.

In the C++ cat:
* The last_bytes_read parameter should be set to sizeof(buf) each time.  That 
makes this a very messy and error-prone API; buf_size and bytes_read should be 
different parameters.  Let's fix it as part of this bug.
* We need to fix ownership issues with the IOService.  Filed HDFS-10704 to 
capture that.
* We should have a no-args FileSystem::New.  Filed HDFS-10705 to capture that.
* We should distinguish between normal end-of-file failures to read and "there 
was an error" status.  What should we expect when we get to EOF?  We should 
document that in the headers.

When the C cat was originally written, we didn't have configuration parsing in 
libhdfs++.  Now that we do, should we get rid of the URI stuff in the cat 
examples and just check that they have a properly configured config directory 
(with a nice message explaining how to set HADOOP_CONF_DIR if they don't)?  I think 
it will make this example much simpler and cleaner, if less useful (let's put 
the useful one in the utility app).


> libhdfs++: reorder directories in src/main/libhdfspp/examples, and add C++ 
> version of cat tool
> --
>
> Key: HDFS-10672
> URL: https://issues.apache.org/jira/browse/HDFS-10672
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10672.HDFS-8707.000.patch, 
> HDFS-10672.HDFS-8707.001.patch
>
>
> src/main/libhdfspp/examples should be structured like 
> examples/language/utility instead of examples/utility/language for easier 
> access by different developers.
> Additionally implementing C++ version of cat tool.






[jira] [Created] (HDFS-10705) libhdfs++: FileSystem should have a convenience no-args ctor

2016-07-29 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10705:
-

 Summary: libhdfs++: FileSystem should have a convenience no-args 
ctor
 Key: HDFS-10705
 URL: https://issues.apache.org/jira/browse/HDFS-10705
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


Our examples demonstrate that the common use case is "use default options, 
default username, and default IOService."  Let's make a FileSystem::New() that 
helps users with that.






[jira] [Created] (HDFS-10704) libhdfs++: FileSystem should not take ownership of IOService

2016-07-29 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10704:
-

 Summary: libhdfs++: FileSystem should not take ownership of 
IOService
 Key: HDFS-10704
 URL: https://issues.apache.org/jira/browse/HDFS-10704
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


The ctor for the filesystem currently takes ownership of the IOService passed 
in.  There is a valid use case for a single IOService for multiple FileSystems 
(e.g. for different users).

If an IOService is passed in, the consumer should be responsible for its 
lifetime.
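The proposed ownership split can be sketched like this (illustrative stand-in types; the real FileSystem would presumably borrow the IOService by pointer or reference):

```cpp
#include <cassert>

// Stand-in for the shared IOService; the consumer owns it.
struct IOServiceSketch {
  int attached_filesystems = 0;
};

// A FileSystem that borrows the IOService instead of taking ownership, so
// one IOService can serve several FileSystems (e.g. for different users).
class BorrowingFileSystem {
 public:
  explicit BorrowingFileSystem(IOServiceSketch *io) : io_(io) {
    ++io_->attached_filesystems;
  }
  ~BorrowingFileSystem() { --io_->attached_filesystems; }

 private:
  IOServiceSketch *io_;  // non-owning: lifetime managed by the caller
};
```

Destroying a FileSystem then detaches it without tearing down the IOService, which keeps running for any siblings still attached.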






[jira] [Commented] (HDFS-10441) libhdfs++: HA namenode support

2016-07-28 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397547#comment-15397547
 ] 

Bob Hansen commented on HDFS-10441:
---

James - thanks for the responses.  

+1

> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, 
> HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, 
> HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, 
> HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, 
> HDFS-10441.HDFS-8707.010.patch, HDFS-10441.HDFS-8707.011.patch, 
> HDFS-10441.HDFS-8707.012.patch, HDFS-10441.HDFS-8707.013.patch, 
> HDFS-10441.HDFS-8707.014.patch, HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.






[jira] [Commented] (HDFS-9271) Implement basic NN operations

2016-07-25 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391922#comment-15391922
 ] 

Bob Hansen commented on HDFS-9271:
--

Thanks, [~anatoli.shein].  That's looking very close now.

test_libhdfs_threaded.c: why did we move doTestGetDefaultBlockSize?

hdfs.cc: hdfsFileIsOpenForWrite and hdfsUnbufferFile should simply return 
false without an error code.  We've implemented the functions, we just haven't 
implemented writing or buffering.  
hdfs.cc: hdfsAvailable should set errno to 0
filesystem.h: Why does mkdirs have uint64_t permissions and SetPermissions has 
int16_t?  Should they both be uint16_t since negative values are invalid?

Not part of your refactoring, but looks like a problem:  
hdfs_shim.c: Shouldn't hdfsSeek/hdfsTell do a seek/tell on both the libhdfs and 
libhdfspp file handles?

Also not part of your refactoring, but necessary to get to "no libhdfs in 
non-write paths":
hdfs_shim.c: hdfsConfXXX should go through the libhdfspp interface

Minor nits:
filesystem.cc: Why can't we set the high bits in GetBlockLocations?  Include 
some comments with your experience/reasoning

Notes to self:
does hdfsUtime mess up the epoch?




> Implement basic NN operations
> -
>
> Key: HDFS-9271
> URL: https://issues.apache.org/jira/browse/HDFS-9271
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Anatoli Shein
> Attachments: HDFS-9271.HDFS-8707.000.patch, 
> HDFS-9271.HDFS-8707.001.patch, HDFS-9271.HDFS-8707.002.patch, 
> HDFS-9271.HDFS-8707.003.patch, HDFS-9271.HDFS-8707.004.patch, 
> HDFS-9271.HDFS-8707.005.patch
>
>
> Expose via C and C++ API:
> * mkdirs
> * rename
> * delete
> * stat
> * chmod
> * chown
> * getListing
> * setOwner






[jira] [Commented] (HDFS-10672) libhdfs++: reorder directories in src/main/libhdfspp/examples, and add C++ version of cat tool

2016-07-25 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391826#comment-15391826
 ] 

Bob Hansen commented on HDFS-10672:
---

Hey!  I found them!  Right where you included them in the patch.  Sorry about 
that.

Let's make both of the executables called "hdfs_cat".  Anybody that wants to 
copy them into the same directory can rename them themselves.

For the cpp version let's use the C++ APIs (defined in hdfspp/hdfspp.h) for 
the config, filesystem, and files, which will let us skip the Builder interface.



> libhdfs++: reorder directories in src/main/libhdfspp/examples, and add C++ 
> version of cat tool
> --
>
> Key: HDFS-10672
> URL: https://issues.apache.org/jira/browse/HDFS-10672
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10672.HDFS-8707.000.patch
>
>
> src/main/libhdfspp/examples should be structured like 
> examples/language/utility instead of examples/utility/language for easier 
> access by different developers.
> Additionally implementing C++ version of cat tool.






[jira] [Created] (HDFS-10686) libhdfs++: implement token authorization

2016-07-25 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10686:
-

 Summary: libhdfs++: implement token authorization
 Key: HDFS-10686
 URL: https://issues.apache.org/jira/browse/HDFS-10686
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


The current libhdfs++ SASL implementation does a kerberos handshake for each 
connection.  HDFS includes support for issuing and using time-limited tokens to 
reduce the load on the kerberos server.






[jira] [Commented] (HDFS-10672) libhdfs++: reorder directories in src/main/libhdfspp/examples, and add C++ version of cat tool

2016-07-25 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391741#comment-15391741
 ] 

Bob Hansen commented on HDFS-10672:
---

Moving around looks good.  It looks like this patch didn't catch the C++ 
implementation.

> libhdfs++: reorder directories in src/main/libhdfspp/examples, and add C++ 
> version of cat tool
> --
>
> Key: HDFS-10672
> URL: https://issues.apache.org/jira/browse/HDFS-10672
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10672.HDFS-8707.000.patch
>
>
> src/main/libhdfspp/examples should be structured like 
> examples/language/utility instead of examples/utility/language for easier 
> access by different developers.
> Additionally implementing C++ version of cat tool.






[jira] [Commented] (HDFS-10441) libhdfs++: HA namenode support

2016-07-25 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391737#comment-15391737
 ] 

Bob Hansen commented on HDFS-10441:
---

A few minor points (not necessarily worth holding up this patch):

Should HandleRpcResponse return a value or just forward-propagate to 
CommsError?  In the implementation, we alternately do one, the other, or both.

RpcConnectionImpl::ConnectAndFlush logs entry at the INFO level.  I think we 
already log once when we're attempting to connect, do we not?  If this is the 
only place we keep it, we should include the endpoint we're connecting to in 
the log message.

> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, 
> HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, 
> HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, 
> HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, 
> HDFS-10441.HDFS-8707.010.patch, HDFS-10441.HDFS-8707.011.patch, 
> HDFS-10441.HDFS-8707.012.patch, HDFS-10441.HDFS-8707.013.patch, 
> HDFS-10441.HDFS-8707.014.patch, HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.






[jira] [Created] (HDFS-10685) libhdfs++: return explicit error when non-secured client connects to secured server

2016-07-25 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10685:
-

 Summary: libhdfs++: return explicit error when non-secured client 
connects to secured server
 Key: HDFS-10685
 URL: https://issues.apache.org/jira/browse/HDFS-10685
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


When a non-secured client tries to connect to a secured server, the first 
indication is an error from RpcConnection::HandleRpcResponse complaining about 
"RPC response with Unknown call id -33".

We should insert code in HandleRpcResponse to detect if the unknown call id == 
RpcEngine::kCallIdSasl and return an informative error that you have an 
unsecured client connecting to a secured server.
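A minimal sketch of the proposed check; the -33 value and the kCallIdSasl name are taken from the text above, while the function itself is hypothetical:

```cpp
#include <cassert>
#include <string>

// Reserved call id the server uses to initiate SASL negotiation, per the
// description above.
constexpr int kCallIdSasl = -33;

// Map an unknown call id to a message the consumer can actually act on.
std::string DescribeUnknownCallId(int call_id) {
  if (call_id == kCallIdSasl)
    return "server expects SASL negotiation: "
           "unsecured client connecting to a secured server";
  return "RPC response with unknown call id " + std::to_string(call_id);
}
```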






[jira] [Commented] (HDFS-10441) libhdfs++: HA namenode support

2016-07-15 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15379522#comment-15379522
 ] 

Bob Hansen commented on HDFS-10441:
---

Just a _few_ more questions:
* In RpcConnectionImpl::OnRecvCompleted, if we detect that we've 
connected to the standby, it falls through to StartReading().  Should it bail 
out at that point?
* In RpcEngine::RpcCommsError, we call 
pendingRequests[i]->IncrementFailoverCount();  should that implicitly reset the 
retry count to 0?  Will we get into cases where it retries until it fails, then 
the retry count is already == max_retry?
* If a namenode is down when we try to resolve, we don't try again when it's 
time to fail over, do we?  We should capture that in another bug

For discussion, not necessarily to fix in this patch:
* In FixedDelayWithFailover::ShouldRetry(), should we fail over on errors 
other than timeout?  Bad route to host?  DNS failure?
* In FixedDelayWithFailover::ShouldRetry(), we're always using a delay if 
retries < 3.  This should be configurable.  We can cover that in another bug





> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, 
> HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, 
> HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, 
> HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, 
> HDFS-10441.HDFS-8707.010.patch, HDFS-10441.HDFS-8707.011.patch, 
> HDFS-10441.HDFS-8707.012.patch, HDFS-10441.HDFS-8707.013.patch, 
> HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.






[jira] [Commented] (HDFS-9271) Implement basic NN operations

2016-07-11 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371081#comment-15371081
 ] 

Bob Hansen commented on HDFS-9271:
--

Thanks for all that hard work, [~anatoli.shein].

A few comments:
* In GetBlockLocations(hdfspp.h, filesystem.cc), use offset_t or uint64_t 
rather than long.  It's less ambiguous.
* In getAbsolutePath (hdfs.cc), how about returning optional(string) rather 
than an empty string on error.  It makes the error state explicit and 
explicitly checked.
* Make a new bug to capture supporting ".." semantics
* It appears the majority of hdfs_ext_test.c has been commented out.  Was this 
intentional, or debugging dirt that slipped in?
* Can we add a test for relative paths for all the functions where we added 
them in?
* Can we implement hdfsMove and/or hdfsTruncateFile with just metadata 
operations?
* Move to libhdfspp implementations in hdfs_shim for GetDefaultBlocksize[AtPath]
* Implement hdfsUnbufferFile as a no-op?
* Do we support single-dot relative paths?  e.g. can I call hdfsGetPathInfo(fs, 
".")?  Do we have tests over that?
* Do we have tests that show that libhdfspp's getReadStatistics match libhdfs's 
getReadStatistics?

Minor little nits:
* For the absolute path, I personally prefer abs_path = getAbsolutePath(...) 
rather than abs_path(getAbsolutePath(...)).  They both compile to the same thing 
(see https://en.wikipedia.org/wiki/Return_value_optimization); I think the 
whitespace with the assignment makes the _what_ and the _content_ separation 
cleaner
* Refactor CheckSystemAndHandle to use CheckHandle 
(https://en.wikipedia.org/wiki/Don't_repeat_yourself)



> Implement basic NN operations
> -
>
> Key: HDFS-9271
> URL: https://issues.apache.org/jira/browse/HDFS-9271
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Anatoli Shein
> Attachments: HDFS-9271.HDFS-8707.000.patch, 
> HDFS-9271.HDFS-8707.001.patch, HDFS-9271.HDFS-8707.002.patch
>
>
> Expose via C and C++ API:
> * mkdirs
> * rename
> * delete
> * stat
> * chmod
> * chown
> * getListing
> * setOwner






[jira] [Created] (HDFS-10607) libhdfs++:hdfs_shim missing some hdfs++ functions

2016-07-11 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10607:
-

 Summary: libhdfs++:hdfs_shim missing some hdfs++ functions
 Key: HDFS-10607
 URL: https://issues.apache.org/jira/browse/HDFS-10607
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


The hdfsConfGetStr, hdfsConfGetInt, hdfsStrFree, hdfsSeek, and hdfsTell 
functions are all calling into the libhdfs implementations, not the libhdfs++ 
implementations.






[jira] [Commented] (HDFS-10441) libhdfs++: HA namenode support

2016-06-30 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357832#comment-15357832
 ] 

Bob Hansen commented on HDFS-10441:
---

Small issues: all of these are small, but should probably be fixed before 
landing
* Do we pass information in the data pointer for the NN failover event?  If so, 
document it in events.h
* Comment in retry_policy.h for FixedDelayWithFailover still references static 
values of 3 and 2
* We should include blocks of comments in retry_policy.h describing the 
behavior in human terms because the backoff behavior is less than obvious
* status.cc: find the right value for kSnapshotException
* filesystem.cc: nn_.Connect() call: get rid of commented-out code
* rpc_connection.cc: HandleRpcResponse should push req back to the head of the 
queue; alternately, don't dequeue it if we got a standby exception.
* If HandleRpcResponse gets a kStandbyException, will CommsError be called 
twice (once in HandleRpcResponse and again in OnRecvComplete)?
* rpc_engine.cc: let's use both namenodes if servers.size() >= 2 rather than 
just bailing out.
* rpc_engine.h: IsCurrentActive/IsCurrentStandby are dangerous as designed: 
they're asking for race conditions because we acquire the lock, check, release 
the lock, then take action.  Just before we take action, someone else could 
change the value.
* rpc_engine.cc: Remove RpcEngine::Start instead of deprecating it.
* Don't forget to file bugs to handle more than two namenodes.


Minor issues: it would be nice to see these fixed, but aren't blockers:
* status.h: is having both is_server_exception_  and exception_class_ redundant?
* hdfs_configuration.c: We have a (faster) split function in uri.cc; let's 
refactor that into a Util method
* HdfsConfiguration::LookupNameService: if the URI parsing failed, we should 
just ignore the URI as malformed, not bail out of the entire function.  There 
may be a well-formed URI in a later value.
* HdfsConfiguration: I'm a little uncomfortable using the URI parser to break 
apart host:port.  If the user enters "foo:bar@baz", it will interpret that as a 
password and silently drop everything before the baz.  Just using split(':') 
and converting the port to int if it exists is solid enough.
* status.cc: I don't think the java exception name should go in the 
(user-visible) output message.  A string describing the error ("Invalid 
Argument") would be nice, though.
* filesystem.cc: why do we call InitRpc before checking if there's an 
io_service_?
* rpc_engine.h: Are ha_persisted_info_ and ha_enabled_ redundant?


> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, 
> HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, 
> HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, 
> HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, 
> HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.






[jira] [Resolved] (HDFS-10574) webhdfs fails with filenames including semicolons

2016-06-27 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen resolved HDFS-10574.
---
Resolution: Invalid

Ah, it appears that my test cluster was running an old version of HDFS.  My 
reproducer also succeeds on trunk.

Thanks, [~yuanbo], for looking into it and setting me straight.  I apologize 
for adding to the noise floor.

http://izquotes.com/quotes-pictures/quote-the-boy-cried-wolf-wolf-and-the-villagers-came-out-to-help-him-aesop-205890.jpg

> webhdfs fails with filenames including semicolons
> -
>
> Key: HDFS-10574
> URL: https://issues.apache.org/jira/browse/HDFS-10574
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.0
>Reporter: Bob Hansen
>
> Via webhdfs or native HDFS, we can create files with semicolons in their 
> names:
> {code}
> bhansen@::1 /tmp$ hdfs dfs -copyFromLocal /tmp/data 
> "webhdfs://localhost:50070/foo;bar"
> bhansen@::1 /tmp$ hadoop fs -ls /
> Found 1 items
> -rw-r--r--   2 bhansen supergroup  9 2016-06-24 12:20 /foo;bar
> {code}
> Attempting to fetch the file via webhdfs fails:
> {code}
> bhansen@::1 /tmp$ curl -L 
> "http://localhost:50070/webhdfs/v1/foo%3Bbar?user.name=bhansen&op=OPEN"
> {"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File
>  does not exist: /foo\n\tat 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)\n\tat
>  
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)\n\tat
>  
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1891)\n\tat
>  
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1832)\n\tat
>  
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1812)\n\tat
>  
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1784)\n\tat
>  
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:542)\n\tat
>  
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:362)\n\tat
>  
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat
>  
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)\n\tat
>  org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)\n\tat 
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)\n\tat 
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)\n\tat 
> java.security.AccessController.doPrivileged(Native Method)\n\tat 
> javax.security.auth.Subject.doAs(Subject.java:422)\n\tat 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)\n\tat
>  org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)\n"}}
> {code}
> It appears (from the attached TCP dump in curl_request.txt) that the 
> namenode's redirect unescapes the semicolon, and the DataNode's HTTP server 
> is splitting the request at the semicolon, and failing to find the file "foo".
> Interesting side notes:
> * In the attached dfs_copyfrom_local_traffic.txt, you can see the 
> copyFromLocal command writing the data to "foo;bar_COPYING_", which is then 
> redirected and just writes to "foo".  The subsequent rename attempts to 
> rename "foo;bar_COPYING_" to "foo;bar", but has the same parsing bug so 
> effectively renames "foo" to "foo;bar".
> Here is the full range of special characters that we initially started with 
> that led to the minimal reproducer above:
> {code}
> hdfs dfs -copyFromLocal /tmp/data webhdfs://localhost:50070/'~`!@#$%^& 
> ()-_=+|<.>]}",\\\[\{\*\?\;'\''data'
> curl -L 
> "http://localhost:50070/webhdfs/v1/%7E%60%21%40%23%24%25%5E%26+%28%29-_%3D%2B%7C%3C.%3E%5D%7D%22%2C%5C%5B%7B*%3F%3B%27data?user.name=bhansen&op=OPEN&offset=0"
> {code}
> Thanks to [~anatoli.shein] for making a concise reproducer.






[jira] [Updated] (HDFS-10575) webhdfs fails with filenames including semicolons

2016-06-24 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10575:
--
Attachment: curl_request.txt
dfs_copyfrom_local_traffic.txt

> webhdfs fails with filenames including semicolons
> -
>
> Key: HDFS-10575
> URL: https://issues.apache.org/jira/browse/HDFS-10575
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.7.0
>Reporter: Bob Hansen
> Attachments: curl_request.txt, dfs_copyfrom_local_traffic.txt
>
>
> Via webhdfs or native HDFS, we can create files with semicolons in their 
> names:
> {code}
> bhansen@::1 /tmp$ hdfs dfs -copyFromLocal /tmp/data 
> "webhdfs://localhost:50070/foo;bar"
> bhansen@::1 /tmp$ hadoop fs -ls /
> Found 1 items
> -rw-r--r--   2 bhansen supergroup  9 2016-06-24 12:20 /foo;bar
> {code}
> Attempting to fetch the file via webhdfs fails:
> {code}
> bhansen@::1 /tmp$ curl -L 
> "http://localhost:50070/webhdfs/v1/foo%3Bbar?user.name=bhansen&op=OPEN"
> {"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File
>  does not exist: /foo\n\tat 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)\n\tat
>  
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)\n\tat
>  
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1891)\n\tat
>  
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1832)\n\tat
>  
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1812)\n\tat
>  
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1784)\n\tat
>  
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:542)\n\tat
>  
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:362)\n\tat
>  
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat
>  
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)\n\tat
>  org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)\n\tat 
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)\n\tat 
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)\n\tat 
> java.security.AccessController.doPrivileged(Native Method)\n\tat 
> javax.security.auth.Subject.doAs(Subject.java:422)\n\tat 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)\n\tat
>  org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)\n"}}
> {code}
> It appears (from the attached TCP dump in curl_request.txt) that the 
> namenode's redirect unescapes the semicolon, and the DataNode's HTTP server 
> is splitting the request at the semicolon, and failing to find the file "foo".
> Interesting side notes:
> * In the attached dfs_copyfrom_local_traffic.txt, you can see the 
> copyFromLocal command writing the data to "foo;bar_COPYING_", which is then 
> redirected and just writes to "foo".  The subsequent rename attempts to 
> rename "foo;bar_COPYING_" to "foo;bar", but has the same parsing bug so 
> effectively renames "foo" to "foo;bar".
> Here is the full range of special characters that we initially started with 
> that led to the minimal reproducer above:
> {code}
> hdfs dfs -copyFromLocal /tmp/data webhdfs://localhost:50070/'~`!@#$%^& 
> ()-_=+|<.>]}",\\\[\{\*\?\;'\''data'
> curl -L 
> "http://localhost:50070/webhdfs/v1/%7E%60%21%40%23%24%25%5E%26+%28%29-_%3D%2B%7C%3C.%3E%5D%7D%22%2C%5C%5B%7B*%3F%3B%27data?user.name=bhansen&op=OPEN&offset=0"
> {code}
> Thanks to [~anatoli.shein] for making a concise reproducer.






[jira] [Created] (HDFS-10575) webhdfs fails with filenames including semicolons

2016-06-24 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10575:
-

 Summary: webhdfs fails with filenames including semicolons
 Key: HDFS-10575
 URL: https://issues.apache.org/jira/browse/HDFS-10575
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 2.7.0
Reporter: Bob Hansen


Via webhdfs or native HDFS, we can create files with semicolons in their names:

{code}
bhansen@::1 /tmp$ hdfs dfs -copyFromLocal /tmp/data 
"webhdfs://localhost:50070/foo;bar"
bhansen@::1 /tmp$ hadoop fs -ls /
Found 1 items
-rw-r--r--   2 bhansen supergroup  9 2016-06-24 12:20 /foo;bar
{code}

Attempting to fetch the file via webhdfs fails:
{code}
bhansen@::1 /tmp$ curl -L 
"http://localhost:50070/webhdfs/v1/foo%3Bbar?user.name=bhansen&op=OPEN"
{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File
 does not exist: /foo\n\tat 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1891)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1832)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1812)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1784)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:542)\n\tat
 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:362)\n\tat
 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat
 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)\n\tat
 org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)\n\tat 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)\n\tat 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)\n\tat 
java.security.AccessController.doPrivileged(Native Method)\n\tat 
javax.security.auth.Subject.doAs(Subject.java:422)\n\tat 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)\n\tat
 org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)\n"}}
{code}

It appears (from the attached TCP dump in curl_request.txt) that the namenode's 
redirect unescapes the semicolon, and the DataNode's HTTP server is splitting 
the request at the semicolon, and failing to find the file "foo".



Interesting side notes:
* In the attached dfs_copyfrom_local_traffic.txt, you can see the copyFromLocal 
command writing the data to "foo;bar_COPYING_", which is then redirected and 
just writes to "foo".  The subsequent rename attempts to rename 
"foo;bar_COPYING_" to "foo;bar", but has the same parsing bug so effectively 
renames "foo" to "foo;bar".

Here is the full range of special characters that we initially started with 
that led to the minimal reproducer above:
{code}
hdfs dfs -copyFromLocal /tmp/data webhdfs://localhost:50070/'~`!@#$%^& 
()-_=+|<.>]}",\\\[\{\*\?\;'\''data'
curl -L 
"http://localhost:50070/webhdfs/v1/%7E%60%21%40%23%24%25%5E%26+%28%29-_%3D%2B%7C%3C.%3E%5D%7D%22%2C%5C%5B%7B*%3F%3B%27data?user.name=bhansen&op=OPEN&offset=0"
{code}

Thanks to [~anatoli.shein] for making a concise reproducer.







[jira] [Created] (HDFS-10574) webhdfs fails with filenames including semicolons

2016-06-24 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10574:
-

 Summary: webhdfs fails with filenames including semicolons
 Key: HDFS-10574
 URL: https://issues.apache.org/jira/browse/HDFS-10574
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 2.7.0
Reporter: Bob Hansen


Via webhdfs or native HDFS, we can create files with semicolons in their names:

{code}
bhansen@::1 /tmp$ hdfs dfs -copyFromLocal /tmp/data 
"webhdfs://localhost:50070/foo;bar"
bhansen@::1 /tmp$ hadoop fs -ls /
Found 1 items
-rw-r--r--   2 bhansen supergroup  9 2016-06-24 12:20 /foo;bar
{code}

Attempting to fetch the file via webhdfs fails:
{code}
bhansen@::1 /tmp$ curl -L 
"http://localhost:50070/webhdfs/v1/foo%3Bbar?user.name=bhansen&op=OPEN"
{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File
 does not exist: /foo\n\tat 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1891)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1832)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1812)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1784)\n\tat
 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:542)\n\tat
 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:362)\n\tat
 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat
 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)\n\tat
 org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)\n\tat 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)\n\tat 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)\n\tat 
java.security.AccessController.doPrivileged(Native Method)\n\tat 
javax.security.auth.Subject.doAs(Subject.java:422)\n\tat 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)\n\tat
 org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)\n"}}
{code}

It appears (from the attached TCP dump in curl_request.txt) that the namenode's 
redirect unescapes the semicolon, and the DataNode's HTTP server is splitting 
the request at the semicolon, and failing to find the file "foo".



Interesting side notes:
* In the attached dfs_copyfrom_local_traffic.txt, you can see the copyFromLocal 
command writing the data to "foo;bar_COPYING_", which is then redirected and 
just writes to "foo".  The subsequent rename attempts to rename 
"foo;bar_COPYING_" to "foo;bar", but has the same parsing bug so effectively 
renames "foo" to "foo;bar".

Here is the full range of special characters that we initially started with 
that led to the minimal reproducer above:
{code}
hdfs dfs -copyFromLocal /tmp/data webhdfs://localhost:50070/'~`!@#$%^& 
()-_=+|<.>]}",\\\[\{\*\?\;'\''data'
curl -L 
"http://localhost:50070/webhdfs/v1/%7E%60%21%40%23%24%25%5E%26+%28%29-_%3D%2B%7C%3C.%3E%5D%7D%22%2C%5C%5B%7B*%3F%3B%27data?user.name=bhansen&op=OPEN&offset=0"
{code}

Thanks to [~anatoli.shein] for making a concise reproducer.







[jira] [Updated] (HDFS-10511) libhdfs++: make error returning mechanism consistent across all hdfs operations

2016-06-21 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10511:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> libhdfs++: make error returning mechanism consistent across all hdfs 
> operations
> ---
>
> Key: HDFS-10511
> URL: https://issues.apache.org/jira/browse/HDFS-10511
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10511.HDFS-8707.000.patch, 
> HDFS-10511.HDFS-8707.000.patch, HDFS-10511.HDFS-8707.001.patch, 
> HDFS-10511.HDFS-8707.002.patch, HDFS-10511.HDFS-8707.003.patch
>
>
> Errno should always be set.
> If function is returning a code on stack, it should be consistent with errno.






[jira] [Updated] (HDFS-10515) libhdfs++: Implement mkdirs, rmdir, rename, and remove

2016-06-21 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10515:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> libhdfs++: Implement mkdirs, rmdir, rename, and remove
> --
>
> Key: HDFS-10515
> URL: https://issues.apache.org/jira/browse/HDFS-10515
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10515.HDFS-8707.000.patch, 
> HDFS-10515.HDFS-8707.001.patch, HDFS-10515.HDFS-8707.002.patch, 
> HDFS-10515.HDFS-8707.003.patch
>
>







[jira] [Updated] (HDFS-10524) libhdfs++: Implement chmod and chown

2016-06-21 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10524:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> libhdfs++: Implement chmod and chown
> 
>
> Key: HDFS-10524
> URL: https://issues.apache.org/jira/browse/HDFS-10524
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10524.HDFS-8707.000.patch, 
> HDFS-10524.HDFS-8707.001.patch, HDFS-10524.HDFS-8707.002.patch
>
>







[jira] [Commented] (HDFS-10511) libhdfs++: make error returning mechanism consistent across all hdfs operations

2016-06-21 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15341525#comment-15341525
 ] 

Bob Hansen commented on HDFS-10511:
---

+1

> libhdfs++: make error returning mechanism consistent across all hdfs 
> operations
> ---
>
> Key: HDFS-10511
> URL: https://issues.apache.org/jira/browse/HDFS-10511
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10511.HDFS-8707.000.patch, 
> HDFS-10511.HDFS-8707.000.patch, HDFS-10511.HDFS-8707.001.patch, 
> HDFS-10511.HDFS-8707.002.patch, HDFS-10511.HDFS-8707.003.patch
>
>
> Errno should always be set.
> If function is returning a code on stack, it should be consistent with errno.






[jira] [Commented] (HDFS-10524) libhdfs++: Implement chmod and chown

2016-06-21 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15341523#comment-15341523
 ] 

Bob Hansen commented on HDFS-10524:
---

+1.  Thanks, [~anatoli.shein]

> libhdfs++: Implement chmod and chown
> 
>
> Key: HDFS-10524
> URL: https://issues.apache.org/jira/browse/HDFS-10524
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10524.HDFS-8707.000.patch, 
> HDFS-10524.HDFS-8707.001.patch, HDFS-10524.HDFS-8707.002.patch
>
>







[jira] [Commented] (HDFS-10515) libhdfs++: Implement mkdirs, rmdir, rename, and remove

2016-06-21 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15341517#comment-15341517
 ] 

Bob Hansen commented on HDFS-10515:
---

+1.  Thanks, [~anatoli.shein]

> libhdfs++: Implement mkdirs, rmdir, rename, and remove
> --
>
> Key: HDFS-10515
> URL: https://issues.apache.org/jira/browse/HDFS-10515
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10515.HDFS-8707.000.patch, 
> HDFS-10515.HDFS-8707.001.patch, HDFS-10515.HDFS-8707.002.patch, 
> HDFS-10515.HDFS-8707.003.patch
>
>







[jira] [Comment Edited] (HDFS-10526) libhdfs++: Add connect timeouts to async_connect calls

2016-06-20 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340260#comment-15340260
 ] 

Bob Hansen edited comment on HDFS-10526 at 6/20/16 7:39 PM:


Committed at 71af40868aa3c2461


was (Author: bobhansen):
Committed at 13f4225ee75e81ac77476

> libhdfs++: Add connect timeouts to async_connect calls
> --
>
> Key: HDFS-10526
> URL: https://issues.apache.org/jira/browse/HDFS-10526
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10526.HDFS-8707.000.patch, 
> HDFS-10526.HDFS-8707.001.patch
>
>







[jira] [Updated] (HDFS-10526) libhdfs++: Add connect timeouts to async_connect calls

2016-06-20 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10526:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed at 13f4225ee75e81ac77476

> libhdfs++: Add connect timeouts to async_connect calls
> --
>
> Key: HDFS-10526
> URL: https://issues.apache.org/jira/browse/HDFS-10526
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10526.HDFS-8707.000.patch, 
> HDFS-10526.HDFS-8707.001.patch
>
>







[jira] [Commented] (HDFS-10511) libhdfs++: make error returning mechanism consistent across all hdfs operations

2016-06-17 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336837#comment-15336837
 ] 

Bob Hansen commented on HDFS-10511:
---

hadoop-hdfs-native-client/src/main/native/libhdfs/include/hdfs/hdfs.h is part 
of libhdfs, and we're fairly committed to not changing it.

The existing text of "nonzero error code otherwise" is sufficient to cover the 
behavior of returning -1 on error.

The functions in hdfs_ext.h are fair game, and your updated comments are 
well-received.



> libhdfs++: make error returning mechanism consistent across all hdfs 
> operations
> ---
>
> Key: HDFS-10511
> URL: https://issues.apache.org/jira/browse/HDFS-10511
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10511.HDFS-8707.000.patch, 
> HDFS-10511.HDFS-8707.000.patch, HDFS-10511.HDFS-8707.001.patch, 
> HDFS-10511.HDFS-8707.002.patch
>
>
> Errno should always be set.
> If function is returning a code on stack, it should be consistent with errno.






[jira] [Commented] (HDFS-10515) libhdfs++: Implement mkdirs, rmdir, rename, and remove

2016-06-17 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336827#comment-15336827
 ] 

Bob Hansen commented on HDFS-10515:
---

In:
{code}
dirList = hdfsListDirectory(fs, listDirTest, &numEntries);
EXPECT_NONNULL(dirList);
if (numEntries != 1) {
    fprintf(stderr, "hdfsListDirectory set numEntries to "
            "%d on directory containing 1 files.", numEntries);
    return EIO;
}
hdfsFreeFileInfo(dirList, numEntries);
{code}
we should free the dirList before returning EIO.  We can put the free up above 
the if (numEntries...) check since we're not looking at the contents, just the 
returned file count.

Can we test privilege failures in TestMkdirs, TestDelete, and TestRename by 
setting privs to 000?



> libhdfs++: Implement mkdirs, rmdir, rename, and remove
> --
>
> Key: HDFS-10515
> URL: https://issues.apache.org/jira/browse/HDFS-10515
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10515.HDFS-8707.000.patch, 
> HDFS-10515.HDFS-8707.001.patch, HDFS-10515.HDFS-8707.002.patch
>
>







[jira] [Commented] (HDFS-10524) libhdfs++: Implement chmod and chown

2016-06-17 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336809#comment-15336809
 ] 

Bob Hansen commented on HDFS-10524:
---

Thanks, [~anatoli.shein].  We're getting pretty close.

The implementations in hdfs_shim.c of hdfsConnect, 
hdfsConnectAsUserNewInstance, and hdfsConnectNewInstance are returning invalid 
pointers.  For the shim methods, we should be returning a shim hdfsFs_internal 
pointer, and your implementations of those are returning a libhdfs++ hdfsFS 
pointer.

hdfsFreeBuilder should free the bld passed-in pointer when it is finished.

hdfsDisconnect should try to disconnect the libhdfspp instance even if the 
libhdfs instance failed to disconnect.  It should only return 0 if both 
returned 0.




> libhdfs++: Implement chmod and chown
> 
>
> Key: HDFS-10524
> URL: https://issues.apache.org/jira/browse/HDFS-10524
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10524.HDFS-8707.000.patch, 
> HDFS-10524.HDFS-8707.001.patch
>
>







[jira] [Updated] (HDFS-10526) libhdfs++: Add connect timeouts to async_connect calls

2016-06-15 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10526:
--
Attachment: HDFS-10526.HDFS-8707.001.patch

> libhdfs++: Add connect timeouts to async_connect calls
> --
>
> Key: HDFS-10526
> URL: https://issues.apache.org/jira/browse/HDFS-10526
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10526.HDFS-8707.000.patch, 
> HDFS-10526.HDFS-8707.001.patch
>
>







[jira] [Updated] (HDFS-10526) libhdfs++: Add connect timeouts to async_connect calls

2016-06-15 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10526:
--
Attachment: HDFS-10526.HDFS-8707.000.patch

Adds a timer in parallel with RpcConnection async_connect calls.

Needed to carefully manage state because we'll now get two calls to 
ConnectionComplete - one from TCP and one from the timeout.  The first to 
trigger should always cancel the second, but it will still call in with a "Hey! 
 I'm cancelled" error.

We mitigate that by changing either the current_endpoint_ or the connection_ member 
variable when ConnectionComplete is called.  On entry, we grab the lock and 
check that both member variables are what we expect.
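A minimal model of that lock-and-check pattern (the {{ConnectGuard}} class and its generation token are illustrative; the real code compares the current_endpoint_ and connection_ members rather than a counter):

```cpp
#include <cassert>
#include <mutex>

// Two callbacks (the TCP connect and the timeout timer) race to call
// ConnectionComplete.  The first to run bumps the state token under the
// lock; the late arrival sees the mismatch and bails out instead of
// acting on stale state.
class ConnectGuard {
 public:
  // Returns true for the winning callback, false for the stale one.
  bool ConnectionComplete(int expected_generation) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (expected_generation != generation_)
      return false;  // the "Hey! I'm cancelled" path: state changed under us
    ++generation_;   // invalidate the other pending callback
    return true;
  }
  int CurrentGeneration() {
    std::lock_guard<std::mutex> lock(mutex_);
    return generation_;
  }
 private:
  std::mutex mutex_;
  int generation_ = 0;
};
```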

> libhdfs++: Add connect timeouts to async_connect calls
> --
>
> Key: HDFS-10526
> URL: https://issues.apache.org/jira/browse/HDFS-10526
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-10526.HDFS-8707.000.patch
>
>







[jira] [Updated] (HDFS-10526) libhdfs++: Add connect timeouts to async_connect calls

2016-06-15 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10526:
--
Status: Patch Available  (was: Open)

> libhdfs++: Add connect timeouts to async_connect calls
> --
>
> Key: HDFS-10526
> URL: https://issues.apache.org/jira/browse/HDFS-10526
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
>







[jira] [Commented] (HDFS-10524) libhdfs++: Implement chmod and chown

2016-06-15 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331888#comment-15331888
 ] 

Bob Hansen commented on HDFS-10524:
---

[~James C]: Good catch on the future/tuple thing.

I would respectfully disagree about having redundant parameter checking.  If 
it's not a performance or maintenance burden, defence in depth against future 
stupidity is a Good Thing.

[~anatoli.shein] - Are permissions of 01777 valid for Hadoop?  While it's valid 
in POSIX-land, I think in Hadoop-land, the maximum value is 0777.

In NameNodeOperations::SetPermission, whenever we're returning an error that a 
value is invalid, it is very helpful to the consumer to include the value that 
made it into the code.  Frequently, the value has been mangled somewhere in the 
consumer or library code, and it can be a big help in debugging.  In this case, 
when we return that the permissions are out of range, we should include the 
(octal) value of the permissions.  A stringstream would be helpful in 
constructing the error message.
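A sketch of the suggested stringstream construction; the helper name and exact message wording are illustrative:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Illustrative helper: build an error message that echoes the rejected
// permissions value back in octal, so the consumer can see exactly what
// made it into the library.  Hadoop permissions top out at 0777.
std::string InvalidPermissionsMessage(unsigned short permissions) {
  std::stringstream ss;
  ss << "SetPermission: permissions 0" << std::oct << permissions
     << " out of range [0, 0777]";
  return ss.str();
}
```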


> libhdfs++: Implement chmod and chown
> 
>
> Key: HDFS-10524
> URL: https://issues.apache.org/jira/browse/HDFS-10524
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10524.HDFS-8707.000.patch
>
>







[jira] [Commented] (HDFS-10515) libhdfs++: Implement mkdirs, rmdir, rename, and remove

2016-06-15 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331872#comment-15331872
 ] 

Bob Hansen commented on HDFS-10515:
---

As James rightly pointed out in HDFS-10524, rather than having 
std::future<std::tuple<Status>> for the FileSystemImpl blocking shims, just 
make them a std::future<Status>.

Same applies for the Snapshot functions.
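The suggested shape can be modeled like this; {{MkdirsAsync}}, {{Mkdirs}}, and the stub {{Status}} are simplified stand-ins for the real hdfs::Status-based API:

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <memory>
#include <string>

// Stand-in for hdfs::Status.
struct Status {
  bool ok;
  static Status OK() { return Status{true}; }
};

// Pretend async operation; the real one posts to the io_service and
// invokes the handler later.  Here it invokes it immediately.
void MkdirsAsync(const std::string& path,
                 const std::function<void(const Status&)>& handler) {
  (void)path;
  handler(Status::OK());
}

// Blocking shim built directly on std::future<Status> - no tuple wrapper.
Status Mkdirs(const std::string& path) {
  auto promise = std::make_shared<std::promise<Status>>();
  std::future<Status> future = promise->get_future();
  MkdirsAsync(path, [promise](const Status& s) { promise->set_value(s); });
  return future.get();
}
```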

> libhdfs++: Implement mkdirs, rmdir, rename, and remove
> --
>
> Key: HDFS-10515
> URL: https://issues.apache.org/jira/browse/HDFS-10515
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10515.HDFS-8707.000.patch, 
> HDFS-10515.HDFS-8707.001.patch
>
>







[jira] [Commented] (HDFS-10515) libhdfs++: Implement mkdirs, rmdir, rename, and remove

2016-06-15 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331865#comment-15331865
 ] 

Bob Hansen commented on HDFS-10515:
---

Great work, [~anatoli.shein].  Thanks for the contribution.

For FileSystem::Delete, it is more efficient to pass a bool than a const bool &.

When passing on errors, it is valuable to provide context of where the error 
came from.  For example, in FileSystemImpl::Rename:
   handler(Status::InvalidArgument("Argument 'oldPath' cannot be empty"));
could better be
   handler(Status::InvalidArgument("rename: argument 'oldPath' cannot be empty"));

In NameNodeOperations::mkdirs and NameNodeOperations::delete, if the return 
message has an error, we throw away the error state and create a new 
PathNotFound Status.  Would it be better to just pass the returned error Status 
on to the handler?  You include a comment for Rename that the returned error is 
not informative; if the same is true for the others, include a comment there.  

If there's a permissions issue, do we get a good error message back to that 
effect?  If so, we should pass it back.

For the error message in NameNodeOperations::rename, perhaps "oldPath and 
parent directory of newPath must exist. newPath must not exist."

We missed it in the previous reviews, but since you're touching it here... for 
handlers in methods like AllowSnapshot where we're not doing any translation, 
we can pass the consumer handler directly into the NameNodeOps class instead of 
making a lambda that just calls into the consumer handler.

HdfsExtTest::TestMkDirs - does the Java side return OK if you try to 
createDirectory on a directory that already exists?

In HdfsExtTest::TestRename, don't forget to free the results of 
hdfsListDirectory.


Minor nit: we don't typically add the "const" flag to parameters passed by 
value (as in FileSystemImpl::mkdirs).



> libhdfs++: Implement mkdirs, rmdir, rename, and remove
> --
>
> Key: HDFS-10515
> URL: https://issues.apache.org/jira/browse/HDFS-10515
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10515.HDFS-8707.000.patch, 
> HDFS-10515.HDFS-8707.001.patch
>
>







[jira] [Commented] (HDFS-10511) libhdfs++: make error returning mechanism consistent across all hdfs operations

2016-06-14 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330598#comment-15330598
 ] 

Bob Hansen commented on HDFS-10511:
---

Thanks, [~anatoli.shein].

A few more little places we should touch for consistency:

Let's have hdfsGetLastError return -1 or 0 also.
The hdfs*Builder* and hdfs*Conf* functions should set errno to 0 on entry.
hdfsBuilderConfGetInt should set errno to 0 on success and set errno and return 
-1 on failure.
hdfs*Logging* should set errno to 0 on success and set errno and return -1 on 
failure.




> libhdfs++: make error returning mechanism consistent across all hdfs 
> operations
> ---
>
> Key: HDFS-10511
> URL: https://issues.apache.org/jira/browse/HDFS-10511
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10511.HDFS-8707.000.patch, 
> HDFS-10511.HDFS-8707.000.patch
>
>
> Errno should always be set.
> If a function returns a code on the stack, it should be consistent with errno.






[jira] [Created] (HDFS-10526) libhdfs++: Add connect timeouts to async_connect calls

2016-06-14 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10526:
-

 Summary: libhdfs++: Add connect timeouts to async_connect calls
 Key: HDFS-10526
 URL: https://issues.apache.org/jira/browse/HDFS-10526
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen
Assignee: Bob Hansen









[jira] [Updated] (HDFS-10511) libhdfs++: make error returning mechanism consistent across all hdfs operations

2016-06-10 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10511:
--
Assignee: Anatoli Shein  (was: Bob Hansen)

> libhdfs++: make error returning mechanism consistent across all hdfs 
> operations
> ---
>
> Key: HDFS-10511
> URL: https://issues.apache.org/jira/browse/HDFS-10511
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10511.HDFS-8707.000.patch, 
> HDFS-10511.HDFS-8707.000.patch
>
>
> Errno should always be set.
> If a function returns a code on the stack, it should be consistent with errno.






[jira] [Updated] (HDFS-10511) libhdfs++: make error returning mechanism consistent across all hdfs operations

2016-06-10 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10511:
--
Attachment: HDFS-10511.HDFS-8707.000.patch

> libhdfs++: make error returning mechanism consistent across all hdfs 
> operations
> ---
>
> Key: HDFS-10511
> URL: https://issues.apache.org/jira/browse/HDFS-10511
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Bob Hansen
> Attachments: HDFS-10511.HDFS-8707.000.patch, 
> HDFS-10511.HDFS-8707.000.patch
>
>
> Errno should always be set.
> If a function returns a code on the stack, it should be consistent with errno.






[jira] [Assigned] (HDFS-10511) libhdfs++: make error returning mechanism consistent across all hdfs operations

2016-06-10 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen reassigned HDFS-10511:
-

Assignee: Bob Hansen  (was: Anatoli Shein)

> libhdfs++: make error returning mechanism consistent across all hdfs 
> operations
> ---
>
> Key: HDFS-10511
> URL: https://issues.apache.org/jira/browse/HDFS-10511
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Bob Hansen
> Attachments: HDFS-10511.HDFS-8707.000.patch
>
>
> Errno should always be set.
> If a function returns a code on the stack, it should be consistent with errno.






[jira] [Updated] (HDFS-10494) libhdfs++: Implement snapshot operations and GetFsStats

2016-06-10 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10494:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> libhdfs++: Implement snapshot operations and GetFsStats
> ---
>
> Key: HDFS-10494
> URL: https://issues.apache.org/jira/browse/HDFS-10494
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10494.HDFS-8707.000.patch, 
> HDFS-10494.HDFS-8707.001.patch, HDFS-10494.HDFS-8707.002.patch, 
> HDFS-10494.HDFS-8707.003.patch, HDFS-10494.HDFS-8707.004.patch
>
>







[jira] [Commented] (HDFS-10494) libhdfs++: Implement snapshot operations and GetFsStats

2016-06-10 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15324576#comment-15324576
 ] 

Bob Hansen commented on HDFS-10494:
---

+1

> libhdfs++: Implement snapshot operations and GetFsStats
> ---
>
> Key: HDFS-10494
> URL: https://issues.apache.org/jira/browse/HDFS-10494
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10494.HDFS-8707.000.patch, 
> HDFS-10494.HDFS-8707.001.patch, HDFS-10494.HDFS-8707.002.patch, 
> HDFS-10494.HDFS-8707.003.patch, HDFS-10494.HDFS-8707.004.patch
>
>







[jira] [Commented] (HDFS-10494) libhdfs++: Implement snapshot operations and GetFsStats

2016-06-09 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323382#comment-15323382
 ] 

Bob Hansen commented on HDFS-10494:
---

[~anatoli.shein] - I mis-spoke (mis-typed) in my previous comment, and for that 
I apologize.  I _meant_ to ask you to put the comments in the FileSystem 
interface in hdfspp.h, not the FileSystemImpl header in filesystem.h.

If you can move those comments, I'll +1 and land the code.

> libhdfs++: Implement snapshot operations and GetFsStats
> ---
>
> Key: HDFS-10494
> URL: https://issues.apache.org/jira/browse/HDFS-10494
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10494.HDFS-8707.000.patch, 
> HDFS-10494.HDFS-8707.001.patch, HDFS-10494.HDFS-8707.002.patch
>
>







[jira] [Commented] (HDFS-10494) libhdfs++: Implement snapshot operations and GetFsStats

2016-06-09 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323090#comment-15323090
 ] 

Bob Hansen commented on HDFS-10494:
---

Also, can we check the validity of inputs (path is not empty, etc.) in the 
FileSystemImpl public functions?

> libhdfs++: Implement snapshot operations and GetFsStats
> ---
>
> Key: HDFS-10494
> URL: https://issues.apache.org/jira/browse/HDFS-10494
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10494.HDFS-8707.000.patch, 
> HDFS-10494.HDFS-8707.001.patch
>
>







[jira] [Commented] (HDFS-10494) libhdfs++: Implement snapshot operations and GetFsStats

2016-06-09 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323084#comment-15323084
 ] 

Bob Hansen commented on HDFS-10494:
---

[~anatoli.shein]: thanks for the updates.

For compatibility with libhdfs, our functions should return -1 on error (where 
appropriate) and set/clear errno for details of the error.

Can you add comments on usage/params in filesystem.h, please?

Other than those, it looks good.

> libhdfs++: Implement snapshot operations and GetFsStats
> ---
>
> Key: HDFS-10494
> URL: https://issues.apache.org/jira/browse/HDFS-10494
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10494.HDFS-8707.000.patch, 
> HDFS-10494.HDFS-8707.001.patch
>
>







[jira] [Commented] (HDFS-10511) libhdfs++: make error returning mechanism consistent across all hdfs operations

2016-06-09 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322696#comment-15322696
 ] 

Bob Hansen commented on HDFS-10511:
---

Notably, for the C-API functions defined in hdfs.h and hdfs_ext.h, we should 
adopt the standard that errno is always set to 0 on success and non-0 on 
failure.

For functions that return an error code (as opposed to returning a value like 
hdfsSeek or a structure like hdfsStatAll), they should always return errno.
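That convention can be sketched as follows; {{hypotheticalOp}} is an illustrative function, not part of the real C API:

```cpp
#include <cassert>
#include <cerrno>

// Sketch of the convention described above for error-code-returning
// C-API functions: errno is set to 0 on success; on failure errno is
// set non-0 and the return value matches errno.
// hypotheticalOp is illustrative, not a real libhdfs(++) call.
int hypotheticalOp(bool fail) {
  errno = 0;
  if (fail) {
    errno = EINVAL;  // record the specific error
    return errno;    // returned code is consistent with errno
  }
  return 0;          // success: errno stays 0
}
```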

> libhdfs++: make error returning mechanism consistent across all hdfs 
> operations
> ---
>
> Key: HDFS-10511
> URL: https://issues.apache.org/jira/browse/HDFS-10511
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>
> Errno should always be set.
> If a function returns a code on the stack, it should be consistent with errno.






[jira] [Updated] (HDFS-10441) libhdfs++: HA namenode support

2016-06-08 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10441:
--
Assignee: Bob Hansen  (was: James Clampffer)

> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Bob Hansen
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, 
> HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.






[jira] [Updated] (HDFS-10441) libhdfs++: HA namenode support

2016-06-08 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10441:
--
Attachment: HDFS-10441.HDFS-8707.003.patch

> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Bob Hansen
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, 
> HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.






[jira] [Updated] (HDFS-10441) libhdfs++: HA namenode support

2016-06-08 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10441:
--
Attachment: HDFS-10441.HDFS-8707.002.patch

> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Bob Hansen
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.






[jira] [Updated] (HDFS-10441) libhdfs++: HA namenode support

2016-06-08 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10441:
--
Assignee: James Clampffer  (was: Bob Hansen)

> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.






[jira] [Assigned] (HDFS-10441) libhdfs++: HA namenode support

2016-06-08 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen reassigned HDFS-10441:
-

Assignee: Bob Hansen  (was: James Clampffer)

> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Bob Hansen
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.






[jira] [Resolved] (HDFS-10507) libhdfs++: if HA is available, authentication doesn't get parsed in configs

2016-06-08 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen resolved HDFS-10507.
---
Resolution: Fixed

> libhdfs++: if HA is available, authentication doesn't get parsed in configs
> ---
>
> Key: HDFS-10507
> URL: https://issues.apache.org/jira/browse/HDFS-10507
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
>







[jira] [Created] (HDFS-10507) libhdfs++: if HA is available, authentication doesn't get parsed in configs

2016-06-08 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10507:
-

 Summary: libhdfs++: if HA is available, authentication doesn't get 
parsed in configs
 Key: HDFS-10507
 URL: https://issues.apache.org/jira/browse/HDFS-10507
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen
Assignee: Bob Hansen









[jira] [Commented] (HDFS-10494) libhdfs++: Implement snapshot operations

2016-06-08 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321426#comment-15321426
 ] 

Bob Hansen commented on HDFS-10494:
---

This patch also includes all of HDFS-10491; we'll need to calculate the diffs 
from there once it lands.

* Let's document the expected return values / errno state for the new 
hdfs_ext.h functions.  
* Also document that name can be null for hdfsCreateSnapshot (that is 
counter-intuitive), and that it cannot be null for deleteSnapshot.  
* Check that name and path are non-null where it is required and return an 
error code rather than segfaulting.
* GetRandomClientID should be returning a 16-byte version 4 UUID.  See 
https://github.com/c9n/hadoop/blob/master/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ClientId.java
 and 
https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_.28random.29
 for more info.  Be sure to randomly generate 16 bytes worth of data, then set 
the variant and version bits.
* Not specific to this code, but we should be ensuring we're connected when FS 
methods are called.  We can either transparently connect or complain if we're 
not.  We should capture this in another Jira.
* New FS methods should check their inputs, especially for empty strings
* Should new hdfs_ext functions set errno if there's a failure?
* Do we need to define hdfsGetCapacity -> libhdfspp_hdfsGetCapacity in wrapper 
defs?
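
The version 4 UUID point above can be sketched like this (the function name and RNG choice are illustrative; production code would want a properly seeded generator):

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <random>

// Sketch of a version 4 (random) client ID: fill all 16 bytes with
// random data, then overwrite the version nibble (byte 6, high nibble
// = 4) and the variant bits (byte 8, top two bits = 10).
std::array<uint8_t, 16> GetRandomClientId() {
  std::array<uint8_t, 16> id;
  std::random_device rd;
  std::mt19937 gen(rd());
  std::uniform_int_distribution<int> dist(0, 255);
  for (auto& b : id) b = static_cast<uint8_t>(dist(gen));
  id[6] = (id[6] & 0x0F) | 0x40;  // version 4
  id[8] = (id[8] & 0x3F) | 0x80;  // RFC 4122 variant
  return id;
}
```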




Minor point:
* The extra lines for the sub-constructor for the nn_ variable in the 
FileSystemImpl ctor should be indented a couple of extra spaces.

> libhdfs++: Implement snapshot operations
> 
>
> Key: HDFS-10494
> URL: https://issues.apache.org/jira/browse/HDFS-10494
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10494.HDFS-8707.000.patch
>
>







[jira] [Updated] (HDFS-10454) libhdfspp: Move NameNodeOp to a separate file

2016-06-06 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-10454:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed with ed43f07ad921efcec743c9276ab8da11c4bddf77

> libhdfspp: Move NameNodeOp to a separate file
> -
>
> Key: HDFS-10454
> URL: https://issues.apache.org/jira/browse/HDFS-10454
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
>Priority: Minor
> Attachments: HDFS-10454.HDFS-8707.000.patch, 
> HDFS-10454.HDFS-8707.001.patch
>
>







[jira] [Created] (HDFS-10487) libhdfs++: support fs.default.name

2016-06-03 Thread Bob Hansen (JIRA)
Bob Hansen created HDFS-10487:
-

 Summary: libhdfs++: support fs.default.name
 Key: HDFS-10487
 URL: https://issues.apache.org/jira/browse/HDFS-10487
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Bob Hansen


Libhdfs++ connects to the value specified by fs.defaultFS in the configuration 
file.

Some older configurations may be using fs.default.name instead.  Either should 
be accepted.





