[jira] [Commented] (HADOOP-15016) Cost-Based RPC FairCallQueue with Reservation support
[ https://issues.apache.org/jira/browse/HADOOP-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673358#comment-16673358 ] Wei Yan commented on HADOOP-15016: -- [~xkrogen], feel free to go ahead. cc [~csun]

> Cost-Based RPC FairCallQueue with Reservation support
> -----------------------------------------------------
>
> Key: HADOOP-15016
> URL: https://issues.apache.org/jira/browse/HADOOP-15016
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Wei Yan
> Assignee: Wei Yan
> Priority: Major
> Attachments: Adding reservation support to NameNode RPC resource.pdf, Adding reservation support to NameNode RPC resource_v2.pdf, HADOOP-15016_poc.patch
>
> FairCallQueue was introduced to provide RPC resource fairness among different users. In the current implementation, each user is weighted equally, and the processing priority of an RPC call is based on how many requests that user has sent recently. This works well when the cluster is shared among several end-users.
> However, it has limitations when a cluster is shared between end-users and service jobs, such as ETL jobs that run under a service account and need to issue many RPC calls. When the NameNode becomes busy, these jobs are easily backed off and deprioritized. We cannot simply treat such a job as a "bad" user who randomly issues too many calls, since its calls are normal calls. It is also unfair to weight an end-user and a heavy service user equally when allocating RPC resources.
> One idea is to introduce reservation support for RPC resources. That is, for some services, we reserve a portion of the RPC resources for their calls. This is very similar to how YARN manages CPU/memory resources among resource queues. In a little more detail: along with the existing FairCallQueue setup (e.g., 4 queues with different priorities), we would add additional special queues, one for each special service user. For each special service user, we provide a guaranteed RPC share (e.g., 10%, which can be aligned with its YARN resource share), and this percentage can be converted to a weight used in WeightedRoundRobinMultiplexer. A quick example: we have 4 default queues with default weights (8, 4, 2, 1), and two special service users (user1 with a 10% share, and user2 with a 15% share). We end up with 6 queues: the 4 default queues (with weights 8, 4, 2, 1) and 2 special queues (user1Queue weighted 15*10%/75%=2, and user2Queue weighted 15*15%/75%=3).
> New incoming RPC calls from special service users are put directly into the corresponding reserved queue; all other calls follow the current implementation.
> By default, there is no special user and all RPC requests follow the existing FairCallQueue implementation.
> Would like to hear more comments on this approach, and to learn of any better solutions. Will put up a detailed design once we get some early comments.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
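The weight arithmetic in the example above can be sketched as follows. This is a toy illustration, not Hadoop code; the class and method names are assumptions, and the only fixed inputs are the default weights (8, 4, 2, 1) and the guaranteed shares from the description.

```java
public class ReservedWeights {
    // Sum of the default FairCallQueue weights (8 + 4 + 2 + 1 = 15),
    // which together represent the unreserved share of RPC capacity.
    static final int DEFAULT_WEIGHT_SUM = 8 + 4 + 2 + 1;

    // Convert a service user's guaranteed share (e.g. 0.10 for 10%)
    // into a WeightedRoundRobinMultiplexer-style weight, given the
    // share left for the default queues (here 1 - 0.10 - 0.15 = 0.75).
    static int computeReservedWeight(double guaranteedShare, double defaultShare) {
        return (int) Math.round(DEFAULT_WEIGHT_SUM * guaranteedShare / defaultShare);
    }

    public static void main(String[] args) {
        double defaultShare = 1.0 - 0.10 - 0.15; // 75% left for default queues
        System.out.println(computeReservedWeight(0.10, defaultShare)); // user1Queue
        System.out.println(computeReservedWeight(0.15, defaultShare)); // user2Queue
    }
}
```

With these inputs the sketch reproduces the weights from the description: 15*10%/75% = 2 for user1Queue and 15*15%/75% = 3 for user2Queue.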
[jira] [Commented] (HADOOP-13144) Enhancing IPC client throughput via multiple connections per user
[ https://issues.apache.org/jira/browse/HADOOP-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391984#comment-16391984 ] Wei Yan commented on HADOOP-13144: -- +1 from my side. RPC.java is not an actively updated part of the code. Maybe [~cnauroth] or [~steve_l] can help take a look?

> Enhancing IPC client throughput via multiple connections per user
> -----------------------------------------------------------------
>
> Key: HADOOP-13144
> URL: https://issues.apache.org/jira/browse/HADOOP-13144
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Jason Kace
> Assignee: Íñigo Goiri
> Priority: Minor
> Attachments: HADOOP-13144.000.patch, HADOOP-13144.001.patch
>
> The generic IPC client ({{org.apache.hadoop.ipc.Client}}) utilizes a single connection thread for each {{ConnectionId}}. The {{ConnectionId}} is unique to the connection's remote address, ticket, and protocol. Each ConnectionId is mapped 1:1 to a connection thread by the client via a map cache.
> The result is that all IPC read/write activity is serialized through a single thread for each user/ticket + address. If a single user makes repeated calls (1k-100k/sec) to the same destination, the IPC client becomes a bottleneck.
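The core idea of this issue can be sketched as a small round-robin pool per connection key. This is a self-contained toy, not the actual patch (which changes `org.apache.hadoop.ipc.Client` internals); the class name and the string stand-ins for real connections are assumptions for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy sketch: instead of mapping each ConnectionId 1:1 to a single
// connection thread, hold a small pool per key and spread calls
// across it round-robin.
public class ConnectionPool {
    private final String[] connections;            // stand-ins for real sockets
    private final AtomicInteger next = new AtomicInteger();

    public ConnectionPool(String connectionId, int size) {
        connections = new String[size];
        for (int i = 0; i < size; i++) {
            connections[i] = connectionId + "#" + i;
        }
    }

    // Pick the next connection; with size == 1 this degenerates to the
    // current single-thread-per-ConnectionId behavior.
    public String getConnection() {
        int i = Math.floorMod(next.getAndIncrement(), connections.length);
        return connections[i];
    }
}
```

With a pool size above 1, concurrent calls from one user/ticket + address are no longer serialized through one thread, which is the bottleneck the description identifies.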
[jira] [Commented] (HADOOP-13144) Enhancing IPC client throughput via multiple connections per user
[ https://issues.apache.org/jira/browse/HADOOP-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391491#comment-16391491 ] Wei Yan commented on HADOOP-13144: -- Thanks for the patch, [~elgoiri]. I tried it yesterday and it worked well. The Router RPC throughput is largely improved, and RPC handlers are no longer blocked on the connection itself. BTW, it also needs new function implementations in the classes ProtobufRpcEngine and TestRPC.StoppedRpcEngine.
[jira] [Commented] (HADOOP-15016) Cost-Based RPC FairCallQueue with Reservation support
[ https://issues.apache.org/jira/browse/HADOOP-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347132#comment-16347132 ] Wei Yan commented on HADOOP-15016: -- Sorry, [~xyao], I missed your previous comment...

{quote}1. This can be a useful feature for a multi-tenant Hadoop cluster. The cost estimates for different RPC calls can be difficult. Instead of hardcoding a fixed value per RPC, I would suggest making it a pluggable interface so that we can customize it for different deployments.{quote}
Agree. This cost calculation will be pluggable.

{quote}2. The reserved share of the call queue looks good. It is similar to what we proposed in HADOOP-13128. How do we plan to handle the case when the reserved queue is full? Blocking or backoff?{quote}
Currently I'm thinking about backoff, the same behavior as how the existing queues handle being full.

{quote}3. The feature might need many manual configurations and tuning to work for a specific deployment and workload. Do you want to add a section to discuss configurations, CLI tools, etc. to make this easier to use?{quote}
Yes. I'm looking for a mathematical model to calculate the cost of different RPC calls based on historical access patterns, which could serve as a suggestion for users. We may also need to build a similar simulation tool to replay the historical RPC log and verify different configurations.

{quote}4. It would be great if you could share some of the results achieved with the POC patch (e.g., RPC/second, average locking, process and queue time with/without the patch).{quote}
I'm busy with some other projects; will put up some results around next month.
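The pluggable cost interface agreed on above could look something like the following. The interface name, method signature, and the per-call weights are all assumptions for illustration (the thread itself only says reads, writes, and large listStatus calls would get different weights); this is not the actual Hadoop API.

```java
// Hypothetical pluggable cost interface: deployments supply their own
// implementation instead of a hardcoded fixed value per RPC.
public interface RpcCostProvider {
    int costOf(String methodName);
}

// One possible default, following the rough idea from this thread:
// weight calls by type rather than tracking per-call lock time.
class SimpleCostProvider implements RpcCostProvider {
    @Override
    public int costOf(String methodName) {
        switch (methodName) {
            case "listStatus": return 10; // large directory scans are expensive
            case "mkdirs":
            case "delete":
            case "rename":     return 5;  // writes take the namesystem write lock
            default:           return 1;  // cheap reads (getFileInfo, ...)
        }
    }
}
```

A scheduler could then charge each user `costOf(call)` units instead of 1 per call, so a user issuing heavy listStatus calls is deprioritized faster than one issuing cheap reads.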
[jira] [Updated] (HADOOP-15016) Cost-Based RPC FairCallQueue with Reservation support
[ https://issues.apache.org/jira/browse/HADOOP-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-15016: Summary: Cost-Based RPC FairCallQueue with Reservation support (was: Add reservation support to RPC FairCallQueue)
[jira] [Updated] (HADOOP-15016) Add reservation support to RPC FairCallQueue
[ https://issues.apache.org/jira/browse/HADOOP-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-15016: Attachment: Adding reservation support to NameNode RPC resource_v2.pdf

[~xyao] [~jnp] I put up a new design doc that covers both the reservation support and the cost-based features for FairCallQueue. Could you provide some feedback?
[jira] [Updated] (HADOOP-15016) Add reservation support to RPC FairCallQueue
[ https://issues.apache.org/jira/browse/HADOOP-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-15016: Attachment: Adding reservation support to NameNode RPC resource.pdf, HADOOP-15016_poc.patch

Attaching a design doc and a PoC patch. Will split into sub-tasks once I get more comments.
[jira] [Commented] (HADOOP-15016) Add reservation support to RPC FairCallQueue
[ https://issues.apache.org/jira/browse/HADOOP-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238144#comment-16238144 ] Wei Yan commented on HADOOP-15016: -- Thanks for the comments, [~xyao].

{quote}Have you looked into the RPC CallerID (HDFS-9184) that is designed to trace callers under different services (Yarn/Spark/Hive/Tez)? You could extend an IdentityProvider to leverage that and thus avoid punishing all the RPC calls from the same service user.{quote}
Yes, we plan to enable HDFS-9184 to log more detailed information for audit purposes. But here we would like to group calls by service user, no matter which engine they use.

{quote}Can you elaborate on how to quantify the cost of RPC calls, which are not equal in terms of their cost on the NN? The same RPC call with different parameters may differ significantly in cost as well. Can you post more details of the proposal for discussion?{quote}
We have also looked into how to build a cost-based FairCallQueue and have some early results. One rough idea is to simply assign different weights to reads, writes, and some large listStatus calls, instead of tracking the detailed lock time for each RPC call. As the cost-based work is separate from this JIRA, we'll open another ticket once we have more results.
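The group-calls-by-service-user idea from the comment above could be sketched like this. Hadoop's real IdentityProvider interface takes a Schedulable, so this simplified version keyed by user-name strings, and the engine-to-account mapping in it, are assumptions for illustration only.

```java
import java.util.Map;

// Toy sketch of grouping RPC calls by service user regardless of
// which engine (Yarn/Spark/Hive/Tez) issued them.
public class ServiceUserIdentityProvider {
    // Hypothetical mapping from engine-specific principals to the
    // service account they all belong to.
    private final Map<String, String> serviceAccounts = Map.of(
        "hive", "etl-svc",
        "spark", "etl-svc",
        "tez", "etl-svc");

    // Calls from any of the mapped engines collapse onto one identity,
    // so the FairCallQueue sees them as a single service user; other
    // callers keep their own identity.
    public String makeIdentity(String callerUser) {
        return serviceAccounts.getOrDefault(callerUser, callerUser);
    }
}
```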
[jira] [Updated] (HADOOP-15016) Add reservation support to RPC FairCallQueue
[ https://issues.apache.org/jira/browse/HADOOP-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-15016: Description updated.
[jira] [Commented] (HADOOP-15016) Add reservation support to RPC FairCallQueue
[ https://issues.apache.org/jira/browse/HADOOP-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237817#comment-16237817 ] Wei Yan commented on HADOOP-15016: -- ping [~xyao], as this would be related to the HADOOP-13128 resource coupon idea.
[jira] [Created] (HADOOP-15016) Add reservation support to RPC FairCallQueue
Wei Yan created HADOOP-15016: Summary: Add reservation support to RPC FairCallQueue Key: HADOOP-15016 URL: https://issues.apache.org/jira/browse/HADOOP-15016 Project: Hadoop Common Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan Priority: Normal FairCallQueue was introduced to provide RPC resource fairness among different users. In the current implementation, each user is weighted equally, and the processing priority for different RPC calls is based on how many requests that user sent before. This works well when the cluster is shared among several end-users. However, this has some limitations when a cluster is shared among both end-users and some service jobs, like ETL jobs which run under a service account and need to issue lots of RPC calls. When the NameNode becomes quite busy, this set of jobs can easily be backed off and de-prioritized. We cannot simply treat this type of job as a "bad" user who randomly issues too many calls, as their calls are normal calls. Also, it is unfair to weight an end-user and a heavy service user equally when allocating RPC resources. One idea here is to introduce reservation support for RPC resources. That is, for some services, we reserve some RPC resources for their calls. This idea is very similar to how YARN manages CPU/memory resources among different resource queues. A little more detail here: along with the existing FairCallQueue setup (like using 4 queues with different priorities), we would add some additional special queues, one for each special service user. For each special service user, we provide a guaranteed RPC share (like 10%, which can be aligned with its YARN resource share), and this percentage can be converted to a weight used in WeightedRoundRobinMultiplexer. A quick example: we have 4 default queues with default weights (8, 4, 2, 1), and two special service users (user1 with a 10% share, and user2 with a 15% share).
So finally we'll have 6 queues: 4 default queues (with weights 8, 4, 2, 1) and 2 special queues (user1Queue weighted 15*10%/75% = 2, and user2Queue weighted 15*15%/75% = 3). Newly arriving RPC calls from special service users will be put directly into the corresponding reserved queue; other calls just follow the current implementation. By default, there is no special user and all RPC requests follow the existing FairCallQueue implementation. Would like to hear more comments on this approach; also want to know whether there are better solutions.
[jira] [Commented] (HADOOP-13128) Manage Hadoop RPC resource usage via resource coupon
[ https://issues.apache.org/jira/browse/HADOOP-13128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157642#comment-16157642 ] Wei Yan commented on HADOOP-13128: -- [~xyao] thanks for sharing the design. We have a very similar issue to the one you discussed in the doc, and the resource coupon is a very good idea. Our Hadoop cluster is shared among multiple different services/jobs/queries, and some services/jobs (ETL/ingestion) may send too many RPC calls to the NN. Under the current implementation, these jobs can easily be backed off and de-prioritized as they run under the same service account, and it's not straightforward to distribute these calls across multiple service accounts. Also, some of these jobs get guaranteed YARN resources, but sometimes these jobs still get delayed due to RPC starvation. Instead of using the resource coupon idea to manage RPC resources, we're looking into some more static approaches (as the number of the abovementioned services/jobs is very small, fewer than 10), and trying to allocate a dedicated RPC share for certain service users. Along with the existing FairCallQueue setup (like using 10 queues with different priorities), we would add some additional special queues, one for each special user. For each special user, we provide a guaranteed RPC share (like 10%, which can be aligned with its YARN resource share), and this percentage can be converted to a weight used in WeightedRoundRobinMultiplexer. A quick example: we have 4 default queues with default weights (8, 4, 2, 1), and two special service users (user1 with a 10% share, and user2 with a 15% share). So finally we'll have 6 queues: 4 default queues (with weights 8, 4, 2, 1) and 2 special queues (user1Queue weighted 15*10%/75% = 2, and user2Queue weighted 15*15%/75% = 3). For each newly arriving RPC call, we'll add one additional check.
If it comes from a special user, it will be put into the dedicated queue reserved for that user; for other calls, we'll follow the current count decay mechanism and put them into the default queues. From the handler side, it fetches new calls from the queues using the index provided by WeightedRoundRobinMultiplexer. By default, there is no special user and all RPC requests follow the existing FairCallQueue implementation. Would like to hear more comments on this approach; also want to know about any other available approaches. > Manage Hadoop RPC resource usage via resource coupon > > > Key: HADOOP-13128 > URL: https://issues.apache.org/jira/browse/HADOOP-13128 > Project: Hadoop Common > Issue Type: Improvement > Reporter: Xiaoyu Yao > Assignee: Xiaoyu Yao > Attachments: HADOOP-13128-Proposal-20160511.pdf > > HADOOP-9640 added the RPC Fair Call Queue and HADOOP-10597 added RPC backoff to ensure fair usage of the HDFS namenode resources. YARN, the Hadoop cluster resource manager, currently manages the CPU and memory resources for jobs/tasks, but not storage resources such as HDFS namenode and datanode usage directly. As a result, a high-priority YARN job may send too many RPC requests to the HDFS namenode and get demoted into low-priority call queues due to lack of reservation/coordination. To better support multi-tenancy use cases like the above, we propose to manage RPC server resource usage via a coupon mechanism integrated with YARN. The idea is to allow YARN to request an HDFS storage resource coupon (e.g., namenode RPC calls, datanode I/O bandwidth) from the namenode on behalf of the job at submission time. Once granted, the tasks will include the coupon identifier in the RPC header for subsequent calls. The HDFS namenode RPC scheduler maintains the state of the coupon usage based on the scheduler policy (fairness or priority) to match the RPC priority with the YARN scheduling priority. I will post a proposal with more detail shortly.
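The per-call routing check described in the comment above (special users bypass the decay-based priority logic and go straight to their dedicated queue) can be sketched as below. The class and method names are hypothetical, not from any posted patch, and `priorityFromDecay` stands in for FairCallQueue's existing count-decay computation:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the extra routing check: calls from a special user land in that
// user's dedicated queue; everyone else goes through the normal priority path.
class ReservedQueueRouter {
    private final int numDefaultQueues;               // e.g. 4 priority levels
    private final Map<String, Integer> reservedIndex; // user -> dedicated queue index

    ReservedQueueRouter(int numDefaultQueues, Map<String, Integer> reservedIndex) {
        this.numDefaultQueues = numDefaultQueues;
        this.reservedIndex = reservedIndex;
    }

    int chooseQueue(String user, int priorityFromDecay) {
        Integer dedicated = reservedIndex.get(user);
        if (dedicated != null) {
            return dedicated; // reserved queue, placed after the default ones
        }
        // Normal path: clamp the decay-derived priority to the default queues.
        return Math.min(priorityFromDecay, numDefaultQueues - 1);
    }

    public static void main(String[] args) {
        Map<String, Integer> reserved = new HashMap<>();
        reserved.put("etl-svc", 4); // first queue index after the 4 default queues
        ReservedQueueRouter router = new ReservedQueueRouter(4, reserved);
        System.out.println(router.chooseQueue("etl-svc", 3)); // 4 (dedicated queue)
        System.out.println(router.chooseQueue("alice", 1));   // 1 (normal path)
    }
}
```

The handler side would then simply dequeue using whatever index the WeightedRoundRobinMultiplexer hands out, treating reserved queues like any other weighted queue.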
[jira] [Commented] (HADOOP-10318) Incorrect reference to nodeFile in RumenToSLSConverter error message
[ https://issues.apache.org/jira/browse/HADOOP-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14729278#comment-14729278 ] Wei Yan commented on HADOOP-10318: -- Thanks, [~ozawa]. > Incorrect reference to nodeFile in RumenToSLSConverter error message > > > Key: HADOOP-10318 > URL: https://issues.apache.org/jira/browse/HADOOP-10318 > Project: Hadoop Common > Issue Type: Bug >Reporter: Ted Yu >Assignee: Wei Yan >Priority: Minor > Labels: BB2015-05-TBR, newbie > Fix For: 2.8.0 > > Attachments: HADOOP-10318.patch > > > {code} > if (! nodeFile.getParentFile().exists() > && ! nodeFile.getParentFile().mkdirs()) { > System.err.println("ERROR: Cannot create output directory in path: " > + jsonFile.getParentFile().getAbsoluteFile()); > {code} > jsonFile on the last line should be nodeFile -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-11213) Fix typos in html pages: SecureMode and EncryptedShuffle
Wei Yan created HADOOP-11213: Summary: Fix typos in html pages: SecureMode and EncryptedShuffle Key: HADOOP-11213 URL: https://issues.apache.org/jira/browse/HADOOP-11213 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan Priority: Minor In SecureMode.html, {noformat} banned.users| hfds,yarn,mapred,bin {noformat} Here hfds should be hdfs. In EncryptedShuffle.html, {noformat} hadoop.ssl.server.conf | ss-server.xml hadoop.ssl.client.conf | ss-client.xml {noformat} Here the two xml files should be ssl-*. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11213) Fix typos in html pages: SecureMode and EncryptedShuffle
[ https://issues.apache.org/jira/browse/HADOOP-11213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-11213: - Attachment: HADOOP-11213-1.patch Fix typos in html pages: SecureMode and EncryptedShuffle Key: HADOOP-11213 URL: https://issues.apache.org/jira/browse/HADOOP-11213 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan Priority: Minor Attachments: HADOOP-11213-1.patch In SecureMode.html, {noformat} banned.users | hfds,yarn,mapred,bin {noformat} Here hfds should be hdfs. In EncryptedShuffle.html, {noformat} hadoop.ssl.server.conf| ss-server.xml hadoop.ssl.client.conf| ss-client.xml {noformat} Here the two xml files should be ssl-*. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-10988) Community build Apache Hadoop 2.5 fails on Ubuntu 14.04
[ https://issues.apache.org/jira/browse/HADOOP-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14104818#comment-14104818 ] Wei Yan commented on HADOOP-10988: -- I tried the community built hadoop-2.5.0 in ubuntu 14.04, and the pi job can run correctly. Here is the environment. JDK {code} ubuntu@master:~/hadoop-2.5.0$ java -version java version 1.7.0_55 OpenJDK Runtime Environment (IcedTea 2.4.7) (7u55-2.4.7-1ubuntu1) OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode) {code} Ubuntu {code} ubuntu@master:~/hadoop-2.5.0$ lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description:Ubuntu 14.04 LTS Release:14.04 Codename: trusty {code} From your error message: bq. OpenJDK 64-Bit Server VM warning: You have loaded library /home/ubuntu/hadoop/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now. why have hadoop-2.2.0 here? Community build Apache Hadoop 2.5 fails on Ubuntu 14.04 --- Key: HADOOP-10988 URL: https://issues.apache.org/jira/browse/HADOOP-10988 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 2.2.0, 2.5.0, 2.4.1 Environment: x86_64, Ubuntu 14.04, OpenJDK 7 hadoop 2.5 tar file from: http://apache.mirrors.pair.com/hadoop/common/hadoop-2.5.0/ Reporter: Amir Sanjar Executing any mapreduce applications (i.e. PI) using community version of hadoop build from http://apache.mirrors.pair.com/hadoop/common/hadoop-2.5.0/ fails with below error message. Rebuilding from source on an Ubuntu system with flags clean -Pnative fixes the problem. OpenJDK 64-Bit Server VM warning: You have loaded library /home/ubuntu/hadoop/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now. It's highly recommended that you fix the library with 'execstack -c libfile', or link it with '-z noexecstack'. 
14/08/19 21:24:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable java.net.ConnectException: Call From node1.maas/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10988) Community build Apache Hadoop 2.5 fails on Ubuntu 14.04
[ https://issues.apache.org/jira/browse/HADOOP-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14105043#comment-14105043 ] Wei Yan commented on HADOOP-10988: -- Hi, [~asanjar], I ran it in both single-node and 3-node environments. IMO, the error msg seems related to the 32-bit libraries. If I remember correctly, the prebuilt native libraries in hadoop are built for a 32-bit architecture, so we may face some problems when running on a 64-bit architecture, like when we enable cgroups. But normally, the 32-bit built libraries shouldn't cause any problem. Community build Apache Hadoop 2.5 fails on Ubuntu 14.04 --- Key: HADOOP-10988 URL: https://issues.apache.org/jira/browse/HADOOP-10988 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 2.2.0, 2.5.0, 2.4.1 Environment: x86_64, Ubuntu 14.04, OpenJDK 7 hadoop 2.5 tar file from: http://apache.mirrors.pair.com/hadoop/common/hadoop-2.5.0/ Reporter: Amir Sanjar Executing any mapreduce application (e.g. Pi) using the community version of hadoop built from http://apache.mirrors.pair.com/hadoop/common/hadoop-2.5.0/ fails with the below error message. Rebuilding from source on an Ubuntu system with flags clean -Pnative fixes the problem. OpenJDK 64-Bit Server VM warning: You have loaded library /home/ubuntu/hadoop/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now. It's highly recommended that you fix the library with 'execstack -c libfile', or link it with '-z noexecstack'. 14/08/19 21:24:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform...
using builtin-java classes where applicable java.net.ConnectException: Call From node1.maas/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10927) Ran `hadoop credential` expecting usage, got NPE instead
[ https://issues.apache.org/jira/browse/HADOOP-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14083225#comment-14083225 ] Wei Yan commented on HADOOP-10927: -- This exception happens due to the following code in CredentialShell.java. The init() function returns 0 instead of -1 when args is empty, which causes run() to fail. {code} private int init(String[] args) throws IOException { for (int i = 0; i < args.length; i++) { // parse command line . } return 0; } {code} Ran `hadoop credential` expecting usage, got NPE instead Key: HADOOP-10927 URL: https://issues.apache.org/jira/browse/HADOOP-10927 Project: Hadoop Common Issue Type: Bug Components: security Reporter: Josh Elser Priority: Minor {noformat} $ hadoop credential java.lang.NullPointerException at org.apache.hadoop.security.alias.CredentialShell.run(CredentialShell.java:67) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.security.alias.CredentialShell.main(CredentialShell.java:420) {noformat} Ran a no-arg version of {{hadoop credential}} expecting to get the usage/help message (like other commands act), and got the above exception instead.
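A minimal illustration of that bug pattern, heavily simplified from the real CredentialShell (the class name `InitSketch` and the bodies are hypothetical): when the argument list is empty the parse loop never runs, so init() must itself return -1 rather than fall through to 0:

```java
// Simplified sketch of the init() contract: return 0 only when a subcommand was
// actually parsed; an empty argument list should yield -1 so the caller prints
// usage instead of later dereferencing a null command and throwing an NPE.
class InitSketch {
    static int init(String[] args) {
        if (args.length == 0) {
            return -1; // the fix: bail out so run() shows usage, not an NPE
        }
        for (int i = 0; i < args.length; i++) {
            // parse command line ... (elided)
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(init(new String[0]));        // -1 -> print usage
        System.out.println(init(new String[]{"list"})); // 0  -> proceed
    }
}
```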
[jira] [Updated] (HADOOP-9530) DBInputSplit creates invalid ranges on Oracle
[ https://issues.apache.org/jira/browse/HADOOP-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9530: Attachment: HADOOP-9530.patch [~jserdaru], we also ran into this problem. As you mentioned, Oracle SQL is different from MySQL, which uses a row offset. Uploaded a patch to fix this problem. DBInputSplit creates invalid ranges on Oracle - Key: HADOOP-9530 URL: https://issues.apache.org/jira/browse/HADOOP-9530 Project: Hadoop Common Issue Type: Bug Affects Versions: 1.1.2 Reporter: Julien Serdaru Attachments: HADOOP-9530.patch The DBInputFormat on Oracle does not create valid ranges. The method getSplit, line 263, is as follows: split = new DBInputSplit(i * chunkSize, (i * chunkSize) + chunkSize); So the first split will have a start value of 0 (0 * chunkSize). However, the OracleDBRecordReader, line 84, is as follows: if (split.getLength() > 0 && split.getStart() > 0) { Since the start value of the first range is equal to 0, we will skip the block that partitions the input set. As a result, one of the map tasks will process the entire data set, rather than its partition. I'm assuming the fix is trivial and would involve removing the second check in the if block. Also, I believe the OracleDBRecordReader paging query is incorrect. Line 92 should read: query.append(" ) WHERE dbif_rno > ").append(split.getStart()); instead of (note > instead of >=): query.append(" ) WHERE dbif_rno >= ").append(split.getStart()); Otherwise some rows will be ignored and some counted more than once. A map/reduce job that counts the number of rows based on a predicate will highlight the incorrect behavior.
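The double-counting claim is easy to check with plain arithmetic. The sketch below simulates ROWNUM-style paging over splits (illustrative only, no Oracle involved; it assumes each split also applies an inclusive upper bound of `end`) and shows that a `>= start` lower bound re-reads each split-boundary row, while `> start` partitions the rows cleanly:

```java
// Sketch: simulate split paging over 1-based row numbers to compare the
// ">= start" and "> start" lower-bound predicates described in the issue.
class RownumSplitCheck {
    // Count rows a split [start, end] would read with the given lower bound.
    static int rowsRead(int start, int end, int totalRows, boolean inclusiveStart) {
        int count = 0;
        for (int rno = 1; rno <= totalRows; rno++) {
            boolean lower = inclusiveStart ? rno >= start : rno > start;
            if (lower && rno <= end) {
                count++;
            }
        }
        return count;
    }

    // Total rows read across all splits (split i covers i*chunkSize .. i*chunkSize+chunkSize).
    static int totalRead(int chunkSize, int splits, boolean inclusiveStart) {
        int total = 0;
        for (int i = 0; i < splits; i++) {
            total += rowsRead(i * chunkSize, i * chunkSize + chunkSize,
                              chunkSize * splits, inclusiveStart);
        }
        return total;
    }

    public static void main(String[] args) {
        // 30 rows, 3 splits of 10: ">" reads each row exactly once,
        // ">=" re-reads the two interior boundary rows.
        System.out.println(totalRead(10, 3, false)); // 30
        System.out.println(totalRead(10, 3, true));  // 32
    }
}
```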
[jira] [Assigned] (HADOOP-9530) DBInputSplit creates invalid ranges on Oracle
[ https://issues.apache.org/jira/browse/HADOOP-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan reassigned HADOOP-9530: --- Assignee: Wei Yan DBInputSplit creates invalid ranges on Oracle - Key: HADOOP-9530 URL: https://issues.apache.org/jira/browse/HADOOP-9530 Project: Hadoop Common Issue Type: Bug Affects Versions: 1.1.2 Reporter: Julien Serdaru Assignee: Wei Yan Attachments: HADOOP-9530.patch The DBInputFormat on Oracle does not create valid ranges. The method getSplit, line 263, is as follows: split = new DBInputSplit(i * chunkSize, (i * chunkSize) + chunkSize); So the first split will have a start value of 0 (0 * chunkSize). However, the OracleDBRecordReader, line 84, is as follows: if (split.getLength() > 0 && split.getStart() > 0) { Since the start value of the first range is equal to 0, we will skip the block that partitions the input set. As a result, one of the map tasks will process the entire data set, rather than its partition. I'm assuming the fix is trivial and would involve removing the second check in the if block. Also, I believe the OracleDBRecordReader paging query is incorrect. Line 92 should read: query.append(" ) WHERE dbif_rno > ").append(split.getStart()); instead of (note > instead of >=): query.append(" ) WHERE dbif_rno >= ").append(split.getStart()); Otherwise some rows will be ignored and some counted more than once. A map/reduce job that counts the number of rows based on a predicate will highlight the incorrect behavior.
[jira] [Updated] (HADOOP-9530) DBInputSplit creates invalid ranges on Oracle
[ https://issues.apache.org/jira/browse/HADOOP-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9530: Status: Patch Available (was: Open) DBInputSplit creates invalid ranges on Oracle - Key: HADOOP-9530 URL: https://issues.apache.org/jira/browse/HADOOP-9530 Project: Hadoop Common Issue Type: Bug Affects Versions: 1.1.2 Reporter: Julien Serdaru Assignee: Wei Yan Attachments: HADOOP-9530.patch The DBInputFormat on Oracle does not create valid ranges. The method getSplit, line 263, is as follows: split = new DBInputSplit(i * chunkSize, (i * chunkSize) + chunkSize); So the first split will have a start value of 0 (0 * chunkSize). However, the OracleDBRecordReader, line 84, is as follows: if (split.getLength() > 0 && split.getStart() > 0) { Since the start value of the first range is equal to 0, we will skip the block that partitions the input set. As a result, one of the map tasks will process the entire data set, rather than its partition. I'm assuming the fix is trivial and would involve removing the second check in the if block. Also, I believe the OracleDBRecordReader paging query is incorrect. Line 92 should read: query.append(" ) WHERE dbif_rno > ").append(split.getStart()); instead of (note > instead of >=): query.append(" ) WHERE dbif_rno >= ").append(split.getStart()); Otherwise some rows will be ignored and some counted more than once. A map/reduce job that counts the number of rows based on a predicate will highlight the incorrect behavior.
[jira] [Updated] (HADOOP-10318) Incorrect reference to nodeFile in RumenToSLSConverter error message
[ https://issues.apache.org/jira/browse/HADOOP-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-10318: - Attachment: HADOOP-10318.patch Incorrect reference to nodeFile in RumenToSLSConverter error message Key: HADOOP-10318 URL: https://issues.apache.org/jira/browse/HADOOP-10318 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: HADOOP-10318.patch {code} if (! nodeFile.getParentFile().exists() && ! nodeFile.getParentFile().mkdirs()) { System.err.println("ERROR: Cannot create output directory in path: " + jsonFile.getParentFile().getAbsoluteFile()); {code} jsonFile on the last line should be nodeFile
[jira] [Assigned] (HADOOP-10318) Incorrect reference to nodeFile in RumenToSLSConverter error message
[ https://issues.apache.org/jira/browse/HADOOP-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan reassigned HADOOP-10318: Assignee: Wei Yan Incorrect reference to nodeFile in RumenToSLSConverter error message Key: HADOOP-10318 URL: https://issues.apache.org/jira/browse/HADOOP-10318 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Assignee: Wei Yan Priority: Minor Attachments: HADOOP-10318.patch {code} if (! nodeFile.getParentFile().exists() && ! nodeFile.getParentFile().mkdirs()) { System.err.println("ERROR: Cannot create output directory in path: " + jsonFile.getParentFile().getAbsoluteFile()); {code} jsonFile on the last line should be nodeFile
[jira] [Updated] (HADOOP-10318) Incorrect reference to nodeFile in RumenToSLSConverter error message
[ https://issues.apache.org/jira/browse/HADOOP-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-10318: - Status: Patch Available (was: Open) Incorrect reference to nodeFile in RumenToSLSConverter error message Key: HADOOP-10318 URL: https://issues.apache.org/jira/browse/HADOOP-10318 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Assignee: Wei Yan Priority: Minor Attachments: HADOOP-10318.patch {code} if (! nodeFile.getParentFile().exists() && ! nodeFile.getParentFile().mkdirs()) { System.err.println("ERROR: Cannot create output directory in path: " + jsonFile.getParentFile().getAbsoluteFile()); {code} jsonFile on the last line should be nodeFile
[jira] [Commented] (HADOOP-10318) Incorrect reference to nodeFile in RumenToSLSConverter error message
[ https://issues.apache.org/jira/browse/HADOOP-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13887502#comment-13887502 ] Wei Yan commented on HADOOP-10318: -- Thanks, [~yuzhih...@gmail.com]. Just uploaded a patch to fix that bug. Incorrect reference to nodeFile in RumenToSLSConverter error message Key: HADOOP-10318 URL: https://issues.apache.org/jira/browse/HADOOP-10318 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: HADOOP-10318.patch {code} if (! nodeFile.getParentFile().exists() && ! nodeFile.getParentFile().mkdirs()) { System.err.println("ERROR: Cannot create output directory in path: " + jsonFile.getParentFile().getAbsoluteFile()); {code} jsonFile on the last line should be nodeFile
[jira] [Commented] (HADOOP-10245) Hadoop command line always appends -Xmx option twice
[ https://issues.apache.org/jira/browse/HADOOP-10245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881057#comment-13881057 ] Wei Yan commented on HADOOP-10245: -- [~shanyu], sorry for the late reply. So you mean we should only let users specify -Xmx through $JAVA_HEAP_MAX? Hadoop command line always appends -Xmx option twice -- Key: HADOOP-10245 URL: https://issues.apache.org/jira/browse/HADOOP-10245 Project: Hadoop Common Issue Type: Bug Components: bin Affects Versions: 2.2.0 Reporter: shanyu zhao Assignee: shanyu zhao Attachments: HADOOP-10245.patch The Hadoop command line scripts (hadoop.sh or hadoop.cmd) will call java with the -Xmx option twice. The impact is that any user-defined HADOOP_HEAP_SIZE env variable will take no effect because it is overwritten by the second -Xmx option. For example, here is the java cmd generated for the command "hadoop fs -ls /". Notice that there are two -Xmx options, -Xmx1000m and -Xmx512m, in the command line: java -Xmx1000m -Dhadoop.log.dir=C:\tmp\logs -Dhadoop.log.file=hadoop.log -Dhadoop.root.logger=INFO,console,DRFA -Xmx512m -Dhadoop.security.logger=INFO,RFAS -classpath XXX org.apache.hadoop.fs.FsShell -ls / Here is the root cause: The call flow is: hadoop.sh calls hadoop-config.sh, which in turn calls hadoop-env.sh. In hadoop.sh, the command line is generated by the following pseudo code: java $JAVA_HEAP_MAX $HADOOP_CLIENT_OPTS -classpath ... In hadoop-config.sh, $JAVA_HEAP_MAX is initialized to -Xmx1000m if the user didn't set the $HADOOP_HEAP_SIZE env variable. In hadoop-env.sh, $HADOOP_CLIENT_OPTS is set like this: export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS" To fix this problem, we should remove the -Xmx512m from HADOOP_CLIENT_OPTS. If we really want to change the memory settings, we need to use the $HADOOP_HEAP_SIZE env variable.
[jira] [Commented] (HADOOP-10245) Hadoop command line always appends -Xmx option twice
[ https://issues.apache.org/jira/browse/HADOOP-10245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877837#comment-13877837 ] Wei Yan commented on HADOOP-10245: -- [~shanyu], as discussed, there are multiple places configuring -Xmx. In the latest patch on HADOOP-9870 provided by [~jhsenjaliya], $HADOOP_HEAPSIZE is checked first; if it is not set, -Xmx512m is assigned. Additionally, bin/hadoop also checks the -Xmx configuration, to avoid duplicate configurations. Simply removing -Xmx512m from HADOOP_CLIENT_OPTS may still generate multiple -Xmx options, as bin/hadoop also has a default $JAVA_HEAP_MAX, which is 1000m. IMO, HADOOP-9870 has fixed this issue. Hadoop command line always appends -Xmx option twice -- Key: HADOOP-10245 URL: https://issues.apache.org/jira/browse/HADOOP-10245 Project: Hadoop Common Issue Type: Bug Components: bin Affects Versions: 2.2.0 Reporter: shanyu zhao Assignee: shanyu zhao Attachments: HADOOP-10245.patch The Hadoop command line scripts (hadoop.sh or hadoop.cmd) will call java with the -Xmx option twice. The impact is that any user-defined HADOOP_HEAP_SIZE env variable will take no effect because it is overwritten by the second -Xmx option. For example, here is the java cmd generated for the command "hadoop fs -ls /". Notice that there are two -Xmx options, -Xmx1000m and -Xmx512m, in the command line: java -Xmx1000m -Dhadoop.log.dir=C:\tmp\logs -Dhadoop.log.file=hadoop.log -Dhadoop.root.logger=INFO,console,DRFA -Xmx512m -Dhadoop.security.logger=INFO,RFAS -classpath XXX org.apache.hadoop.fs.FsShell -ls / Here is the root cause: The call flow is: hadoop.sh calls hadoop-config.sh, which in turn calls hadoop-env.sh. In hadoop.sh, the command line is generated by the following pseudo code: java $JAVA_HEAP_MAX $HADOOP_CLIENT_OPTS -classpath ... In hadoop-config.sh, $JAVA_HEAP_MAX is initialized to -Xmx1000m if the user didn't set the $HADOOP_HEAP_SIZE env variable. In hadoop-env.sh, $HADOOP_CLIENT_OPTS is set like this: export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS" To fix this problem, we should remove the -Xmx512m from HADOOP_CLIENT_OPTS. If we really want to change the memory settings, we need to use the $HADOOP_HEAP_SIZE env variable.
[jira] [Commented] (HADOOP-10245) Hadoop command line always appends -Xmx option twice
[ https://issues.apache.org/jira/browse/HADOOP-10245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877002#comment-13877002 ] Wei Yan commented on HADOOP-10245: -- Hey, shanyu. Is this one related to HADOOP-9870? Hadoop command line always appends -Xmx option twice -- Key: HADOOP-10245 URL: https://issues.apache.org/jira/browse/HADOOP-10245 Project: Hadoop Common Issue Type: Bug Components: bin Affects Versions: 2.2.0 Reporter: shanyu zhao Assignee: shanyu zhao Attachments: HADOOP-10245.patch The Hadoop command line scripts (hadoop.sh or hadoop.cmd) will call java with the -Xmx option twice. The impact is that any user-defined HADOOP_HEAP_SIZE env variable will take no effect because it is overwritten by the second -Xmx option. For example, here is the java cmd generated for the command "hadoop fs -ls /". Notice that there are two -Xmx options, -Xmx1000m and -Xmx512m, in the command line: java -Xmx1000m -Dhadoop.log.dir=C:\tmp\logs -Dhadoop.log.file=hadoop.log -Dhadoop.root.logger=INFO,console,DRFA -Xmx512m -Dhadoop.security.logger=INFO,RFAS -classpath XXX org.apache.hadoop.fs.FsShell -ls / Here is the root cause: The call flow is: hadoop.sh calls hadoop-config.sh, which in turn calls hadoop-env.sh. In hadoop.sh, the command line is generated by the following pseudo code: java $JAVA_HEAP_MAX $HADOOP_CLIENT_OPTS -classpath ... In hadoop-config.sh, $JAVA_HEAP_MAX is initialized to -Xmx1000m if the user didn't set the $HADOOP_HEAP_SIZE env variable. In hadoop-env.sh, $HADOOP_CLIENT_OPTS is set like this: export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS" To fix this problem, we should remove the -Xmx512m from HADOOP_CLIENT_OPTS. If we really want to change the memory settings, we need to use the $HADOOP_HEAP_SIZE env variable.
[jira] [Commented] (HADOOP-9870) Mixed configurations for JVM -Xmx in hadoop command
[ https://issues.apache.org/jira/browse/HADOOP-9870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827354#comment-13827354 ] Wei Yan commented on HADOOP-9870: - [~vinayrpet]. As I said above, I haven't found any documents that say the JVM would pick the last one. Correct me if I'm wrong here. So this patch wants to make this explicit, instead of letting the JVM decide which setting to use. Mixed configurations for JVM -Xmx in hadoop command --- Key: HADOOP-9870 URL: https://issues.apache.org/jira/browse/HADOOP-9870 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan Attachments: HADOOP-9870.patch, HADOOP-9870.patch, HADOOP-9870.patch When we use the hadoop command to launch a class, there are two places setting the -Xmx configuration. *1*. The first place is located in file {{hadoop-common-project/hadoop-common/src/main/bin/hadoop}}. {code} exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@" {code} Here $JAVA_HEAP_MAX is configured in hadoop-config.sh ({{hadoop-common-project/hadoop-common/src/main/bin/hadoop-config.sh}}). The default value is -Xmx1000m. *2*. The second place is set via $HADOOP_OPTS in file {{hadoop-common-project/hadoop-common/src/main/bin/hadoop}}. {code} HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS" {code} Here $HADOOP_CLIENT_OPTS is set in hadoop-env.sh ({{hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh}}): {code} export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS" {code} Currently the final default java command looks like: {code}java -Xmx1000m -Xmx512m CLASS_NAME ARGUMENTS{code} And if users also specify -Xmx in $HADOOP_CLIENT_OPTS, there will be three -Xmx configurations. The hadoop setup tutorial only discusses hadoop-env.sh, and it looks like users should not make any changes in hadoop-config.sh. We should make hadoop smart enough to choose the right one before launching the java command, instead of leaving the decision to the JVM.
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-10100) MiniKDC shouldn't use apacheds-all artifact
[ https://issues.apache.org/jira/browse/HADOOP-10100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822164#comment-13822164 ] Wei Yan commented on HADOOP-10100: -- [~rkanter], thanks so much for fixing this!!! MiniKDC shouldn't use apacheds-all artifact --- Key: HADOOP-10100 URL: https://issues.apache.org/jira/browse/HADOOP-10100 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.3.0 Reporter: Robert Kanter Assignee: Robert Kanter Attachments: HADOOP-10100.patch The MiniKDC currently depends on the {{apacheds-all}} artifact: {code:xml} <dependency> <groupId>org.apache.directory.server</groupId> <artifactId>apacheds-all</artifactId> <version>2.0.0-M15</version> <scope>compile</scope> </dependency> {code} However, this artifact bundles a lot of other packages inside itself, including antlr, ehcache, apache commons, and mina (you can see the full list of packages in the jar [here|http://mvnrepository.com/artifact/org.apache.directory.server/apacheds-all/2.0.0-M15]). This is problematic if other projects (e.g. Oozie) try to use MiniKDC while depending on a different version of one of those libraries (in my case, ehcache). Because the packages are included inside the {{apacheds-all}} jar, their versions cannot be overridden. Instead, we should remove {{apacheds-all}} and use dependencies that only include org.apache.directory.* packages; the other necessary dependencies should be declared normally.
[jira] [Commented] (HADOOP-9870) Mixed configurations for JVM -Xmx in hadoop command
[ https://issues.apache.org/jira/browse/HADOOP-9870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13806492#comment-13806492 ] Wei Yan commented on HADOOP-9870: - Great, [~jhsenjaliya]. Mixed configurations for JVM -Xmx in hadoop command --- Key: HADOOP-9870 URL: https://issues.apache.org/jira/browse/HADOOP-9870 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan
[jira] [Updated] (HADOOP-9889) Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC
[ https://issues.apache.org/jira/browse/HADOOP-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9889: Attachment: HADOOP-9889.patch Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC Key: HADOOP-9889 URL: https://issues.apache.org/jira/browse/HADOOP-9889 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan Attachments: HADOOP-9889.patch, HAOOP-9889.patch, HAOOP-9889.patch Krb5 Config uses a singleton and once initialized it does not refresh automatically. Without refresh, there are failures if you are using MiniKDCs with different configurations (such as different realms) within the same test run or if the Krb5 Config singleton is called before the MiniKDC is started for the first time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9889) Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC
[ https://issues.apache.org/jira/browse/HADOOP-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749753#comment-13749753 ] Wei Yan commented on HADOOP-9889: - Sure, I'll take care of it. Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC Key: HADOOP-9889 URL: https://issues.apache.org/jira/browse/HADOOP-9889 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan Attachments: HAOOP-9889.patch, HAOOP-9889.patch
[jira] [Updated] (HADOOP-9889) Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC
[ https://issues.apache.org/jira/browse/HADOOP-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9889: Attachment: HAOOP-9889.patch Just triggering Jenkins. Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC Key: HADOOP-9889 URL: https://issues.apache.org/jira/browse/HADOOP-9889 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan Attachments: HAOOP-9889.patch, HAOOP-9889.patch
[jira] [Created] (HADOOP-9889) Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC
Wei Yan created HADOOP-9889: --- Summary: Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC Key: HADOOP-9889 URL: https://issues.apache.org/jira/browse/HADOOP-9889 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan
[jira] [Updated] (HADOOP-9889) Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC
[ https://issues.apache.org/jira/browse/HADOOP-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9889: Status: Patch Available (was: Open) Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC Key: HADOOP-9889 URL: https://issues.apache.org/jira/browse/HADOOP-9889 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan Attachments: HAOOP-9889.patch
[jira] [Updated] (HADOOP-9889) Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC
[ https://issues.apache.org/jira/browse/HADOOP-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9889: Attachment: HAOOP-9889.patch Refresh the Krb5 configuration when creating a new kdc in Hadoop-MiniKDC Key: HADOOP-9889 URL: https://issues.apache.org/jira/browse/HADOOP-9889 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan Attachments: HAOOP-9889.patch
[jira] [Commented] (HADOOP-9860) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved
[ https://issues.apache.org/jira/browse/HADOOP-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744050#comment-13744050 ] Wei Yan commented on HADOOP-9860: - Great, I'll update a patch for this jira. Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved - Key: HADOOP-9860 URL: https://issues.apache.org/jira/browse/HADOOP-9860 Project: Hadoop Common Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan Remove class {{HackedKeytab}} and {{HackedKeytabEncoder}} from hadoop-minikdc (HADOOP-9848) once jira DIRSERVER-1882 is solved. Also update the apacheds version in the pom.xml.
[jira] [Updated] (HADOOP-9860) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved
[ https://issues.apache.org/jira/browse/HADOOP-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9860: Attachment: HADOOP-9860.patch Patch for removing HackedKeytab and HackedKeytabEncoder from hadoop-minikdc. Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved - Key: HADOOP-9860 URL: https://issues.apache.org/jira/browse/HADOOP-9860 Project: Hadoop Common Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan Attachments: HADOOP-9860.patch
[jira] [Updated] (HADOOP-9860) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved
[ https://issues.apache.org/jira/browse/HADOOP-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9860: Status: Patch Available (was: Open) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved - Key: HADOOP-9860 URL: https://issues.apache.org/jira/browse/HADOOP-9860 Project: Hadoop Common Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan Attachments: HADOOP-9860.patch
[jira] [Updated] (HADOOP-9860) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved
[ https://issues.apache.org/jira/browse/HADOOP-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9860: Status: Open (was: Patch Available) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved - Key: HADOOP-9860 URL: https://issues.apache.org/jira/browse/HADOOP-9860 Project: Hadoop Common Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan Attachments: HADOOP-9860.patch
[jira] [Updated] (HADOOP-9860) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved
[ https://issues.apache.org/jira/browse/HADOOP-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9860: Status: Patch Available (was: Open) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved - Key: HADOOP-9860 URL: https://issues.apache.org/jira/browse/HADOOP-9860 Project: Hadoop Common Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan Attachments: HADOOP-9860.patch, HADOOP-9860.patch
[jira] [Updated] (HADOOP-9860) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved
[ https://issues.apache.org/jira/browse/HADOOP-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9860: Attachment: HADOOP-9860.patch Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved - Key: HADOOP-9860 URL: https://issues.apache.org/jira/browse/HADOOP-9860 Project: Hadoop Common Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan Attachments: HADOOP-9860.patch, HADOOP-9860.patch
[jira] [Commented] (HADOOP-9860) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved
[ https://issues.apache.org/jira/browse/HADOOP-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744241#comment-13744241 ] Wei Yan commented on HADOOP-9860: - Also fixed the login mistake in TestMiniKdc (HADOOP-9881). Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved - Key: HADOOP-9860 URL: https://issues.apache.org/jira/browse/HADOOP-9860 Project: Hadoop Common Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan Attachments: HADOOP-9860.patch, HADOOP-9860.patch
[jira] [Commented] (HADOOP-9881) Some questions and possible improvement for MiniKdc/TestMiniKdc
[ https://issues.apache.org/jira/browse/HADOOP-9881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744242#comment-13744242 ] Wei Yan commented on HADOOP-9881: - Fixed the 3rd problem (login/logout) in HADOOP-9860. Some questions and possible improvement for MiniKdc/TestMiniKdc --- Key: HADOOP-9881 URL: https://issues.apache.org/jira/browse/HADOOP-9881 Project: Hadoop Common Issue Type: Improvement Components: security Reporter: Kai Zheng In org.apache.hadoop.minikdc.TestMiniKdc: # In testKeytabGen(), a comment says ??principals use \ instead of /??; does this mean the principal must use \ instead of / when using MiniKdc for test cases? If so, should *HADOOP_SECURITY_AUTH_TO_LOCAL* take this into account? # In testKerberosLogin(), what is the intended difference between client login and server login? I see the isInitiator option is set to true or false respectively, but I'm not sure about that. # In both client login and server login, why does loginContext.login() get called again at the end? Perhaps it should be loginContext.logout(). # It should also consider the IBM JDK. Referring to the current UGI implementation, it looks like it needs to set the KRB5CCNAME system property and the useDefaultCcache option specifically. It's good that the provided facility and tests cover login using a keytab. Is it also possible to test login via ticket cache, or to automatically generate a ticket cache for a specified principal without executing kinit? This is important to cover user Kerberos login (with kinit), if possible, using MiniKdc.
[jira] [Updated] (HADOOP-9866) convert hadoop-auth testcases requiring kerberos to use minikdc
[ https://issues.apache.org/jira/browse/HADOOP-9866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9866: Attachment: HADOOP-9866.patch New patch adds timeouts for all test functions. convert hadoop-auth testcases requiring kerberos to use minikdc --- Key: HADOOP-9866 URL: https://issues.apache.org/jira/browse/HADOOP-9866 Project: Hadoop Common Issue Type: Bug Components: test Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Wei Yan Attachments: HADOOP-9866.patch, HADOOP-9866.patch, HADOOP-9866.patch
[jira] [Commented] (HADOOP-9866) convert hadoop-auth testcases requiring kerberos to use minikdc
[ https://issues.apache.org/jira/browse/HADOOP-9866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742466#comment-13742466 ] Wei Yan commented on HADOOP-9866: - Updated the patch according to [~tucu00]'s comments. For TestKerberosName, we don't need krb5.conf; only set the system properties (realm and host) in @Before and clear them in @After. convert hadoop-auth testcases requiring kerberos to use minikdc --- Key: HADOOP-9866 URL: https://issues.apache.org/jira/browse/HADOOP-9866 Project: Hadoop Common Issue Type: Bug Components: test Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Wei Yan Attachments: HADOOP-9866.patch, HADOOP-9866.patch
[jira] [Commented] (HADOOP-9860) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved
[ https://issues.apache.org/jira/browse/HADOOP-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1374#comment-1374 ] Wei Yan commented on HADOOP-9860: - Thanks, [~elecharny]. Waiting for the release. Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved - Key: HADOOP-9860 URL: https://issues.apache.org/jira/browse/HADOOP-9860 Project: Hadoop Common Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan
[jira] [Commented] (HADOOP-9870) Mixed configurations for JVM -Xmx in hadoop command
[ https://issues.apache.org/jira/browse/HADOOP-9870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739930#comment-13739930 ] Wei Yan commented on HADOOP-9870: - [~aw] Yes. I haven't found any documentation discussing multiple -Xmx configurations on the command line. Not sure whether all JDKs always utilize the last -Xmx. Correct me if I'm wrong here. Mixed configurations for JVM -Xmx in hadoop command --- Key: HADOOP-9870 URL: https://issues.apache.org/jira/browse/HADOOP-9870 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan
[jira] [Updated] (HADOOP-9866) convert hadoop-auth testcases requiring kerberos to use minikdc
[ https://issues.apache.org/jira/browse/HADOOP-9866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9866: Attachment: HADOOP-9866.patch Uploaded a patch that enables and updates all kerberos testcases in hadoop-auth to use Hadoop-MiniKdc. convert hadoop-auth testcases requiring kerberos to use minikdc --- Key: HADOOP-9866 URL: https://issues.apache.org/jira/browse/HADOOP-9866 Project: Hadoop Common Issue Type: Bug Components: test Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Wei Yan Attachments: HADOOP-9866.patch
[jira] [Updated] (HADOOP-9866) convert hadoop-auth testcases requiring kerberos to use minikdc
[ https://issues.apache.org/jira/browse/HADOOP-9866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9866: Status: Patch Available (was: Open) convert hadoop-auth testcases requiring kerberos to use minikdc --- Key: HADOOP-9866 URL: https://issues.apache.org/jira/browse/HADOOP-9866 Project: Hadoop Common Issue Type: Bug Components: test Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Wei Yan Attachments: HADOOP-9866.patch
[jira] [Commented] (HADOOP-9848) Create a MiniKDC for use with security testing
[ https://issues.apache.org/jira/browse/HADOOP-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737997#comment-13737997 ] Wei Yan commented on HADOOP-9848: - [~szetszwo] The two javadoc warnings are not related to this patch. {code} [WARNING] org/apache/directory/api/ldap/model/name/Dn.class(org/apache/directory/api/ldap/model/name:Dn.class): warning: Cannot find annotation method 'value()' in type 'edu.umd.cs.findbugs.annotations.SuppressWarnings': class file for edu.umd.cs.findbugs.annotations.SuppressWarnings not found [WARNING] org/apache/directory/api/ldap/model/name/Dn.class(org/apache/directory/api/ldap/model/name:Dn.class): warning: Cannot find annotation method 'justification()' in type 'edu.umd.cs.findbugs.annotations.SuppressWarnings' {code} This patch uses the Dn class from ApacheDS, but it doesn't use findbugs-annotations, so I don't include findbugs-annotations in this patch. Actually, importing findbugs-annotations here would introduce javac warnings (https://issues.apache.org/jira/browse/HADOOP-9848?focusedCommentId=13735059&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13735059) because of an unavailable jar package checksum. These javadoc warnings were also discussed in YARN-107 and YARN-643. Create a MiniKDC for use with security testing -- Key: HADOOP-9848 URL: https://issues.apache.org/jira/browse/HADOOP-9848 Project: Hadoop Common Issue Type: New Feature Components: security, test Reporter: Wei Yan Assignee: Wei Yan Fix For: 2.3.0 Attachments: HADOOP-9848.patch, HADOOP-9848.patch, HADOOP-9848.patch, HADOOP-9848.patch, HADOOP-9848.patch Create a MiniKDC using Apache Directory Server. MiniKDC builds an embedded KDC (key distribution center) and allows creating principals and keytabs on the fly. MiniKDC can be integrated into Hadoop security unit testing.
[jira] [Created] (HADOOP-9870) Mixed configurations for JVM -Xmx in hadoop command
Wei Yan created HADOOP-9870: --- Summary: Mixed configurations for JVM -Xmx in hadoop command Key: HADOOP-9870 URL: https://issues.apache.org/jira/browse/HADOOP-9870 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan
[jira] [Commented] (HADOOP-9870) Mixed configurations for JVM -Xmx in hadoop command
[ https://issues.apache.org/jira/browse/HADOOP-9870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738993#comment-13738993 ] Wei Yan commented on HADOOP-9870: - [~drankye] I agree with your approach. Just one concern: $HADOOP_CLIENT_OPTS is meant to wrap all configurations from the user side and pass them to the hadoop command. If we introduce another variable $HADOOP_CLIENT_HEAPSIZE, it may complicate the original mechanism. Another possible approach may be: 1. In {{hadoop-env.sh}}, before {{export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"}}, check whether $HADOOP_CLIENT_OPTS already contains -Xmx. If it does, ignore the default -Xmx512m; otherwise, apply it. 2. In {{hadoop}}, before {{exec $JAVA $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"}}, check whether $HADOOP_OPTS contains -Xmx. If it does, don't include $JAVA_HEAP_MAX in the command; otherwise, include $JAVA_HEAP_MAX. Let's wait for comments from others. Mixed configurations for JVM -Xmx in hadoop command --- Key: HADOOP-9870 URL: https://issues.apache.org/jira/browse/HADOOP-9870 Project: Hadoop Common Issue Type: Bug Reporter: Wei Yan
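The check proposed in step 2 above can be sketched in shell. This is a minimal illustration, not the actual Hadoop patch; {{pick_heap_opts}} is a hypothetical helper name:

```shell
# Hypothetical helper illustrating the proposed check: apply the default
# $JAVA_HEAP_MAX only when the user-supplied opts carry no -Xmx of their own.
pick_heap_opts() {
  local java_heap_max="$1" hadoop_opts="$2"
  case "$hadoop_opts" in
    *-Xmx*) echo "$hadoop_opts" ;;                 # user already set a heap size
    *)      echo "$java_heap_max $hadoop_opts" ;;  # fall back to the default
  esac
}

pick_heap_opts "-Xmx1000m" "-Xmx512m -Dfoo=bar"   # -> -Xmx512m -Dfoo=bar
pick_heap_opts "-Xmx1000m" "-Dfoo=bar"            # -> -Xmx1000m -Dfoo=bar
```

With something like this in place, only one -Xmx reaches the final java command, so the effective heap size no longer depends on how a particular JVM resolves duplicate flags.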
[jira] [Updated] (HADOOP-9848) Create a MiniKDC for use with security testing
[ https://issues.apache.org/jira/browse/HADOOP-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9848:
---
Attachment: HADOOP-9848.patch
> Create a MiniKDC for use with security testing
> ---
>
> Key: HADOOP-9848
> URL: https://issues.apache.org/jira/browse/HADOOP-9848
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security, test
> Reporter: Wei Yan
> Assignee: Wei Yan
> Attachments: HADOOP-9848.patch, HADOOP-9848.patch, HADOOP-9848.patch
>
> Create a MiniKDC using Apache Directory Server. MiniKDC builds an embedded KDC (key distribution center) and allows creating principals and keytabs on the fly. MiniKDC can be integrated for Hadoop security unit testing.
[jira] [Commented] (HADOOP-9848) Create a MiniKDC for use with security testing
[ https://issues.apache.org/jira/browse/HADOOP-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737032#comment-13737032 ] Wei Yan commented on HADOOP-9848:
---
The javadoc warnings can be ignored. They show that the findbugs-annotations jar cannot be found for ApacheDS. Our MiniKdc doesn't need findbugs-annotations.
{code}
[WARNING] org/apache/directory/api/ldap/model/name/Dn.class(org/apache/directory/api/ldap/model/name:Dn.class): warning: Cannot find annotation method 'value()' in type 'edu.umd.cs.findbugs.annotations.SuppressWarnings': class file for edu.umd.cs.findbugs.annotations.SuppressWarnings not found
[WARNING] org/apache/directory/api/ldap/model/name/Dn.class(org/apache/directory/api/ldap/model/name:Dn.class): warning: Cannot find annotation method 'justification()' in type 'edu.umd.cs.findbugs.annotations.SuppressWarnings'
{code}
> Create a MiniKDC for use with security testing
> ---
>
> Key: HADOOP-9848
> URL: https://issues.apache.org/jira/browse/HADOOP-9848
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security, test
> Reporter: Wei Yan
> Assignee: Wei Yan
> Attachments: HADOOP-9848.patch, HADOOP-9848.patch, HADOOP-9848.patch
>
> Create a MiniKDC using Apache Directory Server. MiniKDC builds an embedded KDC (key distribution center) and allows creating principals and keytabs on the fly. MiniKDC can be integrated for Hadoop security unit testing.
[jira] [Updated] (HADOOP-9848) Create a MiniKDC for use with security testing
[ https://issues.apache.org/jira/browse/HADOOP-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9848:
---
Attachment: HADOOP-9848.patch
After offline discussion with [~tucu00], we keep using @Before and @After here instead of @BeforeClass and @AfterClass. Using @BeforeClass and @AfterClass would require the other two functions (createTestDir() and createMiniKdcConf()) to be static, which means users could not override them with new settings. With @Before and @After, users can extend the class KerberosSecurityTestcase and override createTestDir() and createMiniKdcConf() if needed. Note that the KDC will be created and stopped for each @Test function.
> Create a MiniKDC for use with security testing
> ---
>
> Key: HADOOP-9848
> URL: https://issues.apache.org/jira/browse/HADOOP-9848
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security, test
> Reporter: Wei Yan
> Assignee: Wei Yan
> Attachments: HADOOP-9848.patch, HADOOP-9848.patch, HADOOP-9848.patch, HADOOP-9848.patch
>
> Create a MiniKDC using Apache Directory Server. MiniKDC builds an embedded KDC (key distribution center) and allows creating principals and keytabs on the fly. MiniKDC can be integrated for Hadoop security unit testing.
[jira] [Updated] (HADOOP-9848) Create a MiniKDC for use with security testing
[ https://issues.apache.org/jira/browse/HADOOP-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9848:
---
Attachment: HADOOP-9848.patch
> Create a MiniKDC for use with security testing
> ---
>
> Key: HADOOP-9848
> URL: https://issues.apache.org/jira/browse/HADOOP-9848
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security, test
> Reporter: Wei Yan
> Assignee: Wei Yan
> Attachments: HADOOP-9848.patch, HADOOP-9848.patch, HADOOP-9848.patch, HADOOP-9848.patch, HADOOP-9848.patch
>
> Create a MiniKDC using Apache Directory Server. MiniKDC builds an embedded KDC (key distribution center) and allows creating principals and keytabs on the fly. MiniKDC can be integrated for Hadoop security unit testing.
[jira] [Assigned] (HADOOP-9866) convert hadoop-auth testcases requiring kerberos to use minikdc
[ https://issues.apache.org/jira/browse/HADOOP-9866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan reassigned HADOOP-9866:
---
Assignee: Wei Yan
> convert hadoop-auth testcases requiring kerberos to use minikdc
> ---
>
> Key: HADOOP-9866
> URL: https://issues.apache.org/jira/browse/HADOOP-9866
> Project: Hadoop Common
> Issue Type: Bug
> Components: test
> Affects Versions: 2.3.0
> Reporter: Alejandro Abdelnur
> Assignee: Wei Yan
[jira] [Commented] (HADOOP-9860) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved
[ https://issues.apache.org/jira/browse/HADOOP-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736428#comment-13736428 ] Wei Yan commented on HADOOP-9860:
---
Thanks so much, [~elecharny].
> Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved
> ---
>
> Key: HADOOP-9860
> URL: https://issues.apache.org/jira/browse/HADOOP-9860
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Wei Yan
> Assignee: Wei Yan
>
> Remove class {{HackedKeytab}} and {{HackedKeytabEncoder}} from hadoop-minikdc (HADOOP-9848) once jira DIRSERVER-1882 is solved. Also update the apacheds version in the pom.xml.
[jira] [Commented] (HADOOP-9848) Create a MiniKDC for use with security testing
[ https://issues.apache.org/jira/browse/HADOOP-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735059#comment-13735059 ] Wei Yan commented on HADOOP-9848:
---
The javac warnings are irrelevant to the patch; they are caused by missing checksums in the jboss repository.
{code}
[WARNING] Checksum validation failed, no checksums available from the repository for http://repository.jboss.org/nexus/content/groups/public/com/github/stephenc/findbugs/findbugs-annotations/1.3.9-1/findbugs-annotations-1.3.9-1.pom
[WARNING] Checksum validation failed, no checksums available from the repository for http://repository.jboss.org/nexus/content/groups/public/com/github/stephenc/findbugs/findbugs-annotations/1.3.9-1/findbugs-annotations-1.3.9-1.jar
{code}
> Create a MiniKDC for use with security testing
> ---
>
> Key: HADOOP-9848
> URL: https://issues.apache.org/jira/browse/HADOOP-9848
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security, test
> Reporter: Wei Yan
> Assignee: Wei Yan
> Attachments: HADOOP-9848.patch, HADOOP-9848.patch
>
> Create a MiniKDC using Apache Directory Server. MiniKDC builds an embedded KDC (key distribution center) and allows creating principals and keytabs on the fly. MiniKDC can be integrated for Hadoop security unit testing.
[jira] [Commented] (HADOOP-9848) Create a MiniKDC for use with security testing
[ https://issues.apache.org/jira/browse/HADOOP-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735215#comment-13735215 ] Wei Yan commented on HADOOP-9848:
---
I'll upload an updated patch combining the previous comments.
> Create a MiniKDC for use with security testing
> ---
>
> Key: HADOOP-9848
> URL: https://issues.apache.org/jira/browse/HADOOP-9848
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security, test
> Reporter: Wei Yan
> Assignee: Wei Yan
> Attachments: HADOOP-9848.patch, HADOOP-9848.patch
>
> Create a MiniKDC using Apache Directory Server. MiniKDC builds an embedded KDC (key distribution center) and allows creating principals and keytabs on the fly. MiniKDC can be integrated for Hadoop security unit testing.
[jira] [Created] (HADOOP-9860) Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved
Wei Yan created HADOOP-9860:
---
Summary: Remove class HackedKeytab and HackedKeytabEncoder from hadoop-minikdc once jira DIRSERVER-1882 solved
Key: HADOOP-9860
URL: https://issues.apache.org/jira/browse/HADOOP-9860
Project: Hadoop Common
Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan

Remove class {{HackedKeytab}} and {{HackedKeytabEncoder}} from hadoop-minikdc (HADOOP-9848) once jira DIRSERVER-1882 is solved. Also update the apacheds version in the pom.xml.
[jira] [Updated] (HADOOP-9848) Create a MiniKDC for use with security testing
[ https://issues.apache.org/jira/browse/HADOOP-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9848:
---
Attachment: HADOOP-9848.patch
Uploaded a patch that creates a module hadoop-minikdc under hadoop-common-project. Testcases are also included, showing how to use MiniKdc.
> Create a MiniKDC for use with security testing
> ---
>
> Key: HADOOP-9848
> URL: https://issues.apache.org/jira/browse/HADOOP-9848
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security, test
> Reporter: Wei Yan
> Assignee: Wei Yan
> Attachments: HADOOP-9848.patch
>
> Create a MiniKDC using Apache Directory Server. MiniKDC builds an embedded KDC (key distribution center) and allows creating principals and keytabs on the fly. MiniKDC can be integrated for Hadoop security unit testing.
[jira] [Updated] (HADOOP-9848) Create a MiniKDC for use with security testing
[ https://issues.apache.org/jira/browse/HADOOP-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9848:
---
Status: Patch Available (was: Open)
> Create a MiniKDC for use with security testing
> ---
>
> Key: HADOOP-9848
> URL: https://issues.apache.org/jira/browse/HADOOP-9848
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security, test
> Reporter: Wei Yan
> Assignee: Wei Yan
> Attachments: HADOOP-9848.patch
>
> Create a MiniKDC using Apache Directory Server. MiniKDC builds an embedded KDC (key distribution center) and allows creating principals and keytabs on the fly. MiniKDC can be integrated for Hadoop security unit testing.
[jira] [Updated] (HADOOP-9848) Create a MiniKDC for use with security testing
[ https://issues.apache.org/jira/browse/HADOOP-9848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated HADOOP-9848:
---
Attachment: HADOOP-9848.patch
Updated the patch to remove the javadoc annotation warnings.
> Create a MiniKDC for use with security testing
> ---
>
> Key: HADOOP-9848
> URL: https://issues.apache.org/jira/browse/HADOOP-9848
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security, test
> Reporter: Wei Yan
> Assignee: Wei Yan
> Attachments: HADOOP-9848.patch, HADOOP-9848.patch
>
> Create a MiniKDC using Apache Directory Server. MiniKDC builds an embedded KDC (key distribution center) and allows creating principals and keytabs on the fly. MiniKDC can be integrated for Hadoop security unit testing.
[jira] [Created] (HADOOP-9848) Create a MiniKDC for use with security testing
Wei Yan created HADOOP-9848:
---
Summary: Create a MiniKDC for use with security testing
Key: HADOOP-9848
URL: https://issues.apache.org/jira/browse/HADOOP-9848
Project: Hadoop Common
Issue Type: New Feature
Components: security, test
Reporter: Wei Yan
Assignee: Wei Yan

Create a MiniKDC using Apache Directory Server. MiniKDC builds an embedded KDC (key distribution center) and allows creating principals and keytabs on the fly. MiniKDC can be integrated for Hadoop security unit testing.