[jira] [Commented] (LIVY-731) Session Allocation with server-session Mapping
[ https://issues.apache.org/jira/browse/LIVY-731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019352#comment-17019352 ] Yiheng Wang commented on LIVY-731: -- I'm working on it. > Session Allocation with server-session Mapping > -- > > Key: LIVY-731 > URL: https://issues.apache.org/jira/browse/LIVY-731 > Project: Livy > Issue Type: Sub-task >Reporter: Yiheng Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (LIVY-724) Support Session Lazy Recover
[ https://issues.apache.org/jira/browse/LIVY-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang updated LIVY-724: - Description: In this JIRA, we want to support recovering a session from the state store into Livy server memory when the session is first used. With the current single-node recovery, when a server restarts, it recovers the sessions in the state store in a batch. For the multi-active scenario, we propose a new session recovery solution: when a server leaves the cluster, its sessions are transferred to other servers. Instead of transferring them in a batch, each Livy session gets recovered when a request for it arrives. This saves session switch time. > Support Session Lazy Recover > > > Key: LIVY-724 > URL: https://issues.apache.org/jira/browse/LIVY-724 > Project: Livy > Issue Type: Sub-task > Reporter: Yiheng Wang > Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
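The lazy recovery described in LIVY-724 can be sketched as follows. This is an illustrative, hypothetical Python sketch (not Livy's actual code): a session is hydrated from the state store only when the first request for it arrives, instead of batch-recovering every session on server start.

```python
"""Illustrative sketch of lazy session recovery (hypothetical names,
not actual Livy code). A session is recovered into server memory only
when the first request for it arrives."""

class LazySessionManager:
    def __init__(self, state_store):
        self._state_store = state_store   # stand-in for a persistent state store
        self._sessions = {}               # sessions hydrated into this server

    def get_session(self, session_id):
        # Fast path: session already recovered into this server's memory.
        if session_id in self._sessions:
            return self._sessions[session_id]
        # Slow path: first request for this session since ownership moved
        # here -- recover it from the persisted metadata.
        meta = self._state_store.get(session_id)
        if meta is None:
            raise KeyError(f"unknown session {session_id}")
        session = {"id": session_id, "recovered": True, **meta}
        self._sessions[session_id] = session
        return session


# Usage: only session 42 gets hydrated; session 43 stays in the state store.
store = {42: {"appId": "application_1"}, 43: {"appId": "application_2"}}
mgr = LazySessionManager(store)
print(mgr.get_session(42)["appId"])   # application_1
print(len(mgr._sessions))             # 1
```

The point of the design is visible in the usage: recovery cost is paid per session on first hit, not all at once during failover.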
[jira] [Updated] (LIVY-723) Server Registration
[ https://issues.apache.org/jira/browse/LIVY-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang updated LIVY-723: - Description: In this JIRA, we want to implement a cluster manager that monitors cluster state changes. Each Livy server can register itself with the cluster and get notified when a new node joins or an existing node leaves the cluster. The cluster manager can be used in places like session allocation, service discovery, etc. The current implementation is based on ZooKeeper. > Server Registration > --- > > Key: LIVY-723 > URL: https://issues.apache.org/jira/browse/LIVY-723 > Project: Livy > Issue Type: Sub-task > Reporter: Yiheng Wang > Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
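The register-and-notify contract described in LIVY-723 can be sketched like this. This is a minimal in-memory Python sketch with hypothetical names; a real implementation would back membership with ZooKeeper ephemeral nodes so that a crashed server's registration disappears automatically and watchers are notified.

```python
"""In-memory sketch of the cluster-manager contract (hypothetical names):
servers register/deregister, and listeners are notified on join/leave.
A real implementation would use ZooKeeper ephemeral nodes and watches."""

class ClusterManager:
    def __init__(self):
        self._nodes = set()       # currently registered servers
        self._listeners = []      # callbacks invoked on membership change

    def register(self, node):
        self._nodes.add(node)
        for cb in self._listeners:
            cb("join", node)

    def deregister(self, node):
        self._nodes.discard(node)
        for cb in self._listeners:
            cb("leave", node)

    def watch(self, callback):
        self._listeners.append(callback)


# Usage: a watcher (e.g. the session allocator) observes join/leave events.
events = []
cm = ClusterManager()
cm.watch(lambda kind, node: events.append((kind, node)))
cm.register("livy-1:8998")
cm.register("livy-2:8998")
cm.deregister("livy-1:8998")
print(events)  # [('join', 'livy-1:8998'), ('join', 'livy-2:8998'), ('leave', 'livy-1:8998')]
```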
[jira] [Commented] (LIVY-723) Server Registration
[ https://issues.apache.org/jira/browse/LIVY-723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015609#comment-17015609 ] Yiheng Wang commented on LIVY-723: -- I'm working on it. > Server Registration > --- > > Key: LIVY-723 > URL: https://issues.apache.org/jira/browse/LIVY-723 > Project: Livy > Issue Type: Sub-task >Reporter: Yiheng Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (LIVY-744) Livy Job API support multi-active HA
Yiheng Wang created LIVY-744: Summary: Livy Job API support multi-active HA Key: LIVY-744 URL: https://issues.apache.org/jira/browse/LIVY-744 Project: Livy Issue Type: Sub-task Reporter: Yiheng Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (LIVY-724) Support Session Lazy Recover
[ https://issues.apache.org/jira/browse/LIVY-724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang updated LIVY-724: - Summary: Support Session Lazy Recover (was: Support Server Leave/Join and Session Cross Server Recovery) > Support Session Lazy Recover > > > Key: LIVY-724 > URL: https://issues.apache.org/jira/browse/LIVY-724 > Project: Livy > Issue Type: Sub-task >Reporter: Yiheng Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (LIVY-718) Support multi-active high availability in Livy
[ https://issues.apache.org/jira/browse/LIVY-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015606#comment-17015606 ] Yiheng Wang commented on LIVY-718: -- I updated the design doc based on the recent discussions in this JIRA. The major changes are: # Refine the solution architecture section # Add a new allocateServer method to the allocator interface # Add details for the node-session mapping allocation method [~mgaido] # Update getAllSession [~mgaido] # Refine the section comparing client-side routing and server-side routing # Add a new section "Load Balancer", which gives an example of how to put Livy servers behind a load balancer when using client-side routing [~bikassaha] [~meisam] # Add a new section "Session Recover", which describes recovering a session object lazily (when a request for that session arrives), which can be leveraged in the multi-designated server solution [~bikassaha] # Remove the session recovery on server failover # Move the multi-designated server solution to non-goals and add a new section "Multi-designate Server Solution Extension" discussing how to extend the current design into a multi-designated server solution [~bikassaha] > Support multi-active high availability in Livy > -- > > Key: LIVY-718 > URL: https://issues.apache.org/jira/browse/LIVY-718 > Project: Livy > Issue Type: Epic > Components: RSC, Server > Reporter: Yiheng Wang > Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > In this JIRA we want to discuss how to implement multi-active high availability in Livy. > Currently, Livy only supports single-node recovery. This is not sufficient in some production environments. In our scenario, the Livy server serves many notebook and JDBC services. We want to make the Livy service more fault-tolerant and scalable. > There are already some proposals in the community for high availability, but they are either incomplete or only cover active-standby high availability. So we propose a multi-active high availability design to achieve the following goals: > # One or more servers will serve client requests at the same time. > # Sessions are allocated among different servers. > # When one node crashes, the affected sessions will be moved to other active servers. > Here's our design document, please review and comment: > https://docs.google.com/document/d/1bD3qYZpw14_NuCcSGUOfqQ0pqvSbCQsOLFuZp26Ohjc/edit?usp=sharing > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (LIVY-725) Support Route Request
[ https://issues.apache.org/jira/browse/LIVY-725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang reassigned LIVY-725: Assignee: (was: mingchao zhao) > Support Route Request > - > > Key: LIVY-725 > URL: https://issues.apache.org/jira/browse/LIVY-725 > Project: Livy > Issue Type: Sub-task >Reporter: Yiheng Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (LIVY-718) Support multi-active high availability in Livy
[ https://issues.apache.org/jira/browse/LIVY-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011909#comment-17011909 ] Yiheng Wang commented on LIVY-718: -- Hi [~bikassaha] I agree that allowing multiple servers to connect to one Spark driver is the more flexible and ideal solution. But I think the current one-designated-server solution does not conflict with it. The difference is that the multiple-server solution brings in the state consistency problem, which needs extra effort to optimize the existing Livy server code. The other parts are compatible. As you commented, the ideal solution is: 1. The user sends a request to a Livy cluster 2. The request may be handled by the current server or routed to another server (based on a configured strategy) 3. The server handles the request. It may take some time to hydrate the state if it's the first hit For step 2, the one-designated-server strategy is to always route to the same server. It does not allow routing to different servers, which avoids the state consistency problem. For step 3, we can change when the session object and RPC connection are initialized in the server: from server failure/new server join to the session's first hit, which also works in the one-designated solution. I did some digging these days into how to optimize the Livy server code to fix the state consistency problem. Here are the work items to achieve it: ||Component||Object||Effort|| |Yarn Job|SparkYarnApp|SparkYarnApp contains the YARN application log. It is stored in the memory of the server that created the session, which may not be accessible by other servers. I think it is not suitable to put in the state store. Batch sessions also have this problem, so we cannot push it to the RSC driver.| |Session|last activity time|Push to the driver for interactive sessions. Push to the state store for batch sessions, as it is always the start time for a batch session| |Interactive session|operations (statement records)|Push to the driver| |Interactive session|operationCount|Push to the driver| |Session|appId|appId is empty when the session metadata is first persisted to the state store and is filled in after a while. We need to ensure consistency across servers| |Interactive session|RSC driver URL|Same as the above| |Session|session state|The session state is updated by a SparkYarnApp thread. There is already some inconsistency between the Livy server and YARN. With multiple servers, the inconsistency will be amplified, as the thread check time differs across servers. One solution is to query YARN, but that makes many queries much slower. Another solution is to put it in the state store, which also adds overhead| |Thrift|Fetched row count|Push to the driver| |Thrift|Mapping from thrift session to livy session|Put in the state store| |Thrift|Operation states|Push to the driver; touches 1/2 of the thrift code| I think this is quite a lot of effort and touches the existing code base, which brings many risks. I suggest we make it a separate goal. Another discussion point is the session ID. I'm very worried that changing it breaks API compatibility. We may need to upgrade the API version to V2, and it's a lot of effort for people to migrate existing code. I strongly suggest sticking with the int session ID and making the change a separate goal. To summarize, my suggestion is: 1. Change the design to make it compatible with the multi-server solution 2. Implement the one-designated solution 3. Make the multi-server solution a separate goal and do it in another JIRA 3.1 Make Livy server behavior consistent across the cluster (see the table) 3.2 Add a new route strategy 4. Stick with the int session ID What do you think? 
> Support multi-active high availability in Livy > -- > > Key: LIVY-718 > URL: https://issues.apache.org/jira/browse/LIVY-718 > Project: Livy > Issue Type: Epic > Components: RSC, Server > Reporter: Yiheng Wang > Priority: Major >
[jira] [Commented] (LIVY-718) Support multi-active high availability in Livy
[ https://issues.apache.org/jira/browse/LIVY-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006078#comment-17006078 ] Yiheng Wang commented on LIVY-718: -- [~bikassaha] Compared to the designated-server solution, I think the stateless server solution gains availability by sacrificing scalability. Against this background, one concern is memory. We observed that when the number of running sessions grows to 400~500, the Livy server process consumes about 2 GB of memory. Another concern is that Livy uses long-lived connections between the server and Spark drivers. Say there are M servers and N sessions. In the designated solution there are N connections; in the stateless solution there are M x N connections. I'm afraid this may bring a lot of overhead in RPC communication (e.g. serialization, routing). > Support multi-active high availability in Livy > -- > > Key: LIVY-718 > URL: https://issues.apache.org/jira/browse/LIVY-718 > Project: Livy > Issue Type: Epic > Components: RSC, Server > Reporter: Yiheng Wang > Priority: Major >
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (LIVY-718) Support multi-active high availability in Livy
[ https://issues.apache.org/jira/browse/LIVY-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006069#comment-17006069 ] Yiheng Wang edited comment on LIVY-718 at 12/31/19 12:28 PM: - bq. When a server fails, its sessions become unavailable until other servers are designated to handle them. This was not acceptable behavior, at least for clusters that I worked with in my previous job. [~meisam] Currently, Livy only supports single-node failure recovery. Do you use Livy in that cluster? If so, how do you handle the downtime? was (Author: yihengw): bq. When a server fails, its sessions become unavailable until other servers are designated to handle them. This was not acceptable behavior, at least for clusters that I worked with in my previous job. [~meisam] Currently, Livy only supports single-node failure recovery. Do you use Livy in that cluster? If so, would you like to share your solution? > Support multi-active high availability in Livy > -- > > Key: LIVY-718 > URL: https://issues.apache.org/jira/browse/LIVY-718 > Project: Livy > Issue Type: Epic > Components: RSC, Server > Reporter: Yiheng Wang > Priority: Major >
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (LIVY-718) Support multi-active high availability in Livy
[ https://issues.apache.org/jira/browse/LIVY-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006069#comment-17006069 ] Yiheng Wang commented on LIVY-718: -- bq. When a server fails, its sessions become unavailable until other servers are designated to handle them. This was not acceptable behavior, at least for clusters that I worked with in my previous job. [~meisam] Currently, Livy only supports single-node failure recovery. Do you use Livy in that cluster? If so, would you like to share your solution? > Support multi-active high availability in Livy > -- > > Key: LIVY-718 > URL: https://issues.apache.org/jira/browse/LIVY-718 > Project: Livy > Issue Type: Epic > Components: RSC, Server > Reporter: Yiheng Wang > Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (LIVY-718) Support multi-active high availability in Livy
[ https://issues.apache.org/jira/browse/LIVY-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17005203#comment-17005203 ] Yiheng Wang commented on LIVY-718: -- Thanks for your comments [~bikassaha] and [~meisam]. I have summarized the discussion points and put my comments below (please point out anything I missed). h4. Designated Server - Is it because there are issues with multiple servers handling multiple clients to the same session? The issues include: 1. The Livy server needs to monitor Spark sessions. If we remove the designated server, each server may need to monitor all sessions. That is wasteful, and there may be inconsistencies among servers. 2. Besides session data, the Livy server also stores other data, such as the application log and last active time, in memory. Such information has a higher update rate, so it is not suitable to store in a state-store backend like ZooKeeper. 3. If multiple servers serve one session, we need to add some kind of lock mechanism to handle concurrent state-change requests (e.g. stop session). h4. Session id - I would strongly suggest deprecating the integral session id I agree that the incremental integral session ID is not necessary for Livy. For changing it to a UUID, my biggest concern is compatibility with the earlier API (the session-id data type may need to change from Int to String). It's quite a big move, so we chose a conservative way in the design. h4. Dependency on ZK ZK is introduced to resolve the above two problems (server status change notification and unique ID generation). If we don't need the designated server and the incremental session ID, I think we can remove ZK. h4. Service discovery, Ease of use of the API and the number of ports in the firewall that needs to be opened for Livy HA can become a security concern I think the point here is a single URL for the Livy cluster. It depends on the first question. If we can remove the designated server, we just need to put a load balancer in front of all the servers. If we keep the designated-server design, we can use an HTTP load balancer that is aware of 307 responses. Currently, we use a 307 response to route the request to the designated server on the client side; the response can be handled automatically by some load balancers. I think the key point of the discussion is the designated server. Please let me know your suggestions on the issues I listed, and let's see if we can improve the design proposal. > Support multi-active high availability in Livy > -- > > Key: LIVY-718 > URL: https://issues.apache.org/jira/browse/LIVY-718 > Project: Livy > Issue Type: Epic > Components: RSC, Server > Reporter: Yiheng Wang > Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
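The 307-based routing described above can be sketched as a simple server-side decision. This is a hypothetical Python sketch (names and URLs invented for illustration, not Livy's actual code): each session has one designated server; any other server answers with a 307 redirect to the owner, which redirect-aware clients and load balancers can follow transparently.

```python
"""Sketch of designated-server request routing via 307 redirects
(hypothetical names and URLs, not actual Livy code)."""

# Session -> designated server mapping (in practice kept in a shared store).
SESSION_OWNER = {101: "http://livy-1:8998", 102: "http://livy-2:8998"}

def route(self_url, session_id):
    """Decide whether this server handles the request or redirects it."""
    owner = SESSION_OWNER.get(session_id)
    if owner is None:
        return (404, None)                       # unknown session
    if owner == self_url:
        return (200, None)                       # handle locally
    # 307 (not 302) preserves the HTTP method and body on redirect,
    # which matters for POST/DELETE session operations.
    return (307, f"{owner}/sessions/{session_id}")


print(route("http://livy-1:8998", 101))  # (200, None)
print(route("http://livy-1:8998", 102))  # (307, 'http://livy-2:8998/sessions/102')
```

The choice of 307 over 302 is the key detail: it lets a redirect-aware load balancer replay the original method against the designated server without client changes.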
[jira] [Updated] (LIVY-735) Fix RPC Channel Closed When Multi Clients Connect to One Driver
[ https://issues.apache.org/jira/browse/LIVY-735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang updated LIVY-735: - Summary: Fix RPC Channel Closed When Multi Clients Connect to One Driver (was: Fix RPC Channel Closed When multi clients connect to one driver ) > Fix RPC Channel Closed When Multi Clients Connect to One Driver > > > Key: LIVY-735 > URL: https://issues.apache.org/jira/browse/LIVY-735 > Project: Livy > Issue Type: Sub-task >Reporter: Yiheng Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (LIVY-735) Fix RPC Channel Closed When multi clients connect to one driver
[ https://issues.apache.org/jira/browse/LIVY-735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang updated LIVY-735: - Summary: Fix RPC Channel Closed When multi clients connect to one driver (was: Fix RPC Channel closed when multi clients connect to one driver ) > Fix RPC Channel Closed When multi clients connect to one driver > > > Key: LIVY-735 > URL: https://issues.apache.org/jira/browse/LIVY-735 > Project: Livy > Issue Type: Sub-task >Reporter: Yiheng Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (LIVY-735) Fix RPC Channel closed when multi clients connect to one driver
Yiheng Wang created LIVY-735: Summary: Fix RPC Channel closed when multi clients connect to one driver Key: LIVY-735 URL: https://issues.apache.org/jira/browse/LIVY-735 Project: Livy Issue Type: Sub-task Reporter: Yiheng Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (LIVY-734) Support getAllSessions in Cluster
Yiheng Wang created LIVY-734: Summary: Support getAllSessions in Cluster Key: LIVY-734 URL: https://issues.apache.org/jira/browse/LIVY-734 Project: Livy Issue Type: Sub-task Reporter: Yiheng Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (LIVY-733) Support Service Discovery
Yiheng Wang created LIVY-733: Summary: Support Service Discovery Key: LIVY-733 URL: https://issues.apache.org/jira/browse/LIVY-733 Project: Livy Issue Type: Sub-task Reporter: Yiheng Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (LIVY-725) Support Route Request
[ https://issues.apache.org/jira/browse/LIVY-725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang updated LIVY-725: - Summary: Support Route Request (was: Support Route Request and Server Discovery) > Support Route Request > - > > Key: LIVY-725 > URL: https://issues.apache.org/jira/browse/LIVY-725 > Project: Livy > Issue Type: Sub-task >Reporter: Yiheng Wang >Assignee: mingchao zhao >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (LIVY-732) A Common Zookeeper Wrapper Utility
Yiheng Wang created LIVY-732: Summary: A Common Zookeeper Wrapper Utility Key: LIVY-732 URL: https://issues.apache.org/jira/browse/LIVY-732 Project: Livy Issue Type: Sub-task Reporter: Yiheng Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (LIVY-731) Session Allocation with server-session Mapping
Yiheng Wang created LIVY-731: Summary: Session Allocation with server-session Mapping Key: LIVY-731 URL: https://issues.apache.org/jira/browse/LIVY-731 Project: Livy Issue Type: Sub-task Reporter: Yiheng Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (LIVY-725) Support Route Request and Server Discovery
Yiheng Wang created LIVY-725: Summary: Support Route Request and Server Discovery Key: LIVY-725 URL: https://issues.apache.org/jira/browse/LIVY-725 Project: Livy Issue Type: Sub-task Reporter: Yiheng Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (LIVY-724) Support Server Leave/Join and Session Cross Server Recovery
Yiheng Wang created LIVY-724: Summary: Support Server Leave/Join and Session Cross Server Recovery Key: LIVY-724 URL: https://issues.apache.org/jira/browse/LIVY-724 Project: Livy Issue Type: Sub-task Reporter: Yiheng Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (LIVY-723) Server Registration
Yiheng Wang created LIVY-723: Summary: Server Registration Key: LIVY-723 URL: https://issues.apache.org/jira/browse/LIVY-723 Project: Livy Issue Type: Sub-task Reporter: Yiheng Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (LIVY-722) Session Allocation with Consistent Hashing
Yiheng Wang created LIVY-722: Summary: Session Allocation with Consistent Hashing Key: LIVY-722 URL: https://issues.apache.org/jira/browse/LIVY-722 Project: Livy Issue Type: Sub-task Reporter: Yiheng Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
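The consistent-hashing allocation named in LIVY-722 can be sketched as a standard hash ring. This is an illustrative Python sketch (server names hypothetical): each server contributes several virtual points on the ring, a session maps to the first server point clockwise from its hash, and adding or removing a server only remaps a small fraction of sessions.

```python
"""Sketch of consistent-hashing session allocation (illustrative only;
server names are hypothetical, not Livy's actual implementation)."""
import bisect
import hashlib

def _hash(key):
    # Stable hash independent of Python's per-process hash randomization.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers, replicas=64):
        # Each server gets `replicas` virtual points for smoother balance.
        self._ring = sorted(
            (_hash(f"{s}#{i}"), s) for s in servers for i in range(replicas)
        )
        self._keys = [h for h, _ in self._ring]

    def allocate(self, session_id):
        # First virtual point clockwise from the session's hash.
        idx = bisect.bisect(self._keys, _hash(str(session_id))) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["livy-1", "livy-2", "livy-3"])
# The same session id always maps to the same server.
assert ring.allocate(7) == ring.allocate(7)
```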
[jira] [Updated] (LIVY-721) Distributed Session ID Generation
[ https://issues.apache.org/jira/browse/LIVY-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang updated LIVY-721: - Summary: Distributed Session ID Generation (was: Distributed Session ID generation) > Distributed Session ID Generation > - > > Key: LIVY-721 > URL: https://issues.apache.org/jira/browse/LIVY-721 > Project: Livy > Issue Type: Sub-task >Reporter: Yiheng Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (LIVY-721) Distributed Session ID generation
Yiheng Wang created LIVY-721: Summary: Distributed Session ID generation Key: LIVY-721 URL: https://issues.apache.org/jira/browse/LIVY-721 Project: Livy Issue Type: Sub-task Reporter: Yiheng Wang -- This message was sent by Atlassian Jira (v8.3.4#803005)
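One common way to generate distributed yet still-integer session IDs, as LIVY-721's title suggests and as the int-session-id discussion in LIVY-718 argues for, is block allocation: each server atomically reserves a block of IDs from a shared counter (with ZooKeeper this would be a versioned znode updated via compare-and-set). The sketch below is illustrative Python with hypothetical names, using a lock as a stand-in for the shared store's atomic update.

```python
"""Sketch of distributed int session-id generation via block allocation
(hypothetical names; a lock stands in for a ZK compare-and-set counter)."""
import threading

class SharedCounter:
    """Stand-in for a shared, atomically-updated counter (e.g. a ZK znode)."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def reserve_block(self, size):
        with self._lock:              # a CAS retry loop in a real ZK store
            start = self._value
            self._value += size
            return start

class SessionIdGenerator:
    """Each server hands out ids from its reserved block, refilling as needed."""
    def __init__(self, counter, block_size=100):
        self._counter, self._block = counter, block_size
        self._next = self._limit = 0

    def next_id(self):
        if self._next == self._limit:  # block exhausted: reserve a new one
            self._next = self._counter.reserve_block(self._block)
            self._limit = self._next + self._block
        self._next += 1
        return self._next - 1


counter = SharedCounter()
a, b = SessionIdGenerator(counter), SessionIdGenerator(counter)
print([a.next_id(), a.next_id(), b.next_id()])  # [0, 1, 100]
```

IDs stay small integers and API-compatible; the trade-off is that IDs are unique and roughly increasing, but not globally contiguous across servers.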
[jira] [Commented] (LIVY-719) Livy is not high available in aws EMR cluster
[ https://issues.apache.org/jira/browse/LIVY-719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985791#comment-16985791 ] Yiheng Wang commented on LIVY-719: -- Hi, we have proposed a JIRA for Livy HA: https://issues.apache.org/jira/browse/LIVY-718 > Livy is not high available in aws EMR cluster > - > > Key: LIVY-719 > URL: https://issues.apache.org/jira/browse/LIVY-719 > Project: Livy > Issue Type: Bug > Reporter: chiranjeevi patel > Priority: Major > > Hi, > We have been using Livy in our EMR cluster to submit batch jobs. Now, > after upgrading to EMR 5.0.24 with multi-master capability, Livy is not able > to handle the sessions on master failover. > Livy is installed on all three master nodes and runs as an independent > application, so a master node failure brings down the Livy session. > EMR was able to switch the active master to the standby one, but our jobs > were failing, since we lost the session with the previous active master and we > need to explicitly connect to the new active master with a different IP. > > Do you have any plans to support multi-master (HA) clusters? > I have asked the AWS support team, and they responded that this depends on > many open-source issues in Livy, and they can work on it only when those issues are > sorted out. > Let me know if you need any further details > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (LIVY-718) Support multi-active high availability in Livy
Yiheng Wang created LIVY-718: Summary: Support multi-active high availability in Livy Key: LIVY-718 URL: https://issues.apache.org/jira/browse/LIVY-718 Project: Livy Issue Type: Epic Components: RSC, Server Reporter: Yiheng Wang In this JIRA we want to discuss how to implement multi-active high availability in Livy. Currently, Livy only supports single-node recovery. This is not sufficient in some production environments. In our scenario, the Livy server serves many notebook and JDBC services. We want to make the Livy service more fault-tolerant and scalable. There are already some proposals in the community for high availability, but they are either incomplete or only cover active-standby high availability. So we propose a multi-active high availability design to achieve the following goals: # One or more servers will serve client requests at the same time. # Sessions are allocated among different servers. # When one node crashes, the affected sessions will be moved to other active servers. Here's our design document, please review and comment: https://docs.google.com/document/d/1bD3qYZpw14_NuCcSGUOfqQ0pqvSbCQsOLFuZp26Ohjc/edit?usp=sharing -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (LIVY-712) EMR 5.23/5.27 - Livy does not recognise that Spark job failed
[ https://issues.apache.org/jira/browse/LIVY-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979272#comment-16979272 ] Yiheng Wang commented on LIVY-712: -- Can you provide a way to reproduce the issue? > EMR 5.23/5.27 - Livy does not recognise that Spark job failed > - > > Key: LIVY-712 > URL: https://issues.apache.org/jira/browse/LIVY-712 > Project: Livy > Issue Type: Bug > Components: API > Affects Versions: 0.5.0, 0.6.0 > Environment: AWS EMR 5.23/5.27, Scala > Reporter: Michal Sankot > Priority: Major > Labels: EMR, api, spark > > We've upgraded from AWS EMR 5.13 -> 5.23 (Livy 0.4.0 -> 0.5.0, Spark 2.3.0 -> > 2.4.0) and an issue appears: when an exception is thrown during > Spark job execution, Spark shuts down as if there was no problem and the job > appears as Completed in EMR, so we're not notified when the system crashes. The > same problem appears in EMR 5.27 (Livy 0.6.0, Spark 2.4.4). > Is it something with Spark? Or a known issue with Livy? > In the Livy logs I see that spark-submit exited with error code 1 > {quote}{{05:34:59 WARN BatchSession$: spark-submit exited with code 1}} > {quote} > And then the Livy API states that the batch state is > {quote}{{"state": "success"}} > {quote} > How can it be made to work again? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (LIVY-712) EMR 5.23/5.27 - Livy does not recognise that Spark job failed
[ https://issues.apache.org/jira/browse/LIVY-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979271#comment-16979271 ] Yiheng Wang commented on LIVY-712: -- This code was changed in this patch: https://github.com/apache/incubator-livy/commit/ca4cad22968e1a2f88fa0ec262c1088812e3d251 [~jshao] Any suggestions about this? > EMR 5.23/5.27 - Livy does not recognise that Spark job failed
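The mismatch reported here (spark-submit exits with code 1 while the batch state reads "success") can be expressed as a small reconciliation check. This is a hypothetical helper for illustration, not Livy's actual code; only the log-line format is taken from the report above.

```python
import re

# Log line format quoted in the report above.
LOG_PATTERN = re.compile(r"spark-submit exited with code (\d+)")

def exit_code_from_log(line):
    """Extract the spark-submit exit code from a Livy log line, if present."""
    m = LOG_PATTERN.search(line)
    return int(m.group(1)) if m else None

def reconcile_batch_state(reported_state, exit_code):
    """Hypothetical reconciliation: a non-zero spark-submit exit code
    should override a reported 'success' batch state."""
    if exit_code is not None and exit_code != 0:
        return "dead"
    return reported_state
```

Under this check, the quoted log line plus a reported "success" state would resolve to a failed batch rather than a successful one.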
[jira] [Commented] (LIVY-569) BinaryClassificationMetrics give AttributeError
[ https://issues.apache.org/jira/browse/LIVY-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969212#comment-16969212 ] Yiheng Wang commented on LIVY-569: -- From the latest community version of Livy, I cannot reproduce the error. Maybe you need to check with the AWS EMR team... > BinaryClassificationMetrics give AttributeError > > Key: LIVY-569 > URL: https://issues.apache.org/jira/browse/LIVY-569 > Project: Livy > Issue Type: Bug > Reporter: Fred de Gier > Priority: Major > > When using BinaryClassificationMetrics there is an error, possibly related to conversion of data types. This only occurs when using Sparkmagic and Livy. > *Input:* > {code:java} > from pyspark import SparkContext > from pyspark.sql import SparkSession > import pyspark > from pyspark.mllib.evaluation import BinaryClassificationMetrics > a = sc.parallelize([ > (0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0), > (0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0), > (0.0, 1.0), (0.0, 1.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0), > (0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0), > (0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0) > ]) > metrics = BinaryClassificationMetrics(a) > {code} > > *Output:* > {code:java} > 'StructField' object has no attribute '_get_object_id' {code}
[jira] [Commented] (LIVY-569) BinaryClassificationMetrics give AttributeError
[ https://issues.apache.org/jira/browse/LIVY-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964538#comment-16964538 ] Yiheng Wang commented on LIVY-569: -- From the comments on the Stack Overflow question, it looks like it's related to AWS EMR Livy. Are you also using AWS EMR Livy? > BinaryClassificationMetrics give AttributeError
[jira] [Commented] (LIVY-569) BinaryClassificationMetrics give AttributeError
[ https://issues.apache.org/jira/browse/LIVY-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962882#comment-16962882 ] Yiheng Wang commented on LIVY-569: -- Hi, which version of Spark/Livy are you using? I cannot reproduce this error. Please check the attachment. !屏幕快照 2019-10-30 下午6.03.04.png! > BinaryClassificationMetrics give AttributeError
[jira] [Comment Edited] (LIVY-569) BinaryClassificationMetrics give AttributeError
[ https://issues.apache.org/jira/browse/LIVY-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962882#comment-16962882 ] Yiheng Wang edited comment on LIVY-569 at 10/30/19 10:05 AM: - Hi, which version of Spark/Livy are you using? I cannot reproduce this error. was (Author: yihengw): Hi, which version of spark/livy are you using. I cannot reproduce this error. Please check the attachment. !屏幕快照 2019-10-30 下午6.03.04.png! > BinaryClassificationMetrics give AttributeError
[jira] [Updated] (LIVY-696) Support recovery in livy thrift server
[ https://issues.apache.org/jira/browse/LIVY-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang updated LIVY-696: - Description: This JIRA is to discuss supporting recovery in the Livy Thrift server. The Livy server supports recovery: if users set *livy.server.recovery.mode* to *recovery* and configure the *state-store*, the Livy server persists its state, and once the Livy server crashes, the sessions are restored after the service restarts. Recovery is not yet supported in the Thrift server: after the service restarts, all JDBC connections and statements are lost. Should we support recovery in the Thrift server? There is one concern. In JDBC binary mode, the connection is lost if the server crashes, and it looks like hive-jdbc cannot reuse the session-id when reconnecting to the server. In JDBC http mode, however, we will benefit from recovery, as http mode uses short http connections instead of a long connection. There are also two levels of recovery to consider. *Session recovery* lets users still use their JDBC connection after a service restart. It should be straightforward to implement: we just need to persist the JDBC session/livy session mapping to the state-store and recover the mapping when restarting. *Statement recovery* lets users still use a JDBC statement (or fetch results) after a service restart. This requires persisting ExecuteStatementOperation state. The concern is that some states are not suitable to persist to the state-store, e.g. operationMessages, rowOffset, state, operationException, backgroundHandle, lastAccessTime, operationComplete: they are complex types or change frequently. was: (previous description, with *Operation recovery* instead of *Statement recovery*) > Support recovery in livy thrift server
[jira] [Created] (LIVY-696) Support recovery in livy thrift server
Yiheng Wang created LIVY-696: Summary: Support recovery in livy thrift server Key: LIVY-696 URL: https://issues.apache.org/jira/browse/LIVY-696 Project: Livy Issue Type: Improvement Components: Thriftserver Reporter: Yiheng Wang This JIRA is opened to discuss supporting recovery in the Livy Thrift server. The Livy server supports recovery: if users set *livy.server.recovery.mode* to *recovery* and configure the *state-store*, the Livy server persists its state, and once the Livy server crashes, the sessions are restored after the service restarts. Recovery is not yet supported in the Thrift server: after the service restarts, all JDBC sessions and operations are lost. Should we support recovery in the Thrift server? There is one concern. In JDBC binary mode, the connection is lost if the server crashes, and it looks like hive-jdbc cannot reuse the session-id when reconnecting to the server. In JDBC http mode, however, we will benefit from recovery, as http mode uses short http connections instead of a long connection. There are also two levels of recovery to consider. *Session recovery* lets users still use their JDBC connection after a service restart. It should be straightforward to implement: we just need to persist the JDBC session/livy session mapping to the state-store and recover the mapping when restarting. *Operation recovery* lets users still use a JDBC statement (or fetch results) after a service restart. This requires persisting ExecuteStatementOperation state. The concern is that some states are not suitable to persist to the state-store, e.g. operationMessages, rowOffset, state, operationException, backgroundHandle, lastAccessTime, operationComplete.
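For reference, the server-side recovery described above is driven by the configuration keys named in the description. A minimal livy.conf fragment might look like the following; the state-store type and URL values are illustrative assumptions, and the `.url` key name is taken from Livy's configuration template rather than this issue.

```properties
# Hedged example: enable Livy server recovery
# (keys from the description above; values are illustrative)
livy.server.recovery.mode = recovery
livy.server.recovery.state-store = zookeeper
livy.server.recovery.state-store.url = zk1:2181,zk2:2181,zk3:2181
```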
[jira] [Created] (LIVY-691) Duplicate jars in assembly jar folder
Yiheng Wang created LIVY-691: Summary: Duplicate jars in assembly jar folder Key: LIVY-691 URL: https://issues.apache.org/jira/browse/LIVY-691 Project: Livy Issue Type: Bug Reporter: Yiheng Wang We found that jars of different versions of the same library are generated in the assembly jar folder. This may cause problems if the APIs are incompatible. So far we have found:
-rw-r--r-- 1 root root  966102 Sep 29 14:53 apache-jsp-8.0.33.jar
-rw-r--r-- 1 root root   11012 Sep 29 14:53 apache-jsp-9.3.24.v20180605.jar
...
-rw-r--r-- 1 root root   38015 Sep 29 14:53 commons-logging-1.0.4.jar
-rw-r--r-- 1 root root   61829 Sep 29 14:52 commons-logging-1.2.jar
...
-rw-r--r-- 1 root root  177131 Sep 29 14:52 jetty-util-6.1.26.jar
-rw-r--r-- 1 root root  458642 Sep 29 14:52 jetty-util-9.3.24.v20180605.jar
...
-rw-r--r-- 1 root root   50493 Sep 29 14:53 jsp-api-2.0.jar
-rw-r--r-- 1 root root  100636 Sep 29 14:52 jsp-api-2.1.jar
...
-rw-r--r-- 1 root root 1208356 Sep 29 14:53 netty-3.7.0.Final.jar
-rw-r--r-- 1 root root 1330219 Sep 29 14:52 netty-3.9.9.Final.jar
[jira] [Commented] (LIVY-690) Exclude curator in thrift server pom to avoid conflict jars
[ https://issues.apache.org/jira/browse/LIVY-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940216#comment-16940216 ] Yiheng Wang commented on LIVY-690: -- This will be fixed by this patch: https://github.com/apache/incubator-livy/pull/239 > Exclude curator in thrift server pom to avoid conflict jars > > Key: LIVY-690 > URL: https://issues.apache.org/jira/browse/LIVY-690 > Project: Livy > Issue Type: Bug > Components: Thriftserver > Affects Versions: 0.6.0 > Reporter: Yiheng Wang > Priority: Major > Fix For: 0.7.0 > > > Currently, the thrift server has a dependency on curator-client:2.12.0 through the hive service. After the build, a curator-client-2.12.0.jar file is generated in the jars folder. It conflicts with the curator-client-2.7.1.jar file used by the livy server. > We observed that with some JDKs, curator-client-2.12.0.jar is loaded before curator-client-2.7.1.jar, which crashes a recovery-enabled livy server.
[jira] [Created] (LIVY-690) Exclude curator in thrift server pom to avoid conflict jars
Yiheng Wang created LIVY-690: Summary: Exclude curator in thrift server pom to avoid conflict jars Key: LIVY-690 URL: https://issues.apache.org/jira/browse/LIVY-690 Project: Livy Issue Type: Bug Components: Thriftserver Affects Versions: 0.6.0 Reporter: Yiheng Wang Fix For: 0.7.0 Currently, the thrift server has a dependency on curator-client:2.12.0 through the hive service. After the build, a curator-client-2.12.0.jar file is generated in the jars folder. It conflicts with the curator-client-2.7.1.jar file used by the livy server. We observed that with some JDKs, curator-client-2.12.0.jar is loaded before curator-client-2.7.1.jar, which crashes a recovery-enabled livy server.
[jira] [Commented] (LIVY-667) Support query a lot of data.
[ https://issues.apache.org/jira/browse/LIVY-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935829#comment-16935829 ] Yiheng Wang commented on LIVY-667: -- Hi Marco. I think Spark computes on partition data through an iterator interface, so the executor may not load the whole partition into memory. > Support query a lot of data. > > Key: LIVY-667 > URL: https://issues.apache.org/jira/browse/LIVY-667 > Project: Livy > Issue Type: Bug > Components: Thriftserver > Affects Versions: 0.6.0 > Reporter: runzhiwang > Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > When livy.server.thrift.incrementalCollect is enabled, the thrift server uses toLocalIterator to load one partition at a time instead of the whole RDD, to avoid OutOfMemory. However, if the largest partition is too big, OutOfMemory can still occur.
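The point about the iterator interface can be illustrated with a plain-Python analogy (this is not Spark code): a generator is consumed one row at a time, like an executor folding over a partition iterator, whereas materializing all rows first is what risks OutOfMemory.

```python
def rows(n):
    """Generator analogous to a per-partition iterator: yields one row
    at a time instead of materializing the whole partition."""
    for i in range(n):
        yield i

def max_streaming(row_iter):
    # Consumes rows one by one; peak memory is O(1) rows,
    # like an executor folding over a partition iterator.
    best = None
    for r in row_iter:
        if best is None or r > best:
            best = r
    return best

def max_materialized(n):
    # Loads every row into a list first; peak memory is O(n) rows.
    return max(list(rows(n)))
```

As the issue notes, though, incrementalCollect still pulls one whole partition at a time to the driver, so a single oversized partition can still exhaust memory.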
[jira] [Commented] (LIVY-684) Livy server support zookeeper service discovery
[ https://issues.apache.org/jira/browse/LIVY-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933110#comment-16933110 ] Yiheng Wang commented on LIVY-684: -- This is a duplicate of https://issues.apache.org/jira/browse/LIVY-616 > Livy server support zookeeper service discovery > > Key: LIVY-684 > URL: https://issues.apache.org/jira/browse/LIVY-684 > Project: Livy > Issue Type: New Feature > Reporter: Zhefeng Wang > Priority: Minor > > Livy server does not yet support service discovery, which is widely used for high availability.
[jira] [Commented] (LIVY-664) Spark application still running when Livy session creating was rejected
[ https://issues.apache.org/jira/browse/LIVY-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931300#comment-16931300 ] Yiheng Wang commented on LIVY-664: -- Session stop code: [https://github.com/apache/incubator-livy/blob/master/server/src/main/scala/org/apache/livy/sessions/SessionManager.scala#L103] > Spark application still running when Livy session creating was rejected > > Key: LIVY-664 > URL: https://issues.apache.org/jira/browse/LIVY-664 > Project: Livy > Issue Type: Bug > Reporter: Oleksandr Shevchenko > Priority: Major > Attachments: image-2019-09-08-20-38-50-195.png, > image-2019-09-08-20-39-18-569.png > > Time Spent: 10m > Remaining Estimate: 0h > > Steps to reproduce: > 1. Create a session with some name > 2. Create a second session with the same name > 2.1 The second session creation will be rejected, since duplicate session names are not allowed. > 2.2 The Spark application will be submitted, but the Livy session won't be created. > Result: the Spark application was submitted, but the Livy session was not created. > Expected result: Livy session creation is rejected AND the Spark application finishes in a failed or killed state, or is not submitted at all. > !image-2019-09-08-20-38-50-195.png! > !image-2019-09-08-20-39-18-569.png!
[jira] [Issue Comment Deleted] (LIVY-664) Spark application still running when Livy session creating was rejected
[ https://issues.apache.org/jira/browse/LIVY-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang updated LIVY-664: - Comment: was deleted (was: Oh, I check the wrong branch... Sorry for the confusion. ) > Spark application still running when Livy session creating was rejected
[jira] [Commented] (LIVY-664) Spark application still running when Livy session creating was rejected
[ https://issues.apache.org/jira/browse/LIVY-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931261#comment-16931261 ] Yiheng Wang commented on LIVY-664: -- Oh, I checked the wrong branch... Sorry for the confusion. > Spark application still running when Livy session creating was rejected
[jira] [Comment Edited] (LIVY-664) Spark application still running when Livy session creating was rejected
[ https://issues.apache.org/jira/browse/LIVY-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928486#comment-16928486 ] Yiheng Wang edited comment on LIVY-664 at 9/12/19 12:18 PM: In the code, the duplicate session is stopped when a duplicate name is found. In my test, I found that stopping the session may fail; I guess it may be due to some race condition. I'm looking into it. The other suggestion is great: we should check for name duplication before submitting the application. I'll raise a patch to fix that. was (Author: yihengw): In the code, it will stop the duplicated session when finding there're duplicated names. In my test, I find the stop session may fail. I guess it may due to some race condition. I'm looking into it. Another suggestion I think it good. We should check the name duplication before submitting the application. I'll raise a patch to fix that. > Spark application still running when Livy session creating was rejected > > > Key: LIVY-664 > URL: https://issues.apache.org/jira/browse/LIVY-664 > Project: Livy > Issue Type: Bug >Reporter: Oleksandr Shevchenko >Priority: Major > Attachments: image-2019-09-08-20-38-50-195.png, > image-2019-09-08-20-39-18-569.png > > Time Spent: 10m > Remaining Estimate: 0h > > Steps to reproduce: > 1. Create a session with some name > 2. Create a second session with the same name > 2.1 Creation of the second session will be rejected, since duplicate session > names are not allowed. > 2.2 The Spark application will be submitted but the Livy session won't be created > > Result: the Spark application was submitted but the Livy session was not created > Expected result: Livy session creation is rejected AND the Spark application should > be finished in a failed or killed state, or should not be submitted at all. > !image-2019-09-08-20-38-50-195.png! > !image-2019-09-08-20-39-18-569.png! -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (LIVY-664) Spark application still running when Livy session creating was rejected
[ https://issues.apache.org/jira/browse/LIVY-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928486#comment-16928486 ] Yiheng Wang commented on LIVY-664: -- In the code, the duplicate session is stopped when a duplicate name is found. In my test, I found that stopping the session may fail; I guess it may be due to some race condition. I'm looking into it. The other suggestion is good: we should check for name duplication before submitting the application. I'll raise a patch to fix that. > Spark application still running when Livy session creating was rejected > > > Key: LIVY-664 > URL: https://issues.apache.org/jira/browse/LIVY-664 > Project: Livy > Issue Type: Bug >Reporter: Oleksandr Shevchenko >Priority: Major > Attachments: image-2019-09-08-20-38-50-195.png, > image-2019-09-08-20-39-18-569.png > > > Steps to reproduce: > 1. Create a session with some name > 2. Create a second session with the same name > 2.1 Creation of the second session will be rejected, since duplicate session > names are not allowed. > 2.2 The Spark application will be submitted but the Livy session won't be created > > Result: the Spark application was submitted but the Livy session was not created > Expected result: Livy session creation is rejected AND the Spark application should > be finished in a failed or killed state, or should not be submitted at all. > !image-2019-09-08-20-38-50-195.png! > !image-2019-09-08-20-39-18-569.png! -- This message was sent by Atlassian Jira (v8.3.2#803003)
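The check-before-submit idea proposed in the comment above can be sketched as follows. This is an illustrative sketch only, not Livy's actual code; `SessionNameRegistry`, `tryReserve`, and `release` are hypothetical names. Reserving the session name atomically before the Spark application is submitted means a duplicate request is rejected up front, so there is never a half-created session to stop and no teardown race of the kind described.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: atomically reserve a session name *before* submitting
// the Spark application, instead of submitting first and stopping duplicates.
public class SessionNameRegistry {
    private final ConcurrentHashMap<String, Integer> nameToId = new ConcurrentHashMap<>();

    /** Returns true if the name was free and is now reserved for sessionId. */
    public boolean tryReserve(String name, int sessionId) {
        // putIfAbsent is atomic, so two concurrent requests with the same
        // name cannot both succeed: the loser is rejected before any submit.
        return nameToId.putIfAbsent(name, sessionId) == null;
    }

    /** Frees the name again, e.g. when the session is deleted. */
    public void release(String name) {
        nameToId.remove(name);
    }
}
```

A server would call `tryReserve` in the create-session handler and only run `spark-submit` when it returns true, answering the duplicate request with an error otherwise.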
[jira] [Commented] (LIVY-660) How can we use YARN and all the nodes in our cluster when submiting a pySpark job
[ https://issues.apache.org/jira/browse/LIVY-660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923112#comment-16923112 ] Yiheng Wang commented on LIVY-660: -- I think it's better to post such questions to the Livy user mailing list. You only posted your livy-client.conf. Did you modify your livy.conf? > How can we use YARN and all the nodes in our cluster when submiting a pySpark > job > - > > Key: LIVY-660 > URL: https://issues.apache.org/jira/browse/LIVY-660 > Project: Livy > Issue Type: Question > Components: Server >Affects Versions: 0.6.0 >Reporter: Sebastian Rama >Priority: Minor > > How can we use YARN and all the nodes in our cluster when submiting a pySpark > job? > We have edited all the required .conf files but nothing happens. =( > > > [root@cdh-node06 conf]# cat livy-client.conf > # > # Licensed to the Apache Software Foundation (ASF) under one or more > # contributor license agreements. See the NOTICE file distributed with > # this work for additional information regarding copyright ownership. > # The ASF licenses this file to You under the Apache License, Version 2.0 > # (the "License"); you may not use this file except in compliance with > # the License. You may obtain a copy of the License at > # > # http://www.apache.org/licenses/LICENSE-2.0 > # > # Unless required by applicable law or agreed to in writing, software > # distributed under the License is distributed on an "AS IS" BASIS, > # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > # See the License for the specific language governing permissions and > # limitations under the License. > # > # Use this keystore for the SSL certificate and key. > # livy.keystore = > dew0wf-e > # Specify the keystore password. > # livy.keystore.password = > # > welfka > # Specify the key password. > # livy.key-password = > > # Hadoop Credential Provider Path to get "livy.keystore.password" and > "livy.key-password". 
> # Credential Provider can be created using command as follow: > # hadoop credential create "livy.keystore.password" -value "secret" -provider > jceks://hdfs/path/to/livy.jceks > # livy.hadoop.security.credential.provider.path = > > # What host address to start the server on. By default, Livy will bind to all > network interfaces. > # livy.server.host = 0.0.0.0 > > # What port to start the server on. > # livy.server.port = 8998 > > # What base path ui should work on. By default UI is mounted on "/". > # E.g.: livy.ui.basePath = /my_livy - result in mounting UI on /my_livy/ > # livy.ui.basePath = "" > > # What spark master Livy sessions should use. > livy.spark.master = yarn > > # What spark deploy mode Livy sessions should use. > livy.spark.deploy-mode = cluster > > # Configure Livy server http request and response header size. > # livy.server.request-header.size = 131072 > # livy.server.response-header.size = 131072 > > # Enabled to check whether timeout Livy sessions should be stopped. > # livy.server.session.timeout-check = true > > # Time in milliseconds on how long Livy will wait before timing out an idle > session. > # livy.server.session.timeout = 1h > # > # How long a finished session state should be kept in LivyServer for query. > # livy.server.session.state-retain.sec = 600s > > # If livy should impersonate the requesting users when creating a new session. > # livy.impersonation.enabled = true > > # Logs size livy can cache for each session/batch. 0 means don't cache the > logs. > # livy.cache-log.size = 200 > > # Comma-separated list of Livy RSC jars. By default Livy will upload jars > from its installation > # directory every time a session is started. By caching these files in HDFS, > for example, startup > # time of sessions on YARN can be reduced. > # livy.rsc.jars = > > # Comma-separated list of Livy REPL jars. By default Livy will upload jars > from its installation > # directory every time a session is started. 
By caching these files in HDFS, > for example, startup > # time of sessions on YARN can be reduced. Please list all the repl > dependencies including > # Scala version-specific livy-repl jars, Livy will automatically pick the > right dependencies > # during session creation. > # livy.repl.jars = > > # Location of PySpark archives. By default Livy will upload the file from > SPARK_HOME, but > # by caching the file in HDFS, startup time of PySpark sessions on YARN can > be reduced. > # livy.pyspark.archives = > > # Location of the SparkR package. By default Livy will upload the file from > SPARK_HOME, but > # by caching the file in HDFS, startup time of R sessions on YARN can be > reduced. > # livy.sparkr.package = > > # List of local directories from where files are allowed to be added to user >
[jira] [Commented] (LIVY-633) session should not be gc-ed for long running queries
[ https://issues.apache.org/jira/browse/LIVY-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921215#comment-16921215 ] Yiheng Wang commented on LIVY-633: -- Yes, it's a different problem. It should be a bug in Livy. I'm working on a patch to fix it. > session should not be gc-ed for long running queries > > > Key: LIVY-633 > URL: https://issues.apache.org/jira/browse/LIVY-633 > Project: Livy > Issue Type: Bug > Components: Server >Affects Versions: 0.6.0 >Reporter: Liju >Priority: Major > > If you have set a relatively small session timeout eg 15 mins and query > execution is taking > 15 mins , the session gets gc-ed , which is incorrect > wrt user experience as the user was still active on session and waiting for > result -- This message was sent by Atlassian Jira (v8.3.2#803003)
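The fix implied by the discussion above can be sketched as a simple policy: a session should only be eligible for garbage collection once its idle timeout has elapsed *and* no statement is still executing. This is a hypothetical illustration, not Livy's actual classes; `SessionGcPolicy` and `shouldCollect` are invented names.

```java
// Hypothetical sketch of the timeout policy discussed above: a session with a
// running statement is considered busy and must never be garbage-collected,
// even if its last recorded activity is older than the configured timeout.
public class SessionGcPolicy {
    private final long timeoutMs;

    public SessionGcPolicy(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /**
     * @param lastActivityMs   timestamp of the session's last activity
     * @param activeStatements number of statements currently executing
     * @param nowMs            current timestamp
     */
    public boolean shouldCollect(long lastActivityMs, int activeStatements, long nowMs) {
        boolean timedOut = nowMs - lastActivityMs > timeoutMs;
        return timedOut && activeStatements == 0; // busy sessions are never collected
    }
}
```

With, say, a 15-minute timeout, a query running for 20 minutes keeps `activeStatements > 0` and the session survives; only a session that is both past the timeout and truly idle is collected.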
[jira] [Created] (LIVY-644) Flaky test: Failed to execute goal org.jacoco:jacoco-maven-plugin:0.8.2:report-aggregate (jacoco-report) on project livy-coverage-report
Yiheng Wang created LIVY-644: Summary: Flaky test: Failed to execute goal org.jacoco:jacoco-maven-plugin:0.8.2:report-aggregate (jacoco-report) on project livy-coverage-report Key: LIVY-644 URL: https://issues.apache.org/jira/browse/LIVY-644 Project: Livy Issue Type: Improvement Components: Tests Reporter: Yiheng Wang Recently a lot of Travis jobs have failed when generating the coverage report: [https://travis-ci.org/apache/incubator-livy/jobs/575142847] [https://travis-ci.org/apache/incubator-livy/jobs/561700903] [https://travis-ci.org/apache/incubator-livy/jobs/508574433] [https://travis-ci.org/apache/incubator-livy/jobs/508574435] [https://travis-ci.org/apache/incubator-livy/jobs/508066760] [https://travis-ci.org/apache/incubator-livy/jobs/507989073] [https://travis-ci.org/apache/incubator-livy/jobs/574702251] [https://travis-ci.org/apache/incubator-livy/jobs/574686891] [https://travis-ci.org/apache/incubator-livy/jobs/574363881] [https://travis-ci.org/apache/incubator-livy/jobs/574215174] [https://travis-ci.org/apache/incubator-livy/jobs/573689926] Here is the error stack:
[ERROR] Failed to execute goal org.jacoco:jacoco-maven-plugin:0.8.2:report-aggregate (jacoco-report) on project livy-coverage-report: An error has occurred in JaCoCo Aggregate report generation. Error while creating report: null: EOFException -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.jacoco:jacoco-maven-plugin:0.8.2:report-aggregate (jacoco-report) on project livy-coverage-report: An error has occurred in JaCoCo Aggregate report generation.
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:154)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:146)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:51)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:309)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:194)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:107)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:955)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:290)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:194)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:289)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:229)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:415)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:356)
Caused by: org.apache.maven.plugin.MojoExecutionException: An error has occurred in JaCoCo Aggregate report generation.
    at org.jacoco.maven.AbstractReportMojo.execute (AbstractReportMojo.java:167)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:134)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:208)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:154)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:146)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:51)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:309)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:194)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:107)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:955)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:290)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:194)
    at sun.reflect.NativeMethodAccessorImpl.invoke0
[jira] [Commented] (LIVY-178) PySpark sessions crash on invalid string
[ https://issues.apache.org/jira/browse/LIVY-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910924#comment-16910924 ] Yiheng Wang commented on LIVY-178: -- This was reported in 0.2. Maybe it has been fixed in the latest version. > PySpark sessions crash on invalid string > > > Key: LIVY-178 > URL: https://issues.apache.org/jira/browse/LIVY-178 > Project: Livy > Issue Type: Bug > Components: Interpreter >Affects Versions: 0.2 > Environment: {code} > print "\xHH" > {code} > will crash the session >Reporter: Alex Man >Assignee: Alex Man >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (LIVY-515) Livy session is idle even if spark context is unavailable
[ https://issues.apache.org/jira/browse/LIVY-515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910923#comment-16910923 ] Yiheng Wang commented on LIVY-515: -- This was reported in 0.4.0. Maybe this issue has been fixed in the latest version. > Livy session is idle even if spark context is unavailable > - > > Key: LIVY-515 > URL: https://issues.apache.org/jira/browse/LIVY-515 > Project: Livy > Issue Type: Bug > Components: Core >Affects Versions: 0.4.0 >Reporter: Raghavendra >Priority: Major > Attachments: AfterKillingLivy.png, BeforeKillingLivy.png, > command.history, livy-log, livyUI.png > > > When using livy in the session mode, Livy is not able to figure out if the > application master is available. > Say suppose livy was in the idle state, now for some reason the application > master died. Ideally, even the session associated with this application > master should either be terminated or should have handled this situation. > Instead, the session continues to remain in the Idle state. This is wrong. > > Steps to reproduce. > # Bring up livy in session mode(I am using 5 sessions). > # Kill the application master using "yarn application -kill " > # Now list the application in yarn "yarn application -list" you will see 4 > application master(I had 5, Killed 1 so 4) > # Now reload livy UI. You will still see 5 sessions in idle. > Attached the session logs whose application master was killed. > Attached are some images depicting the issue -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (LIVY-573) Add tests for operation logs retrieval
[ https://issues.apache.org/jira/browse/LIVY-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907199#comment-16907199 ] Yiheng Wang commented on LIVY-573: -- Have created a patch. [~mgaido] can you please help review it? Thanks > Add tests for operation logs retrieval > -- > > Key: LIVY-573 > URL: https://issues.apache.org/jira/browse/LIVY-573 > Project: Livy > Issue Type: Improvement > Components: Tests, Thriftserver >Reporter: Marco Gaido >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > Our current tests do not cover the retrieval of operation logs. We should try > and add coverage for it if possible. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-638) get sql.AnalysisException when create table using thriftserver
[ https://issues.apache.org/jira/browse/LIVY-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907028#comment-16907028 ] Yiheng Wang commented on LIVY-638: -- It's weird that even when I enable Hive support in the Spark configuration, the Livy thrift server still throws this exception... > get sql.AnalysisException when create table using thriftserver > -- > > Key: LIVY-638 > URL: https://issues.apache.org/jira/browse/LIVY-638 > Project: Livy > Issue Type: Bug > Components: Thriftserver >Affects Versions: 0.6.0 >Reporter: mingchao zhao >Priority: Major > Attachments: create table.png > > > org.apache.spark.sql.AnalysisException occurs when I use thriftserver to > execute the following SQL. When I do not use Hive as the metastore, does > thriftserver not support CREATE TABLE? > 0: jdbc:hive2://localhost:10090> CREATE TABLE test(key INT, val STRING); > Error: java.util.concurrent.ExecutionException: java.lang.RuntimeException: > org.apache.spark.sql.AnalysisException: Hive support is required to CREATE > Hive TABLE (AS SELECT);; > 'CreateTable `test`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, > ErrorIfExists > org.apache.spark.sql.execution.datasources.HiveOnlyCheck$$anonfun$apply$12.apply(rules.scala:392) > > org.apache.spark.sql.execution.datasources.HiveOnlyCheck$$anonfun$apply$12.apply(rules.scala:390) > org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:117) > > org.apache.spark.sql.execution.datasources.HiveOnlyCheck$.apply(rules.scala:390) > > org.apache.spark.sql.execution.datasources.HiveOnlyCheck$.apply(rules.scala:388) > > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$2.apply(CheckAnalysis.scala:386) > > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$2.apply(CheckAnalysis.scala:386) > > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) > > 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:386) > > org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:95) > > org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:108) > > org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:105) > > org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:201) > > org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:105) > > org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57) > > org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55) > > org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47) > org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:78) > org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) > org.apache.livy.thriftserver.session.SqlJob.executeSql(SqlJob.java:74) > org.apache.livy.thriftserver.session.SqlJob.call(SqlJob.java:64) > org.apache.livy.thriftserver.session.SqlJob.call(SqlJob.java:35) > org.apache.livy.rsc.driver.JobWrapper.call(JobWrapper.java:64) > org.apache.livy.rsc.driver.JobWrapper.call(JobWrapper.java:31) > java.util.concurrent.FutureTask.run(FutureTask.java:266) > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > java.lang.Thread.run(Thread.java:748) (state=,code=0) > 0: jdbc:hive2://localhost:10090> > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
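For anyone hitting the same `AnalysisException`: Spark refuses Hive-syntax `CREATE TABLE` unless the session uses the Hive metastore-backed catalog. Assuming a standard Spark setup, the usual way to enable it is via the catalog implementation setting in `spark-defaults.conf` (or per-session Spark conf); whether that setting actually reaches the thrift server's backing session depends on how Livy launches it, which is what the comment above is investigating.

```properties
# spark-defaults.conf — use the Hive metastore-backed catalog so that
# Hive-syntax CREATE TABLE is supported in Spark SQL sessions.
spark.sql.catalogImplementation=hive
```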
[jira] [Updated] (LIVY-635) Travis failed to build
[ https://issues.apache.org/jira/browse/LIVY-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang updated LIVY-635: - Component/s: Tests > Travis failed to build > -- > > Key: LIVY-635 > URL: https://issues.apache.org/jira/browse/LIVY-635 > Project: Livy > Issue Type: Bug > Components: Tests, Thriftserver >Affects Versions: 0.6.0 >Reporter: jiewang >Priority: Major > >
> [ERROR] Failed to execute goal on project livy-thriftserver: Could not resolve dependencies for project org.apache.livy:livy-thriftserver:jar:0.7.0-incubating-SNAPSHOT: Failed to collect dependencies at org.apache.hive:hive-jdbc:jar:3.0.0 -> org.apache.hive:hive-service:jar:3.0.0 -> org.apache.hive:hive-llap-server:jar:3.0.0 -> org.apache.hbase:hbase-server:jar:2.0.0-alpha4 -> org.glassfish.web:javax.servlet.jsp:jar:2.3.2 -> org.glassfish:javax.el:jar:3.0.1-b08-SNAPSHOT: Failed to read artifact descriptor for org.glassfish:javax.el:jar:3.0.1-b08-SNAPSHOT: Could not transfer artifact org.glassfish:javax.el:pom:3.0.1-b08-SNAPSHOT from/to apache-snapshots (https://repository.apache.org/snapshots/): Connect to repository.apache.org:443 [repository.apache.org/207.244.88.140] failed: Connection timed out (Connection timed out) -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal on project livy-thriftserver: Could not resolve dependencies for project org.apache.livy:livy-thriftserver:jar:0.7.0-incubating-SNAPSHOT: Failed to collect dependencies at org.apache.hive:hive-jdbc:jar:3.0.0 -> org.apache.hive:hive-service:jar:3.0.0 -> org.apache.hive:hive-llap-server:jar:3.0.0 -> org.apache.hbase:hbase-server:jar:2.0.0-alpha4 -> org.glassfish.web:javax.servlet.jsp:jar:2.3.2 -> org.glassfish:javax.el:jar:3.0.1-b08-SNAPSHOT
>     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies (LifecycleDependencyResolver.java:249)
>     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies (LifecycleDependencyResolver.java:145)
>     at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved (MojoExecutor.java:246)
>     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:200)
>     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:154)
>     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:146)
>     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
>     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
>     at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:51)
>     at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
>     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:309)
>     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:194)
>     at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:107)
>     at org.apache.maven.cli.MavenCli.execute (MavenCli.java:955)
>     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:290)
>     at org.apache.maven.cli.MavenCli.main (MavenCli.java:194)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke (Method.java:498)
>     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:289)
>     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:229)
>     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:415)
>     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:356) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-623) Implement GetTables metadata operation
[ https://issues.apache.org/jira/browse/LIVY-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16902987#comment-16902987 ] Yiheng Wang commented on LIVY-623: -- [https://github.com/apache/incubator-livy/pull/194] > Implement GetTables metadata operation > -- > > Key: LIVY-623 > URL: https://issues.apache.org/jira/browse/LIVY-623 > Project: Livy > Issue Type: Sub-task > Components: Thriftserver >Reporter: Yiheng Wang >Priority: Minor > > We should support GetTables metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-624) Implement GetColumns metadata operation
[ https://issues.apache.org/jira/browse/LIVY-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16902988#comment-16902988 ] Yiheng Wang commented on LIVY-624: -- [https://github.com/apache/incubator-livy/pull/194] > Implement GetColumns metadata operation > --- > > Key: LIVY-624 > URL: https://issues.apache.org/jira/browse/LIVY-624 > Project: Livy > Issue Type: Sub-task > Components: Thriftserver >Reporter: Yiheng Wang >Priority: Minor > > We should support GetColumns metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-625) Implement GetFunctions metadata operation
[ https://issues.apache.org/jira/browse/LIVY-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16902989#comment-16902989 ] Yiheng Wang commented on LIVY-625: -- [https://github.com/apache/incubator-livy/pull/194] > Implement GetFunctions metadata operation > - > > Key: LIVY-625 > URL: https://issues.apache.org/jira/browse/LIVY-625 > Project: Livy > Issue Type: Sub-task > Components: Thriftserver >Reporter: Yiheng Wang >Priority: Minor > > We should support GetFunctions metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-623) Implement GetTables metadata operation
[ https://issues.apache.org/jira/browse/LIVY-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900967#comment-16900967 ] Yiheng Wang commented on LIVY-623: -- Working on it. > Implement GetTables metadata operation > -- > > Key: LIVY-623 > URL: https://issues.apache.org/jira/browse/LIVY-623 > Project: Livy > Issue Type: Sub-task > Components: Thriftserver >Reporter: Yiheng Wang >Priority: Minor > > We should support GetTables metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-622) Implement GetSchemas metadata operation
[ https://issues.apache.org/jira/browse/LIVY-622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900966#comment-16900966 ] Yiheng Wang commented on LIVY-622: -- Working on it. > Implement GetSchemas metadata operation > --- > > Key: LIVY-622 > URL: https://issues.apache.org/jira/browse/LIVY-622 > Project: Livy > Issue Type: Sub-task > Components: Thriftserver >Reporter: Yiheng Wang >Priority: Minor > > We should support GetSchemas metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-625) Implement GetFunctions metadata operation
[ https://issues.apache.org/jira/browse/LIVY-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900969#comment-16900969 ] Yiheng Wang commented on LIVY-625: -- Working on it. > Implement GetFunctions metadata operation > - > > Key: LIVY-625 > URL: https://issues.apache.org/jira/browse/LIVY-625 > Project: Livy > Issue Type: Sub-task > Components: Thriftserver >Reporter: Yiheng Wang >Priority: Minor > > We should support GetFunctions metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Assigned] (LIVY-632) Implement SetClientInfo metadata operation
[ https://issues.apache.org/jira/browse/LIVY-632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang reassigned LIVY-632: Assignee: (was: Yiheng Wang) > Implement SetClientInfo metadata operation > -- > > Key: LIVY-632 > URL: https://issues.apache.org/jira/browse/LIVY-632 > Project: Livy > Issue Type: Sub-task > Components: Thriftserver >Reporter: Yiheng Wang >Priority: Minor > > We should support SetClientInfo metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Assigned] (LIVY-630) Implement RenewDelegationToken metadata operation
[ https://issues.apache.org/jira/browse/LIVY-630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang reassigned LIVY-630: Assignee: (was: Yiheng Wang) > Implement RenewDelegationToken metadata operation > - > > Key: LIVY-630 > URL: https://issues.apache.org/jira/browse/LIVY-630 > Project: Livy > Issue Type: Sub-task > Components: Thriftserver >Reporter: Yiheng Wang >Priority: Minor > > We should support RenewDelegationToken metadata operation in Livy thrift > server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (LIVY-631) Implement GetQueryId metadata operation
Yiheng Wang created LIVY-631: Summary: Implement GetQueryId metadata operation Key: LIVY-631 URL: https://issues.apache.org/jira/browse/LIVY-631 Project: Livy Issue Type: Sub-task Components: Thriftserver Reporter: Yiheng Wang Assignee: Yiheng Wang We should support GetSchemas metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (LIVY-631) Implement GetQueryId metadata operation
[ https://issues.apache.org/jira/browse/LIVY-631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang updated LIVY-631: - Description: We should support GetQueryId metadata operation in Livy thrift server. (was: We should support GetSchemas metadata operation in Livy thrift server.) > Implement GetQueryId metadata operation > --- > > Key: LIVY-631 > URL: https://issues.apache.org/jira/browse/LIVY-631 > Project: Livy > Issue Type: Sub-task > Components: Thriftserver >Reporter: Yiheng Wang >Assignee: Yiheng Wang >Priority: Minor > > We should support GetQueryId metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (LIVY-629) Implement CancelDelegationToken metadata operation
Yiheng Wang created LIVY-629: Summary: Implement CancelDelegationToken metadata operation Key: LIVY-629 URL: https://issues.apache.org/jira/browse/LIVY-629 Project: Livy Issue Type: Sub-task Components: Thriftserver Reporter: Yiheng Wang Assignee: Yiheng Wang We should support CancelDelegationToken metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (LIVY-628) Implement GetDelegationToken metadata operation
Yiheng Wang created LIVY-628: Summary: Implement GetDelegationToken metadata operation Key: LIVY-628 URL: https://issues.apache.org/jira/browse/LIVY-628 Project: Livy Issue Type: Sub-task Components: Thriftserver Reporter: Yiheng Wang Assignee: Yiheng Wang We should support GetDelegationToken metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (LIVY-627) Implement GetCrossReference metadata operation
Yiheng Wang created LIVY-627: Summary: Implement GetCrossReference metadata operation Key: LIVY-627 URL: https://issues.apache.org/jira/browse/LIVY-627 Project: Livy Issue Type: Sub-task Components: Thriftserver Reporter: Yiheng Wang Assignee: Yiheng Wang We should support GetCrossReference metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (LIVY-626) Implement GetPrimaryKeys metadata operation
Yiheng Wang created LIVY-626: Summary: Implement GetPrimaryKeys metadata operation Key: LIVY-626 URL: https://issues.apache.org/jira/browse/LIVY-626 Project: Livy Issue Type: Sub-task Components: Thriftserver Reporter: Yiheng Wang Assignee: Yiheng Wang We should support GetPrimaryKeys metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (LIVY-624) Implement GetColumns metadata operation
Yiheng Wang created LIVY-624: Summary: Implement GetColumns metadata operation Key: LIVY-624 URL: https://issues.apache.org/jira/browse/LIVY-624 Project: Livy Issue Type: Sub-task Components: Thriftserver Reporter: Yiheng Wang Assignee: Yiheng Wang We should support GetColumns metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (LIVY-625) Implement GetFunctions metadata operation
Yiheng Wang created LIVY-625: Summary: Implement GetFunctions metadata operation Key: LIVY-625 URL: https://issues.apache.org/jira/browse/LIVY-625 Project: Livy Issue Type: Sub-task Components: Thriftserver Reporter: Yiheng Wang Assignee: Yiheng Wang We should support GetFunctions metadata operation in Livy thrift server. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (LIVY-622) Implement GetSchemas metadata operation
Yiheng Wang created LIVY-622: Summary: Implement GetSchemas metadata operation Key: LIVY-622 URL: https://issues.apache.org/jira/browse/LIVY-622 Project: Livy Issue Type: Sub-task Components: Thriftserver Reporter: Yiheng Wang Assignee: Yiheng Wang -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-615) livy.ui.basePath does not seem to work correctly
[ https://issues.apache.org/jira/browse/LIVY-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896746#comment-16896746 ] Yiheng Wang commented on LIVY-615: -- Can you raise a PR for your patch? > livy.ui.basePath does not seem to work correctly > > > Key: LIVY-615 > URL: https://issues.apache.org/jira/browse/LIVY-615 > Project: Livy > Issue Type: Bug > Components: Server >Affects Versions: 0.6.0 >Reporter: Ferdinand de Antoni >Priority: Major > Original Estimate: 5m > Remaining Estimate: 5m > > When setting the property {{livy.ui.basePath}}, an HTTP error 404 is returned. > To resolve this problem, the context path in {{WebServer.scala}} should be > set as well, e.g.: > {code:java} > context.setContextPath(livyConf.get(LivyConf.SERVER_BASE_PATH)){code} > Adding this seems to resolve the issue. Note that this of course also > changes the context path of the API, not just the UI, but I presumed that was > also the intention of the {{livy.ui.basePath}} property. 
> Below is a patch with the suggested changes: > {noformat} > Index: server/src/main/scala/org/apache/livy/server/WebServer.scala > IDEA additional info: > Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP > <+>UTF-8 > === > --- server/src/main/scala/org/apache/livy/server/WebServer.scala (revision > 92062e1659db2af85711b1f35c50ff4050fec675) > +++ server/src/main/scala/org/apache/livy/server/WebServer.scala (revision > bdfb75d08bf34633ff23d7e4db380aca1fdf4d8e) > @@ -81,7 +81,7 @@ >val context = new ServletContextHandler() > - context.setContextPath("/") > + context.setContextPath(livyConf.get(LivyConf.SERVER_BASE_PATH)) >context.addServlet(classOf[DefaultServlet], "/") >val handlers = new HandlerCollection > @@ -114,7 +114,7 @@ > } > port = connector.getLocalPort > -info("Starting server on %s://%s:%d" format (protocol, host, port)) > +info("Starting server on %s://%s:%d/%s" format (protocol, host, port, > livyConf.get(LivyConf.SERVER_BASE_PATH))) >} >def join(): Unit = {{noformat} > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-575) Implement missing metadata operations
[ https://issues.apache.org/jira/browse/LIVY-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890653#comment-16890653 ] Yiheng Wang commented on LIVY-575: -- [~mgaido] It looks like PR-182 fixes LIVY-571 instead of this one, right? > Implement missing metadata operations > - > > Key: LIVY-575 > URL: https://issues.apache.org/jira/browse/LIVY-575 > Project: Livy > Issue Type: Improvement > Components: Thriftserver >Reporter: Marco Gaido >Priority: Minor > Fix For: 0.7.0 > > > Many metadata operations (e.g. table list retrieval, schema retrieval, ...) > are currently not implemented. We should implement them. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-575) Implement missing metadata operations
[ https://issues.apache.org/jira/browse/LIVY-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887786#comment-16887786 ] Yiheng Wang commented on LIVY-575: -- I can help create the sub-tasks. > Implement missing metadata operations > - > > Key: LIVY-575 > URL: https://issues.apache.org/jira/browse/LIVY-575 > Project: Livy > Issue Type: Improvement > Components: Thriftserver >Reporter: Marco Gaido >Priority: Minor > > Many metadata operations (e.g. table list retrieval, schema retrieval, ...) > are currently not implemented. We should implement them. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-575) Implement missing metadata operations
[ https://issues.apache.org/jira/browse/LIVY-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887692#comment-16887692 ] Yiheng Wang commented on LIVY-575: -- Hi Marco, I'm interested in contributing to this. There are 11 operations; I will submit several PRs for them. Thanks > Implement missing metadata operations > - > > Key: LIVY-575 > URL: https://issues.apache.org/jira/browse/LIVY-575 > Project: Livy > Issue Type: Improvement > Components: Thriftserver >Reporter: Marco Gaido >Priority: Minor > > Many metadata operations (e.g. table list retrieval, schema retrieval, ...) > are currently not implemented. We should implement them. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-583) python build needs configparser
[ https://issues.apache.org/jira/browse/LIVY-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883769#comment-16883769 ] Yiheng Wang commented on LIVY-583: -- configparser is already in the setup.py, [https://github.com/apache/incubator-livy/blob/master/python-api/setup.py#L32] > python build needs configparser > --- > > Key: LIVY-583 > URL: https://issues.apache.org/jira/browse/LIVY-583 > Project: Livy > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Felix Cheung >Priority: Major > > pip install configparser just to build. it won't build until I manually pip > install. > > (run into this in my 2nd environment) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-582) python test_create_new_session_without_default_config test fails consistently
[ https://issues.apache.org/jira/browse/LIVY-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883767#comment-16883767 ] Yiheng Wang commented on LIVY-582: -- Just change the upper-case characters in your hostname to lower case and the issue will go away. :P The root cause is that the Python Requests library lower-cases the host in the request URL, while the responses mock library matches URLs case-sensitively... > python test_create_new_session_without_default_config test fails consistently > - > > Key: LIVY-582 > URL: https://issues.apache.org/jira/browse/LIVY-582 > Project: Livy > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Felix Cheung >Priority: Major > > {code:java} > test_create_new_session_without_default_config > def test_create_new_session_without_default_config(): > > mock_and_validate_create_new_session(False) > src/test/python/livy-tests/client_test.py:105: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > :3: in wrapper > ??? > src/test/python/livy-tests/client_test.py:48: in > mock_and_validate_create_new_session > load_defaults=defaults) > src/main/python/livy/client.py:88: in __init__ > session_conf_dict).json()['id'] > src/main/python/livy/client.py:388: in _create_new_session > headers=self._conn._JSON_HEADERS, data=data) > src/main/python/livy/client.py:500: in send_request > json=data, auth=self._spnego_auth()) > .eggs/requests-2.21.0-py2.7.egg/requests/api.py:60: in request > return session.request(method=method, url=url, **kwargs) > .eggs/requests-2.21.0-py2.7.egg/requests/sessions.py:533: in request > resp = self.send(prep, **send_kwargs) > .eggs/requests-2.21.0-py2.7.egg/requests/sessions.py:646: in send > r = adapter.send(request, **kwargs) > .eggs/responses-0.10.6-py2.7.egg/responses.py:626: in unbound_on_send > return self._on_request(adapter, request, *a, **kwargs) > self = > adapter = > request = > kwargs = {'cert': None, 'proxies': OrderedDict(), 'stream': False, 'timeout': > 10, ...} > match = None, 
resp_callback = None > error_msg = "Connection refused by Responses: POST > http://machine:8998/sessions/ doesn't match Responses Mock" > response = ConnectionError(u"Connection refused by Responses: POST > http://machine:8998/sessions/ doesn't match Responses Mock",) > {code} > Not sure why. This fails 100% and I don't see anything listening to this > port. Need some help to troubleshoot this. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
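The case-sensitivity mismatch described in the comment above can be reproduced without Livy at all: the requests/urllib3 stack normalizes the host portion of a URL to lower case before sending, so a mock that matches registered URLs case-sensitively never sees a request registered under an upper-case hostname. A minimal stdlib-only sketch of that normalization (the function name `normalize_url_host` is illustrative, not part of either library):

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_url_host(url: str) -> str:
    """Lower-case the host part of a URL, leaving the path untouched.

    Hostnames are case-insensitive per RFC 3986; requests/urllib3
    performs an equivalent normalization before the request goes out,
    which is why a mock registered under an upper-case hostname never
    matches the outgoing request.  (Lower-casing the whole netloc also
    affects any userinfo; that is fine for this illustration.)
    """
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=parts.netloc.lower()))

# A mock registered with the machine's upper-case hostname...
registered = "http://MACHINE:8998/sessions/"
# ...versus the URL that actually goes over the wire:
print(normalize_url_host(registered))  # http://machine:8998/sessions/
```

Registering the mock against the already-normalized (all-lower-case) URL, as the workaround suggests, sidesteps the mismatch.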
[jira] [Issue Comment Deleted] (LIVY-583) python build needs configparser
[ https://issues.apache.org/jira/browse/LIVY-583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiheng Wang updated LIVY-583: - Comment: was deleted (was: It is configured in the setup.py. [https://github.com/apache/incubator-livy/blob/master/python-api/setup.py#L32]) > python build needs configparser > --- > > Key: LIVY-583 > URL: https://issues.apache.org/jira/browse/LIVY-583 > Project: Livy > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Felix Cheung >Priority: Major > > pip install configparser just to build. it won't build until I manually pip > install. > > (run into this in my 2nd environment) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (LIVY-583) python build needs configparser
[ https://issues.apache.org/jira/browse/LIVY-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883563#comment-16883563 ] Yiheng Wang commented on LIVY-583: -- It is configured in the setup.py. [https://github.com/apache/incubator-livy/blob/master/python-api/setup.py#L32] > python build needs configparser > --- > > Key: LIVY-583 > URL: https://issues.apache.org/jira/browse/LIVY-583 > Project: Livy > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Felix Cheung >Priority: Major > > pip install configparser just to build. it won't build until I manually pip > install. > > (run into this in my 2nd environment) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
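One caveat worth noting on the thread above: being listed in setup.py does not by itself guarantee a module is present at build time. `install_requires` only covers installing the finished package, while anything imported during the build itself must be satisfied via `setup_requires` (or, in later tooling, a PEP 518 `pyproject.toml` `[build-system] requires` list), which would explain needing a manual `pip install` just to build. A hypothetical sketch of the distinction, not Livy's actual python-api/setup.py:

```python
# Hypothetical setup.py sketch -- NOT the actual Livy python-api/setup.py.
from setuptools import setup

setup(
    name="example-client",   # illustrative package name
    version="0.1",
    # Pulled in for users of the package at install time:
    install_requires=["configparser"],
    # Needed *before* the build itself can run; a dependency listed only
    # in install_requires will not help if setup.py imports it:
    setup_requires=["configparser"],
)
```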